Cryptography Acceleration
RISC Zero's rv32im implementation includes a number of specialized extension circuits, including two "accelerators" for cryptographic functions: SHA-256 and 256-bit modular multiplication, referred to as "bigint" multiplication. Because these operations are implemented directly in the "hardware" of the zkVM, programs that use the accelerators execute faster and can be proven with significantly fewer resources [1].
Accelerated Crates
The SHA-256 and bigint accelerators are currently integrated in "accelerated" versions of popular cryptographic Rust crates.
These crates include:
- RustCrypto's crypto-bigint crate
- RustCrypto's sha2 crate
- RustCrypto's k256 crate
- Dalek Cryptography's curve25519-dalek crate
Each of these is a fork of the original source code repository, modified to use the RISC Zero cryptography extensions.
Using Accelerated Crates
When using any of the crates listed above directly, specify the dependency as a git dependency. For example:
[dependencies]
sha2 = { git = "https://github.com/risc0/RustCrypto-hashes", tag = "sha2-v0.10.6-risczero.0" }
When using cryptography indirectly, e.g. via the cookie, oauth2, or revm crates, it may be possible to enable acceleration support without code changes by applying a Cargo patch.
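As a minimal sketch of such a patch, the fragment below redirects every use of sha2 in the dependency graph to the accelerated fork. It reuses the repository URL and tag from the example above. It assumes the crates in your dependency graph resolve sha2 from crates.io at a version compatible with the fork; other accelerated crates would need their own entries, with tags taken from their respective repositories.

```toml
# Workspace-root Cargo.toml of the guest (sketch; adjust the tag to match
# the sha2 version your dependencies actually resolve to).
[patch.crates-io]
sha2 = { git = "https://github.com/risc0/RustCrypto-hashes", tag = "sha2-v0.10.6-risczero.0" }
```

A patch only takes effect when placed in the root manifest of the workspace being built, so it belongs in the guest's top-level Cargo.toml rather than in a dependency.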
An example of how to use these crates to accelerate ECDSA signature verification can be found in the ECDSA example. Note the use of the patched versions of the sha2, crypto-bigint, and k256 crates in the guest's Cargo.toml.
Adding Accelerator Support To Crates
It's possible to add accelerator support for your own crates.
An example of how to do this can be found in this diff of RISC Zero's k256 crate fork, which shows the code changes needed to accelerate RustCrypto's secp256k1 ECDSA library. This fork starts from the base implementation and changes the core operations, e.g. FieldElement8x32R0::mul, to use the accelerated 256-bit modular multiplication instruction.
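The general shape of such a change can be sketched as a target-gated swap of one core operation. The sketch below is illustrative only and is not taken from the k256 fork: the function names are hypothetical, it uses 64-bit operands instead of 256-bit ones, and the "accelerated" function is a placeholder that delegates to the software path where a real fork would invoke the zkVM's bigint multiplication. It also assumes the guest target is identified by `target_os = "zkvm"` on `riscv32`.

```rust
/// Portable software path: (a * b) mod n, widened to u128 so the product
/// cannot overflow. `n` must be nonzero.
fn mul_mod_software(a: u64, b: u64, n: u64) -> u64 {
    ((a as u128 * b as u128) % n as u128) as u64
}

/// Guest build: hypothetical accelerated entry point. A real fork would call
/// the zkVM's accelerated modular multiplication here; this placeholder
/// delegates to the software path so the sketch stays self-contained.
#[cfg(all(target_os = "zkvm", target_arch = "riscv32"))]
pub fn mul_mod(a: u64, b: u64, n: u64) -> u64 {
    mul_mod_software(a, b, n)
}

/// Every other target: keep the original software implementation.
#[cfg(not(all(target_os = "zkvm", target_arch = "riscv32")))]
pub fn mul_mod(a: u64, b: u64, n: u64) -> u64 {
    mul_mod_software(a, b, n)
}

fn main() {
    // 7 * 8 = 56, and 56 mod 13 = 4.
    println!("{}", mul_mod(7, 8, 13));
}
```

Keeping the public signature identical on both targets means callers of the crate do not change, which is what allows the accelerated fork to be dropped in as a replacement for the upstream crate.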
1. This is similar to cryptography support on x86 processors, such as AES-NI or the SHA extensions. In both cases, the circuitry is extended to compute otherwise expensive operations in fewer instruction cycles.