Bulletproofs: Faster Rangeproofs and Much More
Blockstream Research

Bulletproofs: Faster Rangeproofs and Much More

Andrew Poelstra

In 2015 we announced Confidential Transactions(CT) as a headlining feature of Elements. This feature replaced transaction amounts with Pedersen commitments, a cryptographic tool to hide the amounts while retaining the ability for anyone to verify that the amounts balanced within any particular transaction.

A major difficulty with CT was that it made the transactions very large and slow to verify, because it required each transaction output to contain a rangeproof, a type of zero-knowledge proof that proves amounts are too small to overflow. While an ordinary digital signature is less than 100 bytes and takes less than 100 microseconds to verify, these rangeproofs were several kilobytes in size and took several milliseconds to verify. In practice, rangeproofs were the vast majority of transaction data in any transaction that used them.

Though our rangeproofs, based on Borromean ring signatures, were the fastest and smallest in the literature — for the range sizes we needed and without a trusted setup — they were still quite large.

Since 2015, there have been efforts to improve the efficiency of rangeproofs. In early 2017, Adam Back found a 24% reduction in rangeproof sizethough with no verification performance improvement. Around this time, we had mentioned the problem to our friends and fellow cryptographers Dan Boneh and Benedikt Bünz at Stanford, who felt pretty confident there was room for improvement.

What they eventually came up with astonished us.

Smaller, Faster, Stronger

Based on a 2016 improvement in the space efficiency of discrete-log based zero knowledge proofs from Bootle et al, Bulletproofs are an even more space efficient form of zero-knowledge proof. Importantly for our purposes, these proofs also have native support for committed values such as Pedersen commitments and public keys. This lets us implement things such as rangeproofs in this general zero-knowledge framework without implementing the heavy machinery of elliptic curve arithmetic in zero knowledge.

Stronger. To limit the size of transactions, our old rangeproofs constrained outputs to be in a range of size 2^32. This limited outputs to about 43 BTC, though this could be increased by reducing the granularity of proofs from 1 satoshi to 10 or 100, or by increasing the minimum value from zero. These adjustments were possible but used explicit amounts, limiting the privacy provided by the system.

These 32-bit rangeproofs were approximately 2.5 KiB in size. With Adam’s optimization they would have been 2 KiB in size. With Bulletproofs, they would have been 610 bytes. With such small sizes, we might as well double the range to 64 bits, eliminating the need for any non-private adjustments. This would increase the paltry 610 bytes to 1220, right? Nope. In fact, a 64-bit Bulletproof rangeproof is only 674 bytes.

Smaller. The reason that we can double the range size while adding only 64 bytes to the proof size is that they grow logarithmically in size. This is done using a variant of a the inner product argument from the Bootle et al 2016 paper. (Jonathan Bootle also helped Benedikt and Dan develop Bulletproofs). Specifically, the logarithmically-sized inner product argument described in that paper was reduced even further in Bulletproofs from 6log(N) curvepoints to 2log(N).

This same trick allows aggregation of multiple rangeproofs within a transaction into one, again with only a small size increase. An aggregate of two rangeproofs would be 738 bytes, of four would be 802, and of eight would be 866. Eight 64-bit classical rangeproofs would be over 40000 bytes!

Faster. This space savings is great, but our initial analysis of the technique showed that verification would be slower than the old rangeproofs. It appeared that verification of a single 64-bit proof would require more than 200 scalar-point multiplications, each one an onerous 50-microsecond affair, while the old rangeproofs needed only 128.

But after further analysis, we were able to combine many of the multiplications, reducing the total number to only 147. More importantly, we realized that unlike the old rangeproofs, none of these multiplications depended on each other, and we could do them all in one big batch. As part of our work on aggregate signatures, we knew how to batch-multiply very quickly. Pieter Wuille, Greg Maxwell, Jonas Nick, Peter Dettman and myself had spent several months on this problem and had reduced the speed of 147 multiplications to only 15.5 microseconds each, getting the total verification time of a Bulletproof down to 2.3 ms, vs 5.8 ms for the old proofs.

This is already more than a doubling in speed, but because our batch-multiplication gets faster the more points you give it, the performance numbers for aggregates are even more impressive. An aggregate of eight 64-bit Bulletproofs can be verified in only 10.7 ms, vs 46.5 ms for the old proofs, more than quadrupling the speed.

But it gets even better. Bulletproofs support an extremely efficient form of batch verification. Of the 147 multiplications we need to do, 130 of them use the same points in every Bulletproof, which means that during batch validation, these 130 multiplications can be combined, leaving only 17 new ones. In fact, this marginal cost increases only logarithmically: for aggregates of 2 ranges, each additional proof takes 19 extra points, and for aggregates of 4 each proof takes 21.

Note that we have introduced two similar but separate concepts: aggregation is when a prover combines multiple rangeproofs into one, while batching is when a verifier checks multiple separate proofs at once.

This means that two 64-bit rangeproofs can be validated in under 2.7 ms, or 1.3 ms per range. A thousand rangeproofs can be validated in 239 ms, or 0.24 ms per range, which is a 23x improvement over the old proofs. But it still gets better, because of aggregation. A thousand 8-fold aggregates (so 8000 ranges total) can be validated in 588 ms, or 74 microseconds per range, or a 78x improvement over the old rangeproofs.

This effect eventually maxes out around aggregates of 64 proofs, as the increasingly efficient scalar-point multiplications stop being a dominant effect. At this point we can batch-validate in under 49 microseconds per range, a 120jjimprovement. For reference, an Elliptic Curve Digital Signature Algorithm (ECDSA) signature takes approximately 49 microseconds, so at this level of aggregation, the rangeproof is not even the dominant part of transaction validation. Of course, we are unlikely to see many 64-output transactions on the blockchain, but this speed is certainly possible in non-blockchain contexts such as Provisions.

This verification is also memory-efficient, taking under 100 KiB to validate a single rangeproof, and scaling linearly with the size.

You Can Prove Anything (If It’s True)

Bulletproofs are much more general than rangeproofs. They can be used to prove, in zero knowledge, arbitrary statements. They are equivalent in power to SNARKsor STARKs, but have native support for elliptic curve (EC) public keys and Pedersen commitments (so there is typically no need to implement EC math inside a program being proven). Also, unlike SNARKs, Bulletproofs have full 128-bit security under standard assumptions with no trusted setup. Unlike STARKs, they are fast enough to prove and verify reasonably-sized problems on typical computing hardware.

As a specific example, consider a single run of the SHA2 compression function. Our prover requires less than 30 MiB of memory and about 13 seconds to prove knowledge of a SHA2 preimage. Verification takes about 23 MiB of memory and 435 ms, but we can batch verify additional proofs in about 28 ms and 13.4 KiB each. The resulting proof is only 1379 bytes.

Our prover is more memory-efficient than that for SNARKs: on the same system, a SNARK proof for SHA2 takes only 4 seconds but 75 MiB of memory. Verification, while requiring a large one-time precomputation for each circuit (description of the statement to be proven), then takes only 3-5 ms and very little memory to verify. These numbers do not increase with the size of the circuit, so for more than a few thousand gates, SNARKs are the clear winner even against our batch validation. Unfortunately, this comes at the cost of a trusted setup and novel cryptographic assumptions.

There remains significant room for additional optimization of Bulletproofs, in both the prover and verifier.

This power to prove arbitrary statements — whether by Bulletproofs, SNARKs or STARKs — has many applications. It can be used to implement ordinary digital signatures, including (traceable) ring signatures and threshold signatures — which for large rings is likely to be more efficient than traditional schemes, both in verification time and proof size — but it is not limited to this. It can be used to trustlessly sell Sudoku problems. It can be used in multiparty computations to prove that each party was acting honestly, even with secret data. (In particular, during multisigning schemes such as MuSig, this allows the use of deterministic nonce generation without requiring signers maintain state or be susceptible to nonce-reuse attacks.) It can be used to prove hash preimages.

This latter application, hash preimages, is especially interesting because it can be leveraged to create zero-knowledge Merkle proofs, efficient proofs of inclusion in massive sets (with millions or even billions of elements). We will explore this more in a future post.

Conclusion

  • Bulletproofs are general zero-knowledge proofs (like SNARKs)
  • They can be used to extend multiparty protocols such as multisignatures or Zero-Knowledge Contingent Payments
  • Bulletproofs provide a much more efficient version of CT rangeproofs (when batch verifying, over 23x speed improvement)
  • These rangeproofs can be aggregated within transactions with only logarithmic size increase.
  • With sufficient aggregation, such as in Provisions, batch verification becomes over 120x as fast as the old proofs.

We would like to thank Bootle et al. for developing the inner product argument that led to all of this, as well as Benedikt Bünz and Dan Boneh, our coauthors, who did the bulk of the inventive work. We would also like to thank Sean Bowe and Daira Hopwood for their research on optimizing arithmetic circuits.

All computations were done on a single core of an Intel i7-6820HQ CPU at 2.70GHz.

If you have specific preferences, please, mark the topic(s) you would like to read: