Proofs of Space (useful or not)

·

0 min read

Proofs of Work and their usefulness

gulag.jpeg

We all know about the concept of Proof of Work, Bitcoin being the most famous implementation. Proof of Work aims to show that the prover has certain computational power, or in other words - CPU. Bitcoin's PoW is actually "Proof of Useless Work", or "Proof of Wasted Work", or even "Proof of Energy". The computations of Bitcoin network do not try to derive any actual solution to any problem and can be discarded immediately after a block is mined. A step forward would be a "Proof of Useful Work", where computations serve some actual purpose. An example is Primecoin, where the network tries too look for large prime numbers. Participants still prove that they have computing power, but the results of computations don't go to waste.

Proofs of Space

galaxies.jpeg

Now we have a parallel idea in utilization of another resource: memory. These are called Proofs of Space. The prover aims to show that he has certain amount of memory. There are two obvious variants: Proof of Transient Space, aka "Proof of RAM" where the prover thas to have access to some amount of fast read-writable memory. Another variant would be Proof of Persistent Space, aka "Proof of Disk" where the data need to be written only once (and it doesn't need to be fast), but then they can be quickly read many times.

The cryptocurrency Filecoin employs Proof of Persistent Space. The prover first needs to write some data to disk, which can be very slow and take days - this is called the setup phase. Once it is done, the prover can quickly read the saved data. Special protocols ensure that he didn't generate the data ad hoc, but it is still the same file he created during the setup phase.

Just like Proof of Work - Proofs of Space can be useful or useless. Filecoin's proofs are of the useless variant. In fact, it is a "Proof of Unused Space", or "Proof of Wasted Space", or "Proof of Disk Capacity". The data are in fact written to unused partition space. On setup phase the unused partition space is filled with random data. If the user wants to reclaim this space and save some actual file, he must repeat the setup phase on the reduced capacity, which again takes days. On the positive side, it uses much less energy than Bitcoin's Proof of Work.

In Proofs of Space the limiting factor is the size of memory possession of we are trying to claim. Other than that they are very fast. They are not as CPU-hungry as Proofs of Work. In fact, when talking about the speed of generating a proof, the limiting factor is the memory access time. For RAM, reads and writes are fast, so the proof algorithm may use both of those operations. For disk, writes are slow and reads are relatively fast. Proof generation algorithms are split into a long setup phase that involves writes and then quick commitment phase which only reads data. In other words, provers can generate many Proofs of Space per second and are not time-limited.

Algorithms for PoSpace

Theory

To make a good Proof of Space we need a problem that exhibits very bad time-memory tradeoff. As we know, almost any problem that can be computed using some memory can also be solved using less memory - but more CPU. This is essentially the problem of caching. When we have computed a result of some function, we can save it to use later - which takes memory but saves CPU. Or we may disregard it and compute it again when we need it next time - which saves memory but takes more CPU.

To make a Proof of Space we need a problem where time-memory tradeoff makes saving memory impractical. Good news is such problems do exist. There are such algorithms that must call some subroutines many times. Now if we cache the results of those subroutines, the algorithm runs in polynomial time. If the algorithm doesn't cache them, it runs in exponential time. In the middle we have a situation when the algorithm tries to save only a bit of memory. Then we can measure a quality of our problem by checking how much more time the algorithm must use compared to the situation when it caches everything. The sharper the rise in CPU usage, the better.

Practice

pebbles.jpeg

One particular problem that is used in Proofs of Space is called "Graph Pebbling", or "Pebbling Game". It takes its name from a game where we place pebbles on a Directed Acyclic Graph and we can put a pebble on some vertex only when we placed pebbles on all its sources. Once used pebbles may be removed. We win when we reach the graph sink (or many sinks). Now each graph has a minimal number of pebbles that need to be used.

We can make an algorithm that is essentially a Pebbling Game where a pebble is a memory word and placing a pebble equates to hashing. We can calculate a hash for some vertex only if we know the hashes of all its sources. Since one pebble = one memory word, the minimal number of pebbles needed to solve a graph translates to the minimal number of memory words needed to run this algorithm. And this is the Proof of Space.

Graphs used in those algorithms must be selected from some family known to have the minimal necessary number of pebbles equal to the amount of memory we are trying to claim. Some graph families used in practical Proofs of Space are "superconcentrators" and "stacked expanders".

The operation of the algorithm starts with filling all the graph sources with some random data. Then for each vertex in the next layer we place a hash of the combined data from all its sources. Then the same for the next layer, until the end.

Now to generate a proof we just need to present some Merkle paths from our pebbled graph. The verifier can then check if the paths are valid and if they belong to the same graph.

This kind of Proof of Space algorithm is essentially a mode of operation of a hash function. We take a hash function and hash words of some data in certain way. As with any hashing we are able to prove that the hashed data were equal, but the original data itself is lost.

Useful data?

As we all know, any good encryption algorithm may be used as a hash function. This gives an idea to make a Proof of Space on some useful data.

We can use the standard AES cipher. First we fix some key. It may be made public, as it is used only for hashing. Then we use the same algorithm as previously, but whenever we need to hash something, we encrypt it with AES instead.

The resulting algorithm is the same and in the end we arrive at some ostensibly random data. But this time we used a cipher not a hash. We can essentially run a pebbling game backwards, this time using decryption. Finally we will restore the original data again.

As the previous algorithm was a mode of operation of a hash function, this one is a mode of operation of a cipher. The previous algorithm let us to hash some data and lose them in the process unless we save it somewhere wasting precious memory. This algorithm lets us encrypt some data in place, then decrypt. The encryption (and decryption too) takes the amount of memory proportional to the original file, so it can be used as Proof of Space. After we have generated the proof, we can decrypt the file and use it normally.

The key in this algorithm has to be made public, but this problem may be mitigated by using Public-Key Cryptography.

Summary

Proofs of Space may find their way in various security protocols like blockchains and cryptocurrencies. They are much more efficient in terms of energy. There is no benefit in using any ASIC, as they are relatively light on CPU. The distribution of memory among the people is much fairer than the distribution of computational power.

The drawback of Proofs of Space is that the proof size is larger than in Proof of Work. While the commitment in PoW is essentially one number, PoSpace need to present a set of Merkle paths. The size of the proof is still logarithmic in the size of the memory to claim.

In summary, Proofs of Space are a promising new technique to use in various cryptographic protocols.