I have been reading Vitalik's series on STARKs recently (Part 1, 2 and 3). It is a nice and very understandable read for a layman like me.
Brutal summary of my current understanding
Vitalik outlines the following technique to prove the correctness of some arbitrary computation:

- Encode the computation trace in the values of a polynomial `P(x)`.
- Define a constraint-checking polynomial `C(z)` such that, if the computation is correct, we will have `C(P(x)) = 0` for all values of `x` in the computation range. I will use `Q` to denote the degree of the composed polynomial `C(P(x))`. (In general, `C` can take multiple arguments, e.g., `C(z1, z2, z3) = z3 - z2 - z1` checks the correct computation of a Fibonacci sequence when we require `C(P(x), P(x + 1), P(x + 2)) = 0`.)
- Because `C(P(x))` must be `0` for all `x` in `1..N`, we define `Z(x) = (x - 1) (x - 2) .. (x - N)`: `C(P(x))` must be equal to `Z(x)` multiplied by some `D(x)`. `D` is relatively easy to compute from `C(P(_))` and `P`.
- The prover commits to a table of the first `M` values of both `P` and `D`, with `M >> Q` (e.g., `M = 100 Q`).
- The verifier probes the table at multiple values of `x`, checking `C(P(x)) = Z(x) D(x)` at each point, to be convinced that the computation was executed correctly.
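The steps above can be sketched end-to-end for the Fibonacci case. This is only a toy illustration of the `C(P(x)) = Z(x) D(x)` identity: it uses exact rational arithmetic over the integers instead of a finite field, and the prover here just hands over the polynomials instead of committing to evaluations, so all helper names (`interpolate`, `p_divmod`, etc.) are my own, not anything from the articles:

```python
from fractions import Fraction

def p_add(a, b):
    n = max(len(a), len(b))
    return [(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0)
            for i in range(n)]

def p_sub(a, b):
    return p_add(a, [-c for c in b])

def p_mul(a, b):
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def p_eval(p, x):
    acc = Fraction(0)
    for c in reversed(p):
        acc = acc * x + c
    return acc

def interpolate(pts):
    # Lagrange interpolation through (x, y) pairs, over the rationals
    res = [Fraction(0)]
    for i, (xi, yi) in enumerate(pts):
        term = [Fraction(yi)]
        for xj, _ in pts[:i] + pts[i + 1:]:
            term = p_mul(term, [Fraction(-xj, xi - xj), Fraction(1, xi - xj)])
        res = p_add(res, term)
    return res

def p_divmod(num, den):
    # polynomial long division: returns (quotient, remainder)
    num = [Fraction(c) for c in num]
    q = [Fraction(0)] * (len(num) - len(den) + 1)
    for i in range(len(num) - len(den), -1, -1):
        q[i] = num[i + len(den) - 1] / den[-1]
        for j, dj in enumerate(den):
            num[i + j] -= q[i] * dj
    return q, num

N = 8                           # toy trace length
trace = [1, 1]
while len(trace) < N:
    trace.append(trace[-1] + trace[-2])

# P interpolates the trace: P(x) = trace[x - 1] for x in 1..N
P = interpolate([(x, trace[x - 1]) for x in range(1, N + 1)])

def shifted(p, s):
    # p(x + s), recovered by re-interpolating shifted evaluations
    return interpolate([(x, p_eval(p, x + s)) for x in range(1, N + 1)])

# C(P(x), P(x+1), P(x+2)) = P(x+2) - P(x+1) - P(x)
CP = p_sub(p_sub(shifted(P, 2), shifted(P, 1)), P)

# Z(x) = (x - 1)(x - 2)..(x - (N - 2)): the constraint holds for x in 1..N-2
Z = [Fraction(1)]
for k in range(1, N - 1):
    Z = p_mul(Z, [Fraction(-k), Fraction(1)])

# D = C(P(x)) / Z(x); the division is exact iff the trace obeys the rule
D, rem = p_divmod(CP, Z)
assert all(c == 0 for c in rem)

# a "verifier" spot-check at a point far outside the trace domain
x0 = 12345
assert p_eval(CP, x0) == p_eval(Z, x0) * p_eval(D, x0)
```

The exactness of the division in the middle is the crux: if any trace entry broke the Fibonacci rule, the remainder would be nonzero and no valid `D` would exist.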
So far so good. I think I understand. Vitalik goes into detail explaining two specific checks that can be done with constraint-checking polynomials:
- Range checking: `0 <= P(x) <= H` for all `x` in `1..N`, with `C(z) = z (z - 1) (z - 2) .. (z - H)`.
- Fibonacci computation, with `C(z1, z2, z3) = z3 - z2 - z1`.
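The range-check constraint is easy to sanity-check numerically: `C(z)` vanishes exactly on the allowed values. A quick sketch over plain integers, with a hypothetical bound `H = 5`:

```python
# C(z) = z (z - 1) .. (z - H) is zero exactly when z is an integer in 0..H
H = 5

def C(z):
    prod = 1
    for k in range(H + 1):
        prod *= (z - k)
    return prod

assert all(C(z) == 0 for z in range(0, H + 1))   # in-range values pass
assert C(-1) != 0 and C(H + 1) != 0              # out-of-range values fail
```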
My random attempt at generalization
Sadly, Vitalik's explanation ends there, short of explaining how STARKs can be generalized to prove arbitrary computation. I have an intuition of how it could be done, but I don't see how it could be done efficiently.
For example, I thought we could define two "operand polynomials" `L(x)` and `R(x)`, then:

- Require `P(x)` to be either `0` or `1` for all `x` in `1..N` (this can easily be done by range checking, which Vitalik already explains how to do).
- Additionally require `(1 - (P(L(x)) P(R(x)))) - P(x) = 0` for all `x` in `1..N`.

In doing so, we require each `P(x)` to be equal to `P(L(x)) NAND P(R(x))`. Because any logic circuit can be built out of `NAND`s, a suitable choice of `L` and `R` should make any computation provable.
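The arithmetic identity behind this is that, for bit-valued `a` and `b`, `1 - a*b` computes `a NAND b`; a quick truth-table check:

```python
# for bits a, b: 1 - a*b == NAND(a, b), so the constraint
# (1 - P(L(x)) P(R(x))) - P(x) = 0 enforces P(x) = P(L(x)) NAND P(R(x))
for a in (0, 1):
    for b in (0, 1):
        assert 1 - a * b == int(not (a and b))
```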
The big issue with this idea, however, is that the degree `Q` of the composition between `P` and the constraint-checking polynomial here is huge, on the order of `N^2`. Complexity for the prover would become unmanageable quite quickly.
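The `N^2` blow-up comes from the fact that composing polynomials multiplies their degrees: `deg P(L(x)) = deg P * deg L`, and here both `P` and `L` have degree roughly `N`. A quick check with toy degrees (the `compose` helper is hypothetical, just for illustration):

```python
def p_mul(a, b):
    # product of two polynomials given as coefficient lists (lowest degree first)
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def compose(p, q):
    # p(q(x)) via Horner's rule over polynomials
    res = [p[-1]]
    for c in reversed(p[:-1]):
        res = p_mul(res, q)
        res[0] += c
    return res

P = [0, 0, 0, 1]      # x^3
L = [0, 0, 0, 0, 1]   # x^4
# degree of the composition is the product of the degrees: 3 * 4 = 12
assert len(compose(P, L)) - 1 == 12
```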
Where do I read next?
I am wondering where I can read up on how to define a constraint-checking polynomial that can actually verify any computation. The whole "accessing the memory at random locations" bit seems very tricky to me, and I'm eager to learn further.