Context and motivation

A Groth16 batch-proving optimization was implemented under the assumption that most witness entries remain unchanged across proofs, with only a small “index-dependent” slice varying. In our PoQ setting the witness differences are confined to a small set of indices, so roughly 95% of witness scalars are identical across proofs.

Since the dominant components of Groth16 proof generation include large multi-scalar multiplications (MSMs) that are linear in the witness, the initial expectation was that we could reuse ~95% of the MSM work by caching the contribution of the invariant witness portion and recomputing only a small MSM “delta” per proof. Under that intuition, additional proofs would mostly cost a few group additions plus fresh randomizers, suggesting a potentially large speedup.

In practice, the observed improvement (e.g., ~2×–3×) can be significantly smaller than what the “% of witness unchanged” or “% MSM reusable” might suggest. This note explains the mathematical reason: while some MSMs are linear and reusable, Groth16 proving also includes a nonlinear QAP/quotient computation whose dominant work remains essentially per-proof.
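One way to see why: by an Amdahl's-law argument, the achievable speedup is capped by the per-proof work that cannot be cached. The numbers below are purely illustrative assumptions (`msm_share` is hypothetical, not a measurement from our prover):

```python
# Amdahl-style bound: only the cacheable MSM fraction can be skipped.
msm_share = 0.60   # hypothetical: fraction of prover time spent in MSMs
reusable = 0.95    # fraction of MSM work identical across proofs (witness overlap)
ideal_speedup = 1 / (1 - msm_share * reusable)
print(f"ideal speedup ~ {ideal_speedup:.2f}x")  # → ideal speedup ~ 2.33x
```

Under these illustrative numbers the bound already lands in the observed ~2×–3× range, even before accounting for the quotient computation discussed below.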

What Groth16 is really computing (R1CS → QAP)

Given an R1CS instance, Groth16 reduces satisfiability to a QAP identity:

$$ A(X)\cdot B(X) - C(X) \;=\; H(X)\cdot Z(X), $$

where $Z(X)$ is the vanishing polynomial of the constraint-evaluation domain, and $A(X) = \sum_i w_i A_i(X)$ (similarly $B$, $C$) for fixed polynomials $A_i, B_i, C_i$ determined by the circuit.
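The identity can be checked on a toy instance. The sketch below (plain Python over the rationals; the two-point domain, the constraint values, and all polynomial helpers are illustrative, not from the implementation) interpolates $A, B, C$ from per-constraint values and verifies that $A\cdot B - C$ divides exactly by $Z$:

```python
from fractions import Fraction

def poly_add(p, q):
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0) for i in range(n)]

def poly_sub(p, q):
    return poly_add(p, [-x for x in q])

def poly_mul(p, q):
    r = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r

def poly_divmod(num, den):
    # Long division; coefficients are listed lowest degree first.
    num = list(num)
    out = [Fraction(0)] * max(len(num) - len(den) + 1, 0)
    while len(num) >= len(den):
        k = len(num) - len(den)
        coef = Fraction(num[-1]) / den[-1]
        out[k] = coef
        for i, d in enumerate(den):
            num[k + i] -= coef * d
        num.pop()
    while num and num[-1] == 0:   # strip trailing zeros of the remainder
        num.pop()
    return out, num

def interpolate(points):
    # Lagrange interpolation through (x, y) pairs.
    result = [Fraction(0)]
    for i, (xi, yi) in enumerate(points):
        term = [Fraction(yi)]
        for j, (xj, _) in enumerate(points):
            if i != j:
                term = poly_mul(term, [Fraction(-xj, xi - xj), Fraction(1, xi - xj)])
        result = poly_add(result, term)
    return result

# Toy "constraint system": at every domain point x, A(x) * B(x) == C(x) must hold.
A = interpolate([(1, 2), (2, 3)])    # A(1)=2, A(2)=3
B = interpolate([(1, 3), (2, 4)])    # B(1)=3, B(2)=4
C = interpolate([(1, 6), (2, 12)])   # C(1)=2*3, C(2)=3*4 -> constraints satisfied
Z = poly_mul([Fraction(-1), Fraction(1)], [Fraction(-2), Fraction(1)])  # (X-1)(X-2)

H, rem = poly_divmod(poly_sub(poly_mul(A, B), C), Z)
print("remainder:", rem)  # → remainder: []  (exact division, as the identity requires)
```

If any constraint were violated (say $C(2) = 11$), the division would leave a nonzero remainder and no valid $H$ would exist.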

Key point (linearity vs nonlinearity)

Although $A(X),B(X),C(X)$ are linear in the witness $w$, the quotient

$$ H(X) = \frac{A(X)B(X) - C(X)}{Z(X)} $$

is not linear in the witness $w$: the product term $A(X)\,B(X)$ makes the numerator quadratic in $w$. This nonlinearity is the core reason why “reusing 95% of MSMs” does not automatically remove 95% of total prover cost.
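A scalar example makes the nonlinearity concrete. For a witness update $w \to w + d$, expanding the product gives cross terms $A(w)B(d) + A(d)B(w)$ that a naive “cached value plus delta” scheme misses. The sketch below (illustrative coefficient and witness values, not real circuit data) checks the expansion:

```python
# A and B are linear in the witness, but their product is quadratic:
# A(w+d) * B(w+d) = A(w)B(w) + A(w)B(d) + A(d)B(w) + A(d)B(d)
a = [1, 2, 3]          # hypothetical linear-combination coefficients for A
b = [4, 5, 6]          # ... and for B
w = [7, 8, 9]          # base witness
d = [0, 0, 5]          # sparse delta: only one index changes

dot = lambda u, v: sum(x * y for x, y in zip(u, v))
w2 = [x + y for x, y in zip(w, d)]

lhs = dot(a, w2) * dot(b, w2)
naive = dot(a, w) * dot(b, w) + dot(a, d) * dot(b, d)   # "cached + delta" guess
cross = dot(a, w) * dot(b, d) + dot(a, d) * dot(b, w)   # the terms it misses
assert lhs != naive
assert lhs == naive + cross
print("missed cross terms:", cross)  # → missed cross terms: 3330
```

The cross terms couple the unchanged witness portion with the delta, so the quotient-side work cannot be decomposed into a cached part plus a small per-proof part the way the MSMs can.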

The part that can be reused: linear MSMs

Groth16 proof elements contain linear combinations of witness scalars $w_i$ with fixed bases $G_i$ from the proving key, conceptually of the form:

$$ P \;=\; \sum_i w_i \cdot G_i. $$

If only a small set of witness indices $D$ changes between proofs (witness $w \to w'$), then the decomposition is exact:

$$ \sum_i w'_i \cdot G_i \;=\; \underbrace{\sum_i w_i \cdot G_i}_{\text{cached}} \;+\; \sum_{i \in D} (w'_i - w_i)\cdot G_i. $$
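This exact delta reuse can be sketched with the additive group $\mathbb{Z}_p$ standing in for an elliptic-curve group (scalar-times-base is just modular multiplication here; the names `msm`, `bases`, and the sizes are illustrative):

```python
import random

# Toy stand-in for a curve group: the additive group Z_p.
p = 2**61 - 1                    # toy modulus, not a real curve order
random.seed(0)

n = 1000
bases = [random.randrange(p) for _ in range(n)]   # fixed proving-key bases G_i
w = [random.randrange(p) for _ in range(n)]       # witness for proof 1

def msm(scalars, bases):
    return sum(s * g for s, g in zip(scalars, bases)) % p

cached = msm(w, bases)                            # full MSM, computed once

# Proof 2: only indices in D change.
D = [3, 17, 42]
w2 = w[:]
for i in D:
    w2[i] = random.randrange(p)

# |D| scalar multiplications instead of n:
delta = sum((w2[i] - w[i]) * bases[i] for i in D) % p
assert (cached + delta) % p == msm(w2, bases)     # exact, no approximation
```

This is why the linear MSM portion of the prover does scale with $|D|$ rather than $n$; the quotient computation above does not, which caps the end-to-end speedup.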