Data Cheat Sheet
Blob and Sample Sizes
- Blob data is represented as blob_size bytes.
- These bytes are interpreted as scalar field elements of the BLS12-381 elliptic curve, in batches of 31 bytes. This means that blob_elements = blob_size / 31.
- We use a Reed-Solomon encoding with a $1/2$ rate, that is, encoded_elements = 2 × blob_elements.
- There are 2048 subnetworks, so the encoded data must be distributed equally across 2048 encoded samples, a sample being composed of one element of each code word. One sample therefore holds encoded_elements / 2048 elements, i.e. sample_size = blob_size / 1024 bytes. Moreover, we will assume that blob_size is a power of 2.
- The encoding is then done on 2048-element code words. To be proven, this requires one KZG commitment per code word, so the number of KZG commitments equals the number of elements in one sample.
- The encoding is proven with KZG, and each sample comes with a single aggregated KZG proof.
- The encoding is verified using KZG. Both the KZG commitment and proof are G1 elements, each 48 bytes when compressed.
- One sample and its verification material are then composed of sample_elements = sample_size / 31 scalar field elements, sample_elements commitments, and 1 proof, for a total of sample_elements × (31 + 48) + 48 bytes (see the sketch after this list).
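As a sanity check on the size arithmetic above, here is a minimal sketch; the function name sample_size_cheatsheet is hypothetical, divisions are kept as exact ratios, and padding of the blob to whole 31-byte elements is ignored.

```python
def sample_size_cheatsheet(blob_size: int) -> dict:
    """Reproduce the cheat-sheet arithmetic for a blob of blob_size bytes.

    Hypothetical helper for illustration: divisions are kept as exact ratios,
    padding of the blob to whole 31-byte elements is ignored.
    """
    ELEMENT_BYTES = 31       # data bytes carried by one BLS12-381 scalar
    G1_BYTES = 48            # compressed G1 point (commitment or proof)
    SUBNETWORKS = 2048       # number of subnetworks, i.e. samples per blob

    blob_elements = blob_size / ELEMENT_BYTES          # scalar field elements
    encoded_elements = 2 * blob_elements               # rate-1/2 Reed-Solomon
    sample_elements = encoded_elements / SUBNETWORKS   # one element per code word
    sample_size = ELEMENT_BYTES * sample_elements      # = blob_size / 1024 bytes

    # One sample plus its verification material:
    # sample_elements scalars, sample_elements commitments and 1 proof.
    sample_with_proofs = sample_elements * (ELEMENT_BYTES + G1_BYTES) + G1_BYTES

    return {
        "blob_elements": blob_elements,
        "encoded_elements": encoded_elements,
        "sample_elements": sample_elements,
        "sample_size_bytes": sample_size,
        "sample_plus_verification_bytes": sample_with_proofs,
    }


# Example: a 1 MiB blob (2**20 bytes) gives sample_size = 1024 bytes.
print(sample_size_cheatsheet(2**20))
```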
Security Parameters and Assumptions
- Strictly more than 50% of the subnetworks have at least one honest node.
- If an adversary withholds more than 50% of the data (so that it cannot be decoded), each sampling request hits withheld data with probability at least 1/2, so after 20 samplings the withholding is detected with probability at least 1 - 2^-20. Each consensus node performs this sampling (see the sketch below).
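A minimal sketch of the detection bound above, assuming each sampling request misses the withheld data independently with probability (1 - withheld_fraction); the function name detection_probability is hypothetical.

```python
def detection_probability(withheld_fraction: float, num_samplings: int = 20) -> float:
    """Probability that at least one of num_samplings independent sampling
    requests lands on withheld data.

    Hypothetical helper for illustration: assumes each request misses the
    withheld part with probability (1 - withheld_fraction).
    """
    miss_all = (1.0 - withheld_fraction) ** num_samplings
    return 1.0 - miss_all


# Withholding 50% of the data is detected after 20 samplings with
# probability 1 - 2**-20; withholding more only increases this bound.
print(detection_probability(0.5, 20))   # 0.9999990463256836
print(1 - 2 ** -20)                     # same value
```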
Limiting factors
- Maximum data storage of DA nodes.
- Maximum bandwidth of DA nodes dedicated to DA. This includes receiving data from encoders as well as answering sampling requests from consensus nodes and light nodes.
- Maximum execution capacity of the encoder.
- Maximum execution capacity of the consensus nodes that validate the samples.
Reducing the data to only one row