Introduction
This document analyzes five scenarios for data sampling and reconstruction in NomosDA, focusing on column sampling vs. chunk sampling and how changing the code structure (1-D RS, bivariate RS, or full two-dimensional expansion) affects:
- Attack thresholds (how much data must be erased to break recoverability)
- Sample counts required for light clients to detect unavailability with high probability
- Per-DA node storage requirements
- Bandwidth for DA nodes during sampling
- Computation costs (especially for two-dimensional interpolation during decoding)
By systematically comparing these scenarios, we can decide which layout offers the best trade-off between fault tolerance, sampling efficiency, and storage overhead.
1. NomosDA v1 – Column Sampling (Baseline)
Setup
- Data is arranged as a $k\times 2n$ matrix ($32\times 2048$ cells, 1 MB at rate 1/2).
- Each column corresponds to one DA subnetwork.
- Light clients sample entire columns at random.
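As a sanity check on the stated figures, a minimal sketch below derives the implied per-cell size. Note that the 16-byte cell size is an inference from dividing 1 MB by the $32\times 2048$ cell count; the document does not state it explicitly.

```python
# Baseline matrix layout: k rows, 2n columns (one DA subnet per column).
k, two_n = 32, 2048
cells = k * two_n                 # encoded cells in the matrix
total_bytes = 2 ** 20             # 1 MB (MiB) as stated in the text
cell_size = total_bytes // cells  # implied per-cell size (assumption)
print(cells, cell_size)           # 65536 cells of 16 bytes each
```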
Adversary Model
- The adversary must withhold at least $n+1$ of the $2n$ columns (just over 50%) to make recovery impossible.
- Bad fraction:
$f_{\text{bad}}=\frac{n+1}{2n}\approx0.5.$
Detection Probability
- Light clients pick $s$ columns uniformly at random without replacement. Treating each sample as an independent coin flip (the with-replacement approximation) gives
$P_{\text{detect}}\approx1-(1/2)^s.$
- Sampling without replacement only improves detection, so the false-accept bound $(1/2)^s$ is conservative: at $s=20$, $(1/2)^{20}=1/1{,}048{,}576\approx9.5\times10^{-7}$ (≈1 in 1.05 million).
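The approximation above can be compared against the exact without-replacement (hypergeometric) probability; a minimal sketch, assuming n = 1024 and an adversary withholding exactly $n+1$ columns:

```python
from math import comb

def false_accept_approx(s: int) -> float:
    # With-replacement approximation: each sample independently
    # lands on an available column with probability ~1/2.
    return 0.5 ** s

def false_accept_exact(s: int, n: int = 1024) -> float:
    # Without replacement: all s sampled columns must fall among the
    # n - 1 available columns out of 2n total (n + 1 withheld).
    return comb(n - 1, s) / comb(2 * n, s)

s = 20
print(false_accept_approx(s))  # ~9.5e-07, i.e. 1 in 2^20
print(false_accept_exact(s))   # slightly smaller: the approx is conservative
```

The exact value is strictly below $(1/2)^{20}$ because each successive draw removes an available column, so quoting $(1/2)^s$ slightly overstates the adversary's chance of escaping detection.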
Per-DA Node Storage