Introduction
This document analyzes five scenarios for data sampling and reconstruction in NomosDA, focusing on column sampling vs. chunk sampling and how changing the code structure (1-D RS, bivariate RS, or full two-dimensional expansion) affects:
- Attack thresholds (how much data must be erased to break recoverability)
- Sample counts required for light clients to detect unavailability with high probability
- Per-DA node storage requirements
- Bandwidth for DA nodes during sampling
- Computation costs (especially for two-dimensional interpolation during decoding)
By systematically comparing these scenarios, we can decide which layout offers the best trade-off between fault tolerance, sampling efficiency, and storage overhead.
1. NomosDA v1 – Column Sampling (Baseline)
Setup
- Data is arranged as a $k\times 2n$ matrix ($32\times 2048$ cells, 1 MB at rate 1/2).
- Each column corresponds to one DA subnetwork.
- Light clients sample entire columns at random.
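As a sanity check on the stated figures, a minimal sketch below derives the implied per-cell size. Note that the 16-byte cell size is an inference from dividing 1 MB by the $32\times 2048$ cell count; the document does not state it explicitly.

```python
# Baseline matrix layout: k rows, 2n columns (one DA subnet per column).
k, two_n = 32, 2048
cells = k * two_n                 # encoded cells in the matrix
total_bytes = 2 ** 20             # 1 MB (MiB) as stated in the text
cell_size = total_bytes // cells  # implied per-cell size (assumption)
print(cells, cell_size)           # 65536 cells of 16 bytes each
```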
Adversary Model
- The adversary must withhold at least $n+1$ of the $2n$ columns (just over 50%) to make recovery impossible.
- Bad fraction:
$f_{\text{bad}}=\frac{n+1}{2n}\approx0.5.$
Detection Probability
- Light clients pick $s$ columns uniformly at random without replacement. Treating each sample as an independent coin flip (the with-replacement approximation) gives
$P_{\text{detect}}\approx1-(1/2)^s.$
- Sampling without replacement only improves detection, so the false-accept bound $(1/2)^s$ is conservative: at $s=20$, $(1/2)^{20}=1/1{,}048{,}576\approx9.5\times10^{-7}$ (≈1 in 1.05 million).
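The approximation above can be compared against the exact without-replacement (hypergeometric) probability; a minimal sketch, assuming n = 1024 and an adversary withholding exactly $n+1$ columns:

```python
from math import comb

def false_accept_approx(s: int) -> float:
    # With-replacement approximation: each sample independently
    # lands on an available column with probability ~1/2.
    return 0.5 ** s

def false_accept_exact(s: int, n: int = 1024) -> float:
    # Without replacement: all s sampled columns must fall among the
    # n - 1 available columns out of 2n total (n + 1 withheld).
    return comb(n - 1, s) / comb(2 * n, s)

s = 20
print(false_accept_approx(s))  # ~9.5e-07, i.e. 1 in 2^20
print(false_accept_exact(s))   # slightly smaller: the approx is conservative
```

The exact value is strictly below $(1/2)^{20}$ because each successive draw removes an available column, so quoting $(1/2)^s$ slightly overstates the adversary's chance of escaping detection.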
Per-DA Node Storage