Chapter 5.6 of https://arxiv.org/pdf/1809.09044

Setup and Assumptions

Original Data Structure:

Distribution:

Chunk Sampling Mechanism:

Column Sampling Mechanism:

Main formula: $P(X \geq 1) = 1 - \prod_{i=0}^{s-1} \left( 1 - \frac{\text{number of unavailable chunks} + 1}{\text{number of whole chunks} - i} \right)$
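
The main formula can be evaluated numerically with a short loop over the $s$ draws. The following is a minimal sketch, not code from the paper: the function name `prob_at_least_one_unavailable` and its parameter names are hypothetical, and it assumes the caller passes the full numerator of the fraction (including the +1) as `numerator`, the total number of chunks as `total_chunks`, and $s$ as `samples`.

```python
def prob_at_least_one_unavailable(numerator: int, total_chunks: int, samples: int) -> float:
    """Evaluate P(X >= 1) = 1 - prod_{i=0}^{s-1} (1 - numerator / (total_chunks - i)).

    numerator    -- the full numerator of the fraction (assumed to already include the +1)
    total_chunks -- the number of whole chunks being sampled from
    samples      -- s, the number of distinct chunks drawn without replacement
    """
    if not 0 <= numerator <= total_chunks:
        raise ValueError("numerator must be between 0 and total_chunks")
    if not 0 <= samples <= total_chunks:
        raise ValueError("samples must be between 0 and total_chunks")

    p_all_available = 1.0
    for i in range(samples):
        factor = 1.0 - numerator / (total_chunks - i)
        if factor <= 0.0:
            # Every remaining chunk is unavailable, so a hit is certain.
            return 1.0
        p_all_available *= factor
    return 1.0 - p_all_available


if __name__ == "__main__":
    # Illustrative values only: numerator 2, 4 chunks in total, 2 samples.
    print(prob_at_least_one_unavailable(2, 4, 2))
```

Each factor in the product is the probability that draw $i$ lands on an available chunk given that all earlier draws did, so the product is the probability that every sample is available, and its complement is $P(X \geq 1)$.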

Chunk Sampling Analysis

Unrecoverability Threshold

For chunk sampling:

If 50% of each row's chunks are unavailable:

Probability of Sampling an Unavailable Chunk

Using the formula:

$P_{\text{chunk}}(X \geq 1) = 1 - \prod_{i=0}^{s-1} \left( 1 - \frac{n+1}{k \cdot 2n - i} \right)$

Where: