Introduction

<aside> 💡

Details of mathematical derivations, with references to literature, and additional numerical results are in the Appendix.

</aside>

We consider anonymous communication system which uses a layered network of mix nodes, i.e. the mix network. We assume that nodes are used in the mix network are sampled from $N$ nodes of a network where $M$ nodes are adversarial. Presence of adversarial nodes in a communication path of the mix network can cause anonymity or communication failure. For anonymity failure to occur in a communication path all mix nodes on this path have to be adversarial and such path is compromised. In a mix network with $L$ layers naive reasoning (e.g. see the article) gives $(M/N)^L$ for the fraction of compromised paths. The latter is correct for a mix network of infinite width, but when the width is finite the fraction of compromised paths can be significantly larger than $(M/N)^L$. In this document we study statistical properties of the fraction compromised paths. Result of this work allows us to estimate the probability that the fraction of compromised paths is greater than some threshold. The latter can be used to optimise layered mix networks.

Assumptions

To construct the mixnet we sample $n$ nodes (without replacement) from the population of $N$ nodes.
At most $M$ , where $M<N$, nodes in the population are adversarial.
The number of adversarial nodes $m$ in the mixnet is random number from the hypergeometric probability distribution $P\left(m\vert N,n,M\right)=\frac{{n\choose m}{N-n\choose M-m}}{{N\choose M}}$
We have $L$ layers constructed from $n=\sum_{\mu=1}^L n_\mu$ nodes, where $n_\mu$ is the number of nodes in the layer $\mu$.
The probability that the number of adversarial nodes in layer 1 is $m_1$, in layer 2 is $m_2$, etc., is given by the multivariate hypergeometric distribution $P\left(m_{1},\ldots, m_{L}\vert n_{1},\ldots,n_{L}; m\right)=\frac{\delta_{m;\sum_{\mu=1}^L m_{\mu}}\prod_{\mu=1}^L {n_\mu\choose m_\mu}}{{n\choose m}},$
```
  where $\\delta_{x;y}$ in above is the [Kronecker delta function](<https://mathworld.wolfram.com/KroneckerDelta.html>). 
```
From above follows the joint probability distribution $P\left(m_{1},\ldots, m_{L}\vert n_{1},\ldots,n_{L}\right)=\sum_{m=0}^nP\left(m_{1},\ldots, m_{L}\vert n_{1},\ldots,n_{L}; m\right)P\left(m\vert N,n,M\right)$

Failures

Anonymity failure occurs in the mix network when at least one adversarial node is present in each layer thus allowing to form a path of compromised nodes.
The probability of anonymity failure is given by $P\left(m_{1}\geq1,\ldots, m_{L}\geq1\vert n_{1},\ldots,n_{L}\right)$

Screenshot 2024-03-04 at 12.49.16.png

Communication failure occurs in the mix network when the number of adversarial nodes is greater than some fraction $A$ of nodes in at least one layer. A layer with all (or a large number) of adversarial (or honest but with limited resources) nodes could prevent (or considerably slow down) communication between users.
We are interested in probability of the event $\cup_{\mu=1}^L\{m_\mu\geq \lfloor An_\mu\rfloor\}$, i.e. in at least one layer of the mix network the number of adversarial nodes is greater than some fraction $A\in(0,1]$ of nodes in that layer.