Owner: @Alexander Mozeika
Reviewer: š¢@Marvin Jones š¢@Ćlvaro Castro-Castilla
Introduction
One of possible approaches to design a reliable anonymous communication (AC) system is to reduce statistical correlations between communicating nodes. Here we model a network of communicating nodes as a probabilistic discrete-state cellular automata (CA). We consider a node-centred approach where a node has associated with it variable representing its discrete state, such as sending, receiving, etc. Also we suggest a more granular connection-centred approach where discrete states of communication links of a node are considered. We note that message-centred approach is also possible but not pursued here. Finally, we discuss functions which can be used to quantify correlations in empirical analysis of AC systems.
The ācellular automataā (CA) model
- The system we consider is a network of communicating nodes where nodes are labelled by the set $[N]=\{1,\ldots,N\}$.
- We assume that nodes receive and send messages and these messages are indistinguishable, i.e. it is either impossible to observe bitstreams of messages, or incoming and outgoing messages are bitwise uncorrelated.
- The node $i\in[N]$ at time $t$ can be in the state of either sending (message) or receiving (message) or inactive, i.e. neither sending or receiving. The latter is modelled by the variable $S_i(t)\in\{-1,0,1\}$ as follows
$S_i(t)$ |
Node $i$ at time $t$ is |
-1 |
sending a message |
0 |
inactive |
1 |
receiving a message |
- We note that a node can be in more states, for example in addition to sending, receiving, and inactive it could have an additional state of simultaneous sending and receiving, i.e. āsend-receiveā state. Additional states c can be modelled by extending the alphabet from which $S_i(t)$ takes its values, i.e. $S_i(t)\in\{1,2,\ldots,q\}$ for the most general case.
- The vector $\mathbf{S}(t)=(S_1(t),\ldots, S_N(t))$ is the state of the network at time $t$ and for $t\in\{t_0,t_1,\ldots,t_F\}$, where $t_0<t_1<\ldots<t_F$, the (ordered by time) set of vectors $\mathbf{S}(t_0), \mathbf{S}(t_1),\ldots, \mathbf{S}(t_F)$ is the path, through the state-space $\{-1,0,1\}^N$, taken by the system from the time $t_0$ to the time $t_F$. The latter can be represented by a table (or matrix) as in the example below obtained from simulations.
![The state of the network as a function of time. The node $i\in [N]$ at time $t$, represented by dot, is either sending (red dot) or receiving (blue dots) or inactive (white dot). All $N$ nodes are sending messages through $k$ nodes with $k=3$.](https://prod-files-secure.s3.us-west-2.amazonaws.com/1518abd9-c08f-4989-93c1-96525e62bce5/12939053-ffb4-4900-a536-9604a2b8343f/Screenshot_2024-05-17_at_18.11.48.png)
The state of the network as a function of time. The node $i\in [N]$ at time $t$, represented by dot, is either sending (red dot) or receiving (blue dots) or inactive (white dot). All $N$ nodes are sending messages through $k$ nodes with $k=3$.
- Here we expect that dynamics of the network state $\mathbf{S}(t+\Delta t)$ is Markovian, i.e. depends only on $\mathbf{S}(t)$, and can be described by the probability $\mathrm{P}(\mathbf{S}(t))$. We note if the latter is factorises, i.e. $\mathrm{P}(\mathbf{S}(t))=\prod_{i=1}^N \mathrm{P}_i(\mathbf{S}(t))$, for all $t$ then nodes are uncorrelated and āobservingā any given node doesnāt reveal any information about the other node/nodes.
- To take this research route further would require to derive master equation for $\mathrm{P}(\mathbf{S}(t))$, to derive and analyse equations for correlation functions, etc.
Empirical analysis of correlations in CA model
$$
\delta_{S;S_i(t)}=\Big\{ \begin{array}{c}
1, S=S_i(t) \\
0, S\neq S_i(t)
\end{array}
$$
- The sum $\sum_{t\in \mathcal{T}}\delta_{S;S_i(t)}$ counts how many times node i was in state $s$ on the (ordered) set of times $\mathcal{T}=\{t_0,t_1,\ldots\}$, where $\vert\mathcal{T}\vert=T$. Additionally, the latter can be used to define the (empirical) frequency $\hat{P}i(S)=\frac{1}{T}\sum{t\in\mathcal{T}}\delta_{S;S_i(t)}$.
- The sum $\sum_{i=1}^N\delta_{S;S_i(t)}$ counts number of nodes in the network which are in state $s$ at time $t$ and can be used to define the (empirical) frequency $\hat{P}t(S)=\frac{1}{N}\sum{i=1}^N\delta_{S;S_i(t)}$.