Owner: @Álvaro Castro-Castilla

<aside> ℹ️ TLDR; If not interested in the background or alternative solutions, jump to solution 2, but keep in mind that its viability is not fully determined yet.

</aside>

The Challenge of Efficient Private Addressing

As we know by this time, privacy in PoS systems is not a trivial problem. Currently no PoS blockchain provides full stake privacy. Timing attacks and network observation strategies that aren't that hard or expensive to perform allow creating fairly accurate maps of the stake distribution in the network. This is essentially what we are trying to prevent with this protocol. The problem needs to be tackled holistically, emphasizing that the network cannot hide all identifiable traffic patterns unless a very high cost is incurred (rendering our target requirements for validators unachievable).

Part of the problem is tackled in the previous document. Mixnets and to a limited extent, broadcasting, help with sender/receiver unlinkability, which is another important component for privacy preservation in PPoS. Additionally, mixnets introduce noise (cover traffic) that helps by smoothing out otherwise distinct communication patterns between specific roles in the protocol (although as mentioned above can become prohibitively expensive if done naively). However, the use of a Mixnet does not solve the problem of private addressing. This boils down to the difference between private routing and private addressing:

Private routing is hiding which node is the destination of a message.
Private addressing is hiding which validators are in each node.

The Limitations of Mixnets

From the definition of private routing vs addressing, we can derive what mixnets can and cannot do with more clarity:

The amount of real ingress/egress traffic of every node should be unobservable, without incurring in very high cost in terms of cover traffic. Mixnets have no way to regulate, optimally for both privacy and performance, the amount of cover traffic for protocols using it. A typical consensus protocol implementation has a message complexity per node at least proportional to the number of validators it hosts (if we have a flat p2p network). This means that the amount of cover traffic of the entire network is proportional to the node with the highest number of validators. Hence, with this approach home validators will be left behind by the ever-rising bar of professional node operators. This is antithetical to our principles of decentralization.
Private addressing is not solved by client/server mixnets, since the exit node always must know where to deliver the message. This easily leaks protocol behavior and stake information to the mixnet exit nodes, which could be malicious. Moreover, the problem of addressing is general in nature: if any single entity has access to the information of which specific node it needs to send a message to, this leaks enough information about nodes that can be used over time to build a view of the validator → node mapping. So even if we devise a better mixnet with no exit nodes (by for example eliminating the distinction between destination and last hop of the mixnet), it won't suffice. To see how this always holds, notice that all these situations leak enough information to an interested observer:
- If the sender is the one choosing the route, it must first know where each validator is in the network. This would require this information to be public. Even in sophisticated addressing schemes like Kademlia, the addressing is public.
- If the exit node of the mixnet does the message addressing, it must know the mapping. The use of external mixnets “solves” this by just externalizing this information so the two systems are supposed to not have interest in each other. This is not sufficient for our needs because we intend our mixnet to be economically integrated in our system (whether it's embedded in the Nomos nodes or not).
- Any other entity, even if it's just a relayer, will have access to this information if it has to actively route.
In short, private addressing requires the network to be able to deliver a message without nobody (except the receiver) knowing which node is the actual receiver.

How to Sidestep these Limitations

The two limitations above have to be solved in order to provide the foundations for Private PoS. While the problem of private addressing could perhaps be solved at the network level (even if naively and inefficiently, with broadcasting), the problem of high cover traffic is inherent to Mixnets, and doesn't seem to have a feasible solution by means only of network architectures. The reason is because under an omniscient observer adversarial model, we will always have to match the total traffic of the node with the highest, and a mixnet has no control over what each node produces. We need the protocols that we want to protect at the network level to collaborate with the network design.

The solution that we will explore here for these problems relies on the following idea: the network should operate as ORAM (Oblivious Random Access Memory). More specifically:

The amount of ingress/egress traffic should be equivalent for all nodes, or at least close enough that it cannot be analyzed to discover the nodes doing the work of multiple validators. This traffic should be mixnet-friendly, and make use of the mixnet properties to avoid increasing the amount of cover traffic to untenable levels.
Queries should be oblivious. No entity in the network should be able to identify any particular access pattern (whether for reads or writes) to the data. Such patterns could leak information on what accesses are real or cover traffic, rendering the previous technique useless.

Both sides must work together to solve the problem of private addressing. To make this possible we will be combining these tools:

Measure and establish the maximum observable activity across all nodes that given protocol produces, and ensure that this activity is equalized among all nodes. In the case of Carnot, this means analyzing the different roles in all of the protocol scenarios, and identifying the node that will produce the most traffic. This could initially be estimated analytically, but ultimately experimentation will be necessary. For reference:
1. Egress traffic: $\alpha = \bigcup_{i \in I} \alpha_i$, where $\alpha_i$ is the max egress traffic (a set of messages $\alpha_i = \{a, b, ...\}$) expected through a single role in a unit of time (to be defined based on mixnet delays). In other words, the observable traffic (real + cover) for every single node should be comprised of the same set of messages per unit of time, regardless of them being cover or real. This should account for intra-committee traffic, as well as inter-committee traffic.
2. Ingress traffic: $\beta = \bigcup_{i \in I} \beta_i$, where $\beta_i$ is the max ingress traffic instead.
Collapse multicast ingress/egress messages. We need to devise a mechanism that allows any node to obliviously receive (ie without leaking that this is happening to the sender or a relayer) a single message for all N validators, instead of N messages, in the specific case in which these messages were the same, and were effectively muticasted to them. This applies to us specifically in the case of committee communication. This does not apply to globally-broadcasted messages, since these is the only type of messages that does not require routing and consequently does not leak anything. To further clarify, what we are trying to do is to be able to somehow avoid sending more than one same message to a node. This scenario would happen when the same message is sent to more than one node, but not to all, like in a committee. Moreover, the sender cannot know the validator → node mapping, so it's unable to aggregate those. This is the main challenge to solve. Egress traffic is the opposite situation: when the node needs to multicast a message the protocol must ensure that the outgoing traffic produced is 1, not N. A way to see this is as a generalized pub/sub mechanism, in which a producer only needs to send a single message for an entire topic (which acts as a multicast group), and a subscriber only needs a single message.