Concurrent and distributed protocols can be formally viewed as a set of logical actions, each of which symbolically describe allowed state transitions of the system. We can analyze the structure of a protocol’s actions to understand the interaction between them, and to reason about a protocol’s underlying compositional structure.

One approach to decomposing a protocol into subcomponents is to break up its actions into disjoint subsets, and view each disjoint subset of actions as a separate logical component. This is a useful starting point for decomposition of protocols since actions represent the atomic units of concurrent behavior within a protocol specification. We can also use this basic type of decomposition to define various formal notions of interaction between individual actions or subcomponents of a protocol.

As a simple example, consider the following protocol specification:

\[\small \begin{align*} &\text{VARIABLES } a,b,c \\[0.4em] &{Init } \triangleq \\ & \quad \land \, a = 0 \\ & \quad \land \, b = 0 \\ & \quad \land \, c = 0 \\[0.4em] & IncrementA \triangleq \\ & \quad \land \, b = 0 \\ & \quad \land \, a' = a + b \\ &\quad \land {\text{UNCHANGED }} \langle b,c \rangle \\[0.4em] & IncrementB \triangleq \\ & \quad \land \, b' = b + c \\ & \quad \land \, \text{UNCHANGED } \langle a,c \rangle \\[0.4em] & IncrementC \triangleq \\ &\quad \land \, c < cycle \\ &\quad \land \, c' = (c + 1) \% cycle \\ &\quad \land \, \text{UNCHANGED } \langle a,b \rangle \\[0.4em] &Next \triangleq \\ &\quad \lor IncrementA \\ &\quad \lor IncrementB \\ &\quad \lor IncrementC \\ \\ &Inv \triangleq a \in \{0,1\} \quad \text{(* top-level invariant. *)} \\ & L1 \triangleq b \in \{0,1\} \\ & L2 \triangleq c \in \{0,1\} \end{align*}\]

In this case, we can consider decomposing the protocol into 2 logical sub-components,

\[\begin{align*} M_1 &= \{IncrementA\} & \qquad Vars(M_1)=\{a,b\} \\ M_2 &= \{IncrementB, IncrementC\} & \qquad Vars(M_2)=\{b,c\} \end{align*}\]

with the state variables associated with each component.

In this case, it is clear that the logical interaction between \(M_1\) and \(M_2\) can be defined in terms of their single shared variable, \(b\). Furthermore, this interaction is “uni-directional” in terms of the data flow between components i.e. only \(M_1\) reads from \(b\) and only \(M_2\) writes to \(b\). In this simple case of interaction it is also clear that, for example, verification of \(M_1\) behavior’s should only depend on the behavior of the interaction variable \(b\). The full behavior of \(M_2\) is irrelevant to the behavior of \(M_1\), enabling a natural type of compositional verification.

More generally, if we consider every action of a protocol as its own, fine-grained component, with associated read/write variables, we can check pairwise interactions between all actions of an original protocol to produce an interaction graph, as shown below. This then serves as a starting point for understanding the interaction between protocol actions and the potential boundaries for protocol decomposition.

Two Phase Commit Protocol Interaction Graph

As another example, we can consider this simplified consensus protocol for selecting a value among a set of nodes via a simple leader election protocol. There are 5 actions of this protocol, related to nodes sending out votes for a leader, a leader processing those votes, getting electing as a leader, and a leader deciding on a value. We can examine this protocol’s interaction graph as follows:

Simple Consensus Protocol Interaction Graph

Figure 1. Interaction graph for simple consensus protocol.

Here, we see its interaction graph admits a simple, acyclic structure, with uni-directional dataflow between nearly all actions.

We can see another example of an interaction graph, for the two-phase commit protocol, based on its formal specification here:

Two Phase Commit Protocol Interaction Graph

Figure 2. Two phase commit protocol interaction graph.

This interaction graph, annotated with the interaction variables along its edges, makes explicit the logical dataflow between actions of the protocol, and also suggests natural action groupings for decomposition. Specifically, into the resource manager (\(RM\)) sub-component and the transaction manager (\(TM\)) sub-component i.e.

\[\small \begin{align*} &RM = \{RMRcvAbortMsg, RMRcvCommitMsg, RMPrepare, RMChooseToAbort\} \\ &TM = \{TMRcvPrepare, TMAbort, TMCommit\} \end{align*}\]

Two Phase Commit Protocol Interaction Graph

Figure 3. Two phase commit protocol interaction graph from Figure 2 with partitioned components shown.

For example, we can note that the only outgoing dataflow from the \(RM\) set of actions is via the \(msgsPrepared\) variable, which is read via \(TMRcvPrepare\). The only incoming dataflow to the resource manager sub-component is via the \(msgsAbort\) and \(msgsCommit\) variables, which are written to by the transaction manager.

This matches our intuitive notions of the protocol whereby the resource manager and transaction manager behave as logically separate processes, and only interact via the relevant message channels (\(msgsAbort\), \(msgsCommit\), and \(msgsPrepared\)).

Compositional Verification

The decomposition concepts above provide a way to view a protocol in terms of how its fine-grained atomic sub-components interact. We can, in some cases, utilize this structure for a kind of compositional verification when a protocol’s interaction graph is amenable.

Simple Consensus Protocol

For example, we can consider the interaction graph of the simple consensus protocol from above. Its mostly acyclic interaction graph (Figure 1) makes it directly amenable to a simple form of efficient, compositional verification. If we want to verify the core safety property of this protocol, \(NoConflictingValues\), which states that no two nodes decide on distinct values, we can check this with the TLC model checker in a few seconds using a model with 3 nodes (\(Node=\{n1,n2,n3\}\)), generating a reachable state space with 110,464 states.

From the protocol’s interaction graph, however, it is easy to see that the actions \(\{SendRequestVote, SendVote\}\), operate independently from the rest of the protocol, interacting only via writes to the \(vote\_msg\) variable. So, one approach to verifying this protocol is to start by verifying the \(\{SendRequestVote, SendVote\}\) actions independently of the rest of the protocol, and then verify the rest of the protocol against this behavior. More specifically, the overall protocol only depends on the observable behavior of this \(\{SendRequestVote, SendVote\}\) sub-component with respect to the \(vote\_msg\) variable.

For example, if we model check the protocol with the pruned transition relation of

\[\begin{align*} &Next_A \triangleq \\ & \quad \vee SendRequestVote\\ & \quad \vee SendVote \\ \end{align*}\]

we generate 16,128 distinct reachable states, a ~7x reduction from the full state space. Now, since the only “interaction variable” between this \(Next_A\) sub-protocol and the rest of the protocol is the \(vote\_msg\) variable, we could project the state space of \(Next_A\) to the \(vote\_msg\) variable and verify the rest of the protocol against this projected state space.

With an explicit state model checker, we could directly compute this projection by generating and projecting the full state graph, and using this projected state graph as the “environment” under which to verify the rest of the protocol. Alternatively, we can come up with an abstraction of the \(Next_A\) protocol that reflects the external behavior of the interaction variable \(vote\_msg\) adequately.

For example, consider the following abstract model over the single \(vote\_msg\) variable that logically merges the \(SendRequestVote\) and \(SendVote\) actions into one atomic action:

\[\begin{align*} &SendRequestVote\_SendVote(src, dst) \triangleq \\ &\quad \wedge \, \nexists m \in vote\_msg : m[1] = src \\ &\quad \wedge \, vote\_msg' = vote\_msg \cup \{\langle src,dst \rangle\} % &\quad \wedge \, \text{UNCHANGED } \langle vote\_request\_msg, voted, votes, leader, decided \rangle\\ \end{align*}\]

This atomic action adds a new message into \(vote\_msg\) only if no existing node has already put such a message into \(vote\_msg\) (i.e. since nodes can’t vote twice in the original protocol).

We can formally check that this is a valid abstraction of the \(Next_A\) sub-protocol by showing a refinement between them e.g. showing that every behavior of \(Next_A\) is a valid behavior of this abstract spec:

\[(Init \wedge \square [Next_A]_{vars}) \Rightarrow (Init \wedge \square [SendRequestVote\_SendVote]_{vote\_msg})\]

Verifying this refinement is one way of ensuring that the abstract spec preserves the “externally observable” transitions of this sub-component (e.g. with respect to the \(vote\_msg\) variable).

Due to the acyclic nature of this protocol’s interaction graph, we could continue applying this compositional rule to further accelerate verification, but even with this initial reduction, we can see significant improvement. That is, now that we have an abstraction of the \(\{SendRequestVote, SendVote\}\) sub-protocol that preserves its interactions with the rest of the protocol, we can try verifying the rest of the protocol against this abstraction e.g.

\[\begin{align*} & Next_B \triangleq \\ &\quad \vee \wedge \exists i,j \in Node : SendRequestVote\_SendVote(i,j) \\ &\quad \quad \wedge \text{UNCHANGED } \langle vote\_request\_msg,voted,votes,leader,decided \rangle \\ &\quad \vee \exists i,j \in Node : RecvVote(i,j) \\ &\quad \vee \exists i \in Node, Q \in Quorum : BecomeLeader(i,Q) \\ &\quad \vee \exists i \in Node, v \in Value : Decide(i,v) \end{align*}\]

Model checking the above protocol (\(Next_B\)) with TLC, produces 514 distinct reachable states, a > 200x reduction from the original state space.

So, in this case, with only a simple dataflow/interaction analysis, we were able to reduce the largest model checking problem by a factor of ~10x e.g. in this case model checking of the \(Next_A\) sub-protocol was the most expensive verification sub-problem.

Conclusions

The ideas and techniques discussed above are similar to various types of compositional verification techniques that have been applied in various contexts. Similar ideas are utilized in the “interaction preserving abstraction” techniques in this paper, and also in the work on recomposition, which builds similar techniques within the TLC model checker. The concept of using dataflow to analyze distributed and concurrent protocols has also appeared in various works in the past (e.g. distributed data flow), and also more recent work on using a Datalog like variant to automatically optimize distributed protocols using pre-defined rewrite rules.

Note that the code used to model the protocols above and generate their associated interaction graphs can be found here.