PI: Vincent Rahli, Co-I: David Parker, University of Birmingham

  • Distributed critical information infrastructures need to behave correctly and in a timely fashion, while tolerating faults and attacks.
  • We use the Coq proof assistant to prove the correctness of such systems, in particular their timeliness.
  • We are developing models and proof methods to verify timing properties of such systems.
  • We are using these tools to verify correctness of real-time Byzantine fault-tolerant broadcast protocols.

Our society strongly depends on distributed critical information infrastructures such as electrical grids, autonomous vehicles, blockchain applications, IoT critical infrastructures, etc.. Such systems need to behave correctly, even in the face of faults and attacks. For time-critical systems, one correctness criteria is that operations are performed within a certain time bound. For example, in the context of a SCADA system, a desirable property is that actuators receive commands reliably and punctually.

One state-of-the-art technique to ensure that services in general are secure against faults and attacks is to use Byzantine Fault Tolerant State Machine Replication (BFT-SMR) that replicates a service and masks faults behind the behaviour of correct replicas. Some protocols have been proposed that can provide timing guarantees in synchronous environments (where message transmission delays are bounded by a known bound), while others work in asynchronous environments (where message transmissions are unbounded) but do not provide timing guarantees.

We have developed in [1] a collection of protocols that provide timing guarantees in probabilistic synchronous networks, where messages have a low probability of being delivered late. We have informally proved that these protocols satisfy timeliness properties, in addition to standard BFT properties.

Many formal theory tools have been developed to verify the correctness of BFT-SMR protocols. However, none  support the kind of protocols developed in [1]. Therefore, as part of this project we are developing within the Coq theorem prover, models and verification techniques to guarantee the correctness of real-time BFT-SMR protocols, and in particular, to prove timeliness properties in probabilistic synchronous environments.

Towards this goal we have developed a probabilistic model of distributed computations, where messages are assigned a probability of getting lost or delayed, and where a faction of the nodes can behave arbitrarily (i.e., are Byzantine).

We are now using this model to verify properties of distributed systems, some of which only hold with high probability. In  particular, we are now using it to verify properties of the real-time Byzantine fault tolerant protocols developed in [1], such as timeliness.

PUBLICATIONS. D. Kozhaya, J. Decouchant, V. Rahli, and P. Verissimo. “PISTIS: An Event-Triggered Real-Time Byzantine-Resilient Protocol Suite.“, IEEE Transactions on Parallel and Distributed Systems, 2021