Symbolic Computation for Mainstream Verification

PI: Budi Arief, University of Kent. Co-Is: Sadie Macintrye, Tom Seed and Andy King, University of Kent

Close integration of symbolic computation with satisfiability modulo theories (SMT) has a lot of potential.
A new class of SMT solvers is beginning to emerge that apply powerful algebraic reasoning in the theory component of SMT solvers.
Moreover, the gap between what is provided by algebraic reasoning and what is needed for mainstream verification was recently bridged by a novel SMT solver that applies algebraic (Gröbner basis) reasoning over systems of polynomials modulo 2^w.

The overarching objective of this work was to migrate symbolic computation into mainstream verification by adapting algebraic reasoning from fields to bit-vectors. Thus far, work at this intersection has focussed on theories with importance in symbolic computation, over the fields of real or complex numbers. But arguably the ring of integers is more important for software verification.

To lead in this neglected area, this project developed Gröbner basis methods that dovetailed with the requirements of verification, namely SMT solving and abstract interpretation.

We developed a solver for putting systems of polynomials into a simplified form called a Groebner base. The solver uses novel techniques for factoring out redundant (S-polynomial) calculations. It applies so-called interreduction to compress the size of the Groebner base on-the-fly as polynomials are generated. Additionally, the solver exploits parallelism – both to simplify polynomials (reduce a polynomial against others in the Groebner base), and also to reduce a new polynomial against other new polynomials that are computed in parallel. The resulting solver is not only fast and scalable, but also it is unique in that it focusses on systems of polynomials modulo a power of 2.

We deployed the solver in the context of an abstract interpreter demonstrating how it can derive and verify invariants of algorithms that were previously out of reach of existing abstract interpretation frameworks. We have invented and implemented a novel join operator which combines two systems of polynomials that describe program state on the different branches of an if-then-else statement. Although this join operator was adequate for most benchmarks, we observed that it lost information unless the two systems were themselves put into a normal form.

We thus developed a divide-and-conquer algorithm for computing this normal form, which we have called a closed form, that partitioned the systems of polynomials into independent sub-systems of polynomials. These sub-systems are all syntactically simpler (are actually n-ary vectors of points described by n-ary vectors of polynomial which are parametric typically in just one of two variables rather than n). Join is then calculated between all these sub-systems.

We have found new ways of reducing the number of subsystems, by altering the divide-and-conquer algorithm to partition the space in a way that reflects where the systems of polynomials themselves have solutions. This gives an order of magnitude speedup on the closure calculation, hence the abstract interpreter as a whole.

This research is particularly timely since we are now seeing a new class of SMT solvers emerge, that apply powerful algebraic reasoning in the theory component of SMT solvers. The focus of this work has been to adapt classical algebraic algorithms to the new setting of SMT, typically by adding incrementality or learning. This will ensure software correctness, which in turn will improve security.

IMPACT. We now have a huge body of novel work, most of which is written up, complete with system studies and experimental results. The code base is itself large (40K LOC of Scala code) which applies a number of novel implementation and parallelisation techniques. Once the main framework has been published, we will follow that with explanations of the system itself, and the new technical mathematical techniques that we have devised and implemented on which the abstract interpreter rests. We are also in discussions with a government agency about how to deploy the abstract interpreter on microcontroller code.