Building Verified Applications In Cakeml

PIs: Olaf Chitil and Scott Owens, PhD Student: Hrutvik Kanabar, University of Kent

CakeML is a functional programming language and ecosystem of associated proofs and tools, including an end-to-end, formally verified compiler targeting several processor architectures.
CakeML lacks support for verifiably safe composition of user code with libraries that use “unsafe” features such as array accesses without bounds checks.
To integrate “unsafe” features we are building a semantic type system by applying the technique of step-indexed Kripke logical relations to a CakeML-style lambda calculus in the simply-typed logic of HOL4.
This is a three-year project supporting a PhD student, the results of which will be applied to CakeML itself.

CakeML is an ML-like programming language, intended to play a central role in trustworthy software systems. The CakeML project is an ongoing collaboration between researchers at the University of Kent (UK), Chalmers University of Technology (Sweden), and Data61 (Australia). The key contribution of the project is the first end-to-end verified, optimising compiler for a practical, functional programming language. However, CakeML lacks support for verifiably safe composition of user code with libraries that use “unsafe” features (such as array accesses without bounds checks).

We aim to solve this problem by producing a semantic type soundness result for CakeML, taking inspiration from the RustBelt project (Dreyer et al.). The latter aims to develop rigorous formal foundations for the Rust programming language, leveraging the Iris theorem-prover to verify that “unsafe” code (which opts out of Rust’s type system) from standard libraries is safely encapsulated, and does not violate the safety guarantees intended by the type system. The RustBelt team have already discovered and fixed subtle, significant bugs. Our previous VeTSS project found that the mathematical insights behind the Iris logic are not straightforwardly ported to the logic of HOL4 for use in CakeML.

The PhD student (H. Kanabar) built a System F-like calculus in HOL4 in the style of CakeML, and established a semantic type soundness result via a logical relation, identifying several application areas for technology once applied to CakeML. The introduction of general references to the logical relation is the key problem in HOL4; the student is continuing to explore techniques and is engaging with experienced researchers in the field.

The PhD student is also pursuing other avenues to strengthen the guarantees offered by the CakeML project. During an internship at Arm, he investigated the Sail ecosystem (Sewell et al.) and its applications in formal reasoning about the semantics of the Arm instruction set architecture. He has made progress in verifying the existing Arm ISA model used in CakeML proofs against the official machine-checked model released by Arm, automatically extracted to HOL4 via Sail. He is also contributing to PureCake, a pure language in the style of Haskell (Peyton Jones et al.) , which will compile to CakeML to permit end-to-end verification.

RELATED GRANTS. Dr Scott Owens, EPSRC Grant: “Trustworthy Refactoring”, 09/2016-03/2020, £728,766.

PUBLICATIONS. [1] H. Férée, J. Å. Pohjola, R. Kumar, S. Owens, M. O. Myreen, and S. Ho “Program Verification in the Presence of I/O Semantics, verified library routines, and verified applications”, VSTTE 2018. [2] O. Abrahamsson, S. Ho, H. Kanabar, R. Kumar, M. O. Myreen, M. Norrish, and Y. K. Tan, “Proof-Producing Synthesis of CakeML from Monadic HOL Functions”, JAR 2020.

IMPACT. CakeML is an important strategic research project for Data61. The ‘strategic’ tag means that we support it as a future-looking endeavour that does not necessarily have a short-term payoff, nor necessarily receives external funding. Instead, we pursue such projects because of our belief that they have high potential for future impact. That said, some of our work on CakeML does receive external funding in the form of money from a multi-million dollar DARPA-funded research projects. This project sees CakeML being integrated into high-assurance systems development, in collaboration with staff at Collins Aerospace. CakeML’s ongoing development makes it a stronger and stronger component (better performing and more featureful) in existing and future systems of this sort. As we work with research collaborators around the world and continue to produce CakeML-related publications in the academic press, we also continue to demonstrate our commitment to the CakeML project as a vehicle for pure research.” – Michael Norrish, Principal Researcher, Data61, Australia –