PI: Tom Ridge, University of Leicester

  • Core operating system functionality has been verified in projects such as L4.verified, which targeted the seL4 microkernel.
  • However, two other major components should also be verified: the network stack and the file system.
  • A few verified file systems already exist, but their performance is slow compared to traditional file systems such as ext4 and ZFS.Moreover, they also lack important modern features, such as file system snapshots.
  • We aim to develop a verified file system “ImpFS”, constructed from small, well-defined components. ImpFS should match or exceed existing file systems, both in terms of performance and features. It will be available both as a desktop file system and as a library of components suitable for use in other software, and even in library-based unikernels such as MirageOS.

In previous work, we developed SibylFS, a formal model of POSIX and real-world file systems. The formal specification was usable directly as a test oracle, to check conformance of existing file systems. We now aim to actually implement a verified file system.

The Design of ImpFS

Building a file system is hard. Building a high-performance file system with advanced features is extremely difficult: the BTRFS file system has been in development by a team of engineers for around 10 years and is still not considered stable enough for production use. Thus, we should expect that building a verified high-performance file system will be extremely challenging.

The most important components that we have developed so far are:
– A high-performance novel B-tree-like data-structure with both Copy-on-Write and Mutate-in-place semantics. This is formalised in Isabelle/HOL and extracted to OCaml for execution.
– A persistent cache, offering transactional-log-like functionality.
– A persistent key-value store.

These components are freely available online. All the components are implemented in a purely-functional style, which is a pre-requisite for easy verification. The performance of the components is extremely good. For example, the key-value store matches the performance of the well-regarded LMDB key-value store.

At OCaml’2020 we demonstrated a working file system. We are now in the process of tuning the implementation. We are confident that the file system will out-perform existing file systems in many areas. We are now working with industry collaborators and the MirageOS project to push this work into production. More importantly, the components will serve as the basis for many further interesting file system designs.

The VeTSS funding has been critical for securing research time for the author. Most of the work has involved high-level design and low-level component implementation. In addition, a lot of effort has been put into performance engineering of the components. This work is lengthy and would not have been possible without the financial support provided by VeTSS.

PUBLICATIONS. [1] T. Ridge, “A B-tree library for OCaml”, ICFP 2017, OCaml’17 workshop; [2] Towards verified file systems. A. Giugliano, 2018. PhD thesis; [3] T. Ridge, “A Key-Value store for OCaml” ICFP 2019, OCaml’19 workshop. [4] T. Ridge., “The ImpFS filesystem“, IFCP2020, OCaml’20 workshop.

IMPACT. “I have been working, with a PhD student, on a verified copy-on-write filesystem. In the course of our work, we have relied heavily on Dr. Ridge’s ImpFS work. The availability of an existing verified copy-on-write B-Tree implementation has been of great help. We continue to use it heavily as a reference point for not only our implementation, but also for an example of formally specifying this crucial class of data structures.” – Colin S. Gordon, Drexel University, Pennsylvania –