PI: Greta Yorsh, Queen Mary, University of London

  • Optimising compilers for Single-Instruction-Multiple-Data (SIMD) architectures rely on sophisticated program analyses and transformations.
  • Correctness hard to prove due to interaction between optimisation passes and SIMD semantics/costs.
  • Supervectorizer: integration of unbounded superoptimization with auto-vectorisation enables software to take full advantage of SIMD capabilities of existing and new microprocessor designs.
  • Potential for fundamental advances in SMT solvers and industrial-strength SIMD optimizing compilers.

Optimising compilers for Single Instruction Multiple Data (SIMD) architectures rely on sophisticated program analyses and transformations. In particular, auto-vectorisation is designed to automatically identify and exploit data-level parallelism. To deliver expected performance improvements, compiler writers resort to changing optimisation passes, heuristics, and cost models. This process is highly challenging even for the few experts who possess the required range of skills, and any errors introduced affect the entire software stack, likely compromising its reliability and security.

Ensuring correctness of these compiler optimisations is hard due to implicit interactions between optimisation passes and abstruse details of SIMD instructions semantics and costs. It results in missed optimisation opportunities and subtle bugs, such as miscompiled code, which might remain undiscovered for a long time and manifest themselves in obscure ways across abstraction layers of a software stack.

This project aimed at enabling software to take full advantage of SIMD capabilities of microprocessor designs, without modifying the compiler. In particular, we integrate unbounded superoptimization with auto-vectorisation. This approach reduces the engineering effort needed to tune a production compiler for new SIMD architectures and improves compiler reliability without compromising the performance of generated code. We believe that this approach will lead to fundamental advances in SMT solvers and industrial-strength optimising compilers targeting SIMD architectures.

Structure of the preliminary prototype

The project also funded the work of postdoctoral research assistant Julian Nagele, who developed a robust prototype implementation and experiments with SIMD instructions, engaging with the LLVM community and obtaining valuable early-stage feedback from developers at EuroLLVM 2018.

PUBLICATIONS. [1] Initial results were presented, by invitation, at Intel’s Compiler, Architecture and Tools Conference (CATC).

RELATED GRANTS. The work on this project has led directly to the award to Dr Yorsh of ERC Starting Grant, £1.25M, 2018-2022. Initial results obtained under VeTSS funding demonstrated feasibility of the proposed ERC plan and the work under ERC will build on the infrastructure and experimental results obtained under VeTSS funding. The project also received further VeTSS funding, Supervectorizer II.

IMPACT. The quantitative trading firm Jane Street expressed interest in incorporating techniques developed under this grant into the compiler for OCaml. Amazon has also invited Dr Yorsh to join as Amazon Scholar to work with Amazon Video on tools for improving correctness and performance of their code.