PI: Daniel Kroening, University of Oxford, Amazon; Co-I: John Galea, University of Oxford

  • Generic Taint Analysis is a flexible technique that enables the enforcement of different taint policies via the same underlying taint tracking system.
  • However, generic taint analysis incurs severe performance overheads.
  • We introduce the Taint Rabbit, an optimized generic taint engine capable of analysing x86 binary applications.
  • Its performance gains come from two optimizations that we investigated: fast path generation and vectorization.
  • Results show that, with our optimizations, the Taint Rabbit performs faster than the existing generic trackers we assessed.
  • Overall, the work acts as a foundation for further research on security vulnerabilities.

Dynamic taint analysis is a pivotal technique in software security that tracks interesting or suspicious data as it flows through a program during execution. With the essential funding provided by VETSS, we have researched approaches for optimizing an expensive but generic variant of the analysis. Crucially, this variant supports various user-defined policies via the same underlying taint tracking system. The research contributes to our long-term goal of automatically detecting and analysing software vulnerabilities and reasoning about their exploitation.

The main performance bottleneck of taint analysis stems from executing the taint propagation code that is instrumented into the target application at instruction granularity. Unlike specialized bitwise tainting, a generic taint tracker cannot be optimized for a specific taint policy. Instead, it must perform elaborate propagation in order to be versatile. We adopt two strategies to address the performance issue. First, we aggressively elide the execution of propagation routines whenever possible by generating fast paths: variants of basic blocks that are instrumented according to the frequent taint contexts identified at runtime. Second, we directly optimize the code that conducts taint propagation, leveraging vectorization so that all taint information pertaining to the source operands of a given instruction is processed simultaneously.
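The first strategy can be illustrated with a minimal Python sketch of fast path generation on a toy register machine. This is an illustration under stated assumptions, not the Taint Rabbit's implementation: the names `FastPathEngine`, `specialize`, `taint_context` and the `threshold` parameter are hypothetical, a basic block is modelled as a list of `(dst, srcs)` register tuples, and taint labels are Python sets merged by union. The engine counts how often each taint context (which of the block's registers are tainted on entry) is seen and, once a context is frequent, builds a specialized variant of the block that skips propagation for instructions that cannot affect taint in that context. Vectorization is not shown.

```python
from collections import Counter


def block_registers(block):
    """All registers the block reads or writes, in a stable order."""
    regs = []
    for dst, srcs in block:
        for r in (*srcs, dst):
            if r not in regs:
                regs.append(r)
    return regs


def taint_context(block, taint):
    """Tuple of booleans: which of the block's registers are tainted on entry."""
    return tuple(bool(taint.get(r)) for r in block_registers(block))


def propagate(block, taint):
    """Run the generic propagation routine for each instruction in `block`.

    Returns the number of propagation routines executed.
    """
    executed = 0
    for dst, srcs in block:
        labels = set()
        for s in srcs:
            labels |= taint.get(s, set())
        taint[dst] = labels
        executed += 1
    return executed


def specialize(block, context):
    """Keep only the instructions that can change taint state in `context`."""
    live = {r for r, tainted in zip(block_registers(block), context) if tainted}
    kept = []
    for dst, srcs in block:
        if live.intersection(srcs):
            kept.append((dst, srcs))  # taint flows into dst
            live.add(dst)
        elif dst in live:
            kept.append((dst, srcs))  # dst is overwritten with untainted data
            live.discard(dst)
        # otherwise nothing tainted is read or written: elide propagation
    return kept


class FastPathEngine:
    """Hypothetical per-block dispatcher (assumption, not the real design)."""

    def __init__(self, block, threshold=2):
        self.block = block
        self.threshold = threshold
        self.seen = Counter()   # how often each taint context was observed
        self.fast_paths = {}    # taint context -> specialized block

    def run(self, taint):
        ctx = taint_context(self.block, taint)
        path = self.fast_paths.get(ctx)
        if path is None:
            self.seen[ctx] += 1
            if self.seen[ctx] < self.threshold:
                return propagate(self.block, taint)  # slow path
            path = self.fast_paths[ctx] = specialize(self.block, ctx)
        return propagate(path, taint)
```

In this sketch, once a block has been entered often enough with no tainted registers, the engine runs no propagation routines for it at all, while a context with one tainted input keeps only the instructions on the tainted data-flow path.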

Our research has led to the development of the Taint Rabbit, a novel generic taint tracker that uses our proposed techniques. We evaluated our approach on a number of real-world applications, including Apache, PHP, and bzip2, as well as on CPU-intensive benchmarks such as SPEC CPU 2017. Results indicate that the Taint Rabbit is the fastest generic taint engine amongst those we assessed. Furthermore, to demonstrate the flexibility of the Taint Rabbit, we built several taint-based applications on top of it, even though they depend on different taint propagation policies. In particular, we considered use-after-free debugging, control-flow hijack detection, and vulnerability discovery through fuzzing.
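How one tracking engine can serve applications with different policies can be sketched as follows. This is a toy Python model under stated assumptions, not the Taint Rabbit's API: `TaintEngine`, `HijackPolicy`, `UseAfterFreePolicy` and their methods are hypothetical names. The engine only moves labels between locations, and each policy decides what a label means, how labels merge, and which events are violations.

```python
class TaintEngine:
    """Toy generic tracker: the engine moves labels, the policy interprets them."""

    def __init__(self, policy):
        self.policy = policy
        self.labels = {}  # location -> policy-defined label; absent = untainted

    def source(self, loc, label):
        """Introduce taint at a location."""
        self.labels[loc] = label

    def propagate(self, dst, *srcs):
        """Set dst's label from the labels of its tainted source operands."""
        src_labels = [self.labels[s] for s in srcs if s in self.labels]
        if src_labels:
            self.labels[dst] = self.policy.merge(src_labels)
        else:
            self.labels.pop(dst, None)  # dst overwritten with untainted data

    def check(self, event, loc):
        """Ask the policy whether `event` on `loc` is a violation."""
        return self.policy.check(event, self.labels.get(loc))


class HijackPolicy:
    """Boolean taint: flag indirect jumps whose target derives from input."""

    def merge(self, labels):
        return True

    def check(self, event, label):
        return event == "indirect_jump" and label is True


class UseAfterFreePolicy:
    """Labels are allocation ids: flag dereferences into freed allocations."""

    def __init__(self):
        self.freed = set()

    def merge(self, labels):
        # Assumption: pointer arithmetic keeps the id of the first tainted operand.
        return labels[0]

    def free(self, alloc_id):
        self.freed.add(alloc_id)

    def check(self, event, label):
        return event == "deref" and label in self.freed
```

For example, with `UseAfterFreePolicy`, a pointer derived from allocation 7 is reported by `check("deref", ...)` only after `free(7)` has been recorded, while the same `TaintEngine` code with `HijackPolicy` reports indirect jumps through tainted registers.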

Figure: High-level design of the Taint Rabbit.

Overall, VETSS has given us the opportunity to carry out important research that enables generic taint tracking to scale better for binaries than the current state of the art. The Taint Rabbit serves as a vital stepping stone towards automatically analysing and understanding security-critical software vulnerabilities.

PUBLICATIONS.  [1] John Galea and Daniel Kroening. 2020. The Taint Rabbit: Optimizing Generic Taint Analysis with Dynamic Fast Path Generation. ASIACCS ’20.

IMPACT. The Taint Rabbit and all tools built upon it will be made open source upon publication. We have also made several contributions to DynamoRIO, the open-source DBI system that the Taint Rabbit uses. In this regard: “John has had significant impact on the open-source DynamoRIO project: he has contributed numerous fixes and features to the code base; he has joined the small set of core developers who voluntarily help maintain the continuous integration testing and other infrastructure; he has influenced design decisions for new features by other developers; and he has helped to build the community around this project.” – Derek Bruening, Software Engineer, Google