Name: Seminar: Have Performance and Eat Abstraction Too
Start: 2015-05-18T14:00:00+00:00
End: 2015-05-18T15:30:00+00:00
Location: Huxley 308

Have Performance and Eat Abstraction Too: Optimizing Nested Parallel Patterns for Heterogeneous Architectures

Presented by Kunle Olukotun, Pervasive Parallelism Lab, Stanford University.

Abstract
High performance in modern computing platforms requires programs to be parallel, distributed, and able to run on heterogeneous hardware. However, programming for this environment is extremely challenging because it requires using multiple programming models and then combining them in ad hoc ways. To optimize applications both for modern hardware and for modern programmers, we need a programming model that is sufficiently expressive to easily support a variety of applications and sufficiently portable to support a variety of heterogeneous parallel hardware. Single-level parallel patterns have become an increasingly popular high-level programming model that has been shown to be capable of targeting hardware as varied as large data centers, GPUs, and FPGAs. Nested parallel patterns are even more expressive, but present unique challenges when targeting distributed and heterogeneous architectures. In this talk I will describe how to automatically map nested parallel patterns to NUMA machines, clusters, GPUs and, if time permits, FPGAs using the Delite framework. I will describe straightforward analyses that determine what data to distribute based on its usage, as well as powerful transformations of nested patterns that restructure computation to enable distribution and to optimize for heterogeneous devices. I will describe how Delite’s nested parallel pattern GPU mapping policy improves upon previously published GPU mapping strategies. Finally, I will show the performance of our automatically optimized high-level code on a variety of data analytics benchmarks. The performance results are competitive with manually optimized C++/CUDA (DimmWitted and Caffe) and much better than manually optimized high-level frameworks (Spark and PowerGraph).

Biography
Kunle Olukotun is the Cadence Design Systems Professor in the School of Engineering and Professor of Electrical Engineering and Computer Science at Stanford University. Olukotun is well known as a pioneer in multicore processor design and the leader of the Stanford Hydra chip multiprocessor (CMP) research project. Olukotun founded Afara Websystems to develop high-throughput, low-power multicore processors for server systems. The Afara multicore processor, called Niagara, was acquired by Sun Microsystems. Niagara-derived processors now power all Oracle SPARC-based servers. Olukotun currently directs the Stanford Pervasive Parallelism Lab (PPL), which seeks to proliferate the use of heterogeneous parallelism in all application areas using domain-specific languages (DSLs). Olukotun is an ACM Fellow and IEEE Fellow.
