I am a Lecturer in Computing at Imperial College. My research focuses on the management and processing of big scientific data, thereby bridging the gap between data management and the sciences. Driven by the needs of scientists dealing with unprecedented amounts of data (e.g., neuroscientists of the Blue Brain Project), I develop novel algorithms to analyze and manage big scientific data so it can be turned into knowledge.
I hold a Ph.D. and an M.Sc. in Computer Science, both from the Swiss Federal Institute of Technology in Zürich (ETH Zürich), and was a postdoctoral fellow in the DIAS Lab at EPFL. In 2004 I was awarded a Fulbright scholarship to visit Purdue University.
Tackling Big Data Challenges in the Sciences
My research focuses on bridging the gap between data management and scientific disciplines. Scientists in disciplines such as biology, chemistry, and physics produce vast amounts of data through experimentation and simulation. The amounts of data produced are already so big that they can barely be managed, and the problem is certain to become worse as the volume of scientific data doubles every year.
My current research therefore focuses on developing novel data management algorithms for querying and analyzing big scientific data. The unprecedented size and growth of scientific datasets make analyzing them a challenging data management problem. Current algorithms are not efficient for today’s data and will not scale to analyze the rapidly growing datasets of the future. I therefore want to develop next-generation data management tools and techniques able to manage tomorrow’s scientific data, thereby putting efficient and scalable big data analytics at the fingertips of scientists so they can again focus on their science, unperturbed by the challenges of big data.
Key to my research is that the algorithms and methods developed are inspired by real use cases from other sciences and that they are also implemented and put to use for scientists. The algorithms developed so far are inspired by a collaboration with the neuroscientists from the Blue Brain Project (BBP), who attempt to simulate the human brain on a supercomputer.
During my Ph.D., my research focused on developing tools to process big scientific data. Scientific data can be processed and queried at very large scale only through a massive scale-out, i.e., by distributing computations across a cluster. I consequently conceived algorithms and developed tools to process data distributed in a cluster or the cloud. The results are available in my dissertation (also available in book form).
My research areas broadly touch the following subjects:
- Big Data, Distributed Indexing & Processing
- Scientific Data Management
- Spatial Data, Spatial Indexing
- Spatio-Temporal Indexing
- In-Memory Indexing
We are hiring!
We are currently looking for junior researchers interested in completing a Ph.D. in the broadly defined area of scientific data management. If interested, please check out the jobs page.