Computing systems may be composed of reliable and efficient components, but reliability of the overall system comes at the cost of large inefficiencies due to over-engineering. Information and Communication Technologies (ICT) are responsible for more than 10% of electricity consumption worldwide, a figure that is expected to grow to more than 14% by 2020. Though the GeSI SMARTer 2020 report claims that ICT-enabled solutions have the potential to reduce greenhouse gas emissions by 16.5% in 2020, there is clearly a need for computing systems to reduce their own contribution to these emissions.
Every computing system today employs feedback in some form to guarantee a certain level of performance and reliability in the presence of uncertainty, such as unpredictable workloads, delays, data losses, cyber attacks and component failures. Data, tasks and resources (such as processors, memory, storage and communication networks) need to be managed to balance loads, achieve a certain quality of service, guarantee that computations are correct and ensure that tasks are completed before their deadlines. Feedback is also used to minimize power consumption through dynamic voltage and frequency scaling and smart scheduling of jobs. What is lacking, however, is a complete theory that allows computer engineers to understand how components interact with each other and what effect this has on overall system behavior.
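As a rough illustration of feedback applied to frequency scaling, the following minimal Python sketch adjusts a processor frequency so that measured utilization tracks a target. It is a toy model under stated assumptions, not a method taken from our work: the frequency limits, the gain, the utilization model (demand divided by frequency) and the names `simulate` and `clamp` are all hypothetical.

```python
# Toy sketch of feedback-based frequency scaling (all values are assumptions).
F_MIN, F_MAX = 0.5, 3.0  # assumed frequency limits, in GHz


def clamp(x, lo, hi):
    """Restrict x to the interval [lo, hi]."""
    return max(lo, min(hi, x))


def simulate(workload, target_util=0.8, gain=1.0, f0=F_MAX):
    """Run a simple utilization-tracking feedback loop.

    workload: demand per step, in giga-cycles per second (hypothetical model).
    Returns a list of (frequency, utilization) pairs, one per step.
    """
    f = f0
    history = []
    for demand in workload:
        # Measured utilization at the current frequency, saturating at 100%.
        util = min(1.0, demand / f)
        history.append((f, util))
        # Feedback step: raise the frequency when utilization is above the
        # target, lower it when below, and respect the hardware limits.
        f = clamp(f + gain * (util - target_util) * f, F_MIN, F_MAX)
    return history


if __name__ == "__main__":
    for f, util in simulate([1.2] * 8):
        print(f"f = {f:.3f} GHz, utilization = {util:.3f}")
```

With a constant demand, the loop settles at the lowest frequency that keeps utilization at the target, which is the sense in which feedback trades performance margin for power. A real system would also have to account for discrete frequency levels, measurement noise and delays.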
The aim of our research is therefore to develop new control theory and mathematical optimization methods for designing computing systems that are at least one order of magnitude more energy efficient, cheaper, faster, smaller and more reliable than today's. Our work is currently focused on achieving this by developing: (i) efficient optimization-based algorithms for hard and soft real-time scheduling problems; (ii) scalable distributed scheduling algorithms based on recent methods in cooperative control theory; and (iii) methods for modeling computing systems that capture the dynamics essential for feedback design using closed-loop, rather than open-loop, metrics.
For an overview of some of the research questions that we are currently working on, see the paper http://arxiv.org/abs/1510.01135