Research


 

 

The focus of my work has been to enable robust computing in the face of emerging challenges in Chip Multiprocessors (CMPs). Robustness is defined as the capability to perform without failure under a wide range of conditions. Computers today are pervasive and touch every facet of our lives. The primary requirement that users place on such systems is that they work without failure or interruption. A secondary requirement is that they work as fast as possible. A fundamental problem is that performance and robustness are often conflicting goals. At one extreme, the most robust computer system is one that is never turned on; at the other, the highest-performing system tends to be the one most susceptible to failures.

 

Robustness can be interpreted differently in different contexts. In networks, robustness refers to fault tolerance: a network should either show no effect of a fault or degrade gracefully in its presence. The term autonomic computing describes the self-managing, self-organizing, and self-healing nature of certain systems, such as high-end servers. Autonomic systems are expected to remain resilient (robust) to failures in one or more of their components while meeting guaranteed levels of performance. In secure systems, robustness is the ability to withstand attacks on the confidentiality and integrity of information. In software systems, robustness is the ability to produce meaningful output for any input, whether or not that input was expected.

 

The scope of my work has been to improve certain aspects of the robustness of modern computer systems. Modern architectures are heading relentlessly toward increasing levels of parallelism. There is a clear trend toward more processors on a chip and more application threads running in parallel. Such architectures are classified as Chip Multiprocessors (CMPs). In essence, a CMP-based architecture has multiple processors on a single chip that may communicate with each other through a communication bus or an on-chip network. As CMP-based architectures grow increasingly popular, solutions targeted at them will have wide-ranging impact if they are adopted by designers and architects. With this in mind, the focus of my work has been to study robustness in CMP-based architectures from different perspectives.

 

The three threats to robustness in CMPs that I have studied are thermal emergencies, soft errors, and the security of data and code. Each threat can adversely affect the proper execution of an application and thus requires a solution. Thermal emergencies can cause hardware faults, security breaches can cause the loss of intellectual capital or valuable data, and soft errors can compromise the correctness of computed results. Many solutions have been proposed to address these threats at the hardware level. While these solutions are useful, solutions at the software level can additionally exploit knowledge of application behavior and can work in tandem with existing hardware approaches. Therefore, my work has concentrated on software-based techniques for robust computing on Chip Multiprocessors.

 

  

Last updated: Apr 2008