Thesis No: 509116
The system-on-a-chip lock cache
Author: BİLGE EBRU SAĞLAM AKGÜL
Advisor: Prof. VINCENT J. MOONEY
Location: Georgia Institute of Technology / Foreign Institute
Subject: Computer Engineering and Computer Science and Control; Electrical and Electronics Engineering
Approved
Doctorate
English
2004
142 p.
The objective of this thesis is to implement efficient lock-based synchronization by means of a novel, high-performance, simple and scalable hardware technique that is easily applicable to a shared-memory multiprocessor System-on-a-Chip (SoC). Our solution is provided in the form of an intellectual property (IP) hardware unit which we call the SoC Lock Cache (SoCLC). The SoCLC provides effective lock hand-off by reducing on-chip memory traffic and improving performance in terms of lock latency, lock delay and bandwidth consumption. In our methodology, lock variables are accessed via SoCLC hardware. The SoCLC consists of one-bit registers to store lock variables and associated control logic to effectively implement the lock hand-off via interrupt generation, which eliminates busy-wait problems. In this way, the SoCLC eliminates the use of the main memory bus for unnecessary spinning and thus leaves the memory bandwidth available for other useful work. Moreover, unlike the related previous work in the literature, the SoCLC does not require any special atomic assembly instructions (e.g., compare-and-swap, test-and-set, load-linked/store-conditional instructions), extended cache protocol(s), extra cache lines/tags or any other architectural modifications/extensions to the processor core. Rather, the SoCLC methodology is a processor/memory/cache-hierarchy independent solution.
Our experimental results indicate that the SoCLC can achieve a 37% overall speedup over a traditional locking mechanism in a microbenchmark program under high contention on a four-processor system. Moreover, with increased memory latency, the speedup of the SoCLC for the same microbenchmark also increases, reaching up to 107% for a memory latency of 33 clock cycles. We also examine the effect of false sharing as well as the effect of increased critical section (CS) length on locking performance.
Another set of experiments has been conducted with a database application program, for which the SoCLC has been shown to achieve a speedup of 31% in the overall execution time.
To automate SoCLC design, we have also developed an SoCLC-generator tool, PARLAK, that is capable of generating parametrized, synthesizable and user-specified configurations of a custom SoCLC. Using PARLAK with 0.25 µm TSMC technology and a 10 ns clock period, we have generated customized SoCLCs from a version for two processors to a version for four processors occupying up to 37,940 gates of area for 256 lock variables. We have also generated customized SoCLCs for larger numbers of processors with a 50 ns clock period; e.g., an SoCLC version for 14 processors occupied 78,240 gates of area for 256 lock variables.
Furthermore, the SoCLC mechanism has been extended to support priority inheritance with an immediate priority ceiling protocol (IPCP) implemented in hardware, which enhances the hard real-time performance of the system. The experimental results indicate that the SoCLC can achieve up to 43% overall speedups on practical applications. Furthermore, it has been shown in a robot application that with the IPCP mechanism integrated into the SoCLC, all of the tasks could meet their deadlines (e.g., a high-priority task with a 250 µs worst-case response time could complete its execution in 93 µs with the SoCLC, whereas the same task missed its deadline by completing its execution in 283 µs without the SoCLC). Therefore, with IPCP support, our solution can provide better real-time guarantees for real-time systems.
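The lock hand-off idea described in the abstract (one-bit lock registers plus control logic that interrupts a waiting processor instead of letting it spin) can be illustrated with a small software model. This is only a sketch under stated assumptions, not the thesis's hardware design: the class name `LockCacheModel`, its method names, the FIFO order in which waiters are served, and the use of a Python callback in place of an interrupt line are all hypothetical choices made for illustration.

```python
from collections import deque


class LockCacheModel:
    """Toy software model of a hardware lock cache (all names are hypothetical)."""

    def __init__(self, num_locks, raise_interrupt):
        # One bit per lock variable: 0 = free, 1 = held.
        self.lock_bits = [0] * num_locks
        # Processors waiting on each lock; FIFO order is an assumption.
        self.waiters = [deque() for _ in range(num_locks)]
        # Callback standing in for a per-processor interrupt line.
        self.raise_interrupt = raise_interrupt

    def acquire(self, lock_id, cpu):
        """Return True if `cpu` now holds the lock; otherwise record it as a waiter."""
        if self.lock_bits[lock_id] == 0:
            self.lock_bits[lock_id] = 1
            return True
        if cpu not in self.waiters[lock_id]:
            self.waiters[lock_id].append(cpu)
        # The caller can sleep or do other work; it never polls memory.
        return False

    def release(self, lock_id):
        """Free the lock, or hand it directly to the next waiter and interrupt it."""
        if self.waiters[lock_id]:
            next_cpu = self.waiters[lock_id].popleft()
            # The lock bit stays set: ownership passes straight to next_cpu.
            self.raise_interrupt(next_cpu, lock_id)
        else:
            self.lock_bits[lock_id] = 0


# Usage: processor 0 takes lock 0, processor 1 fails and waits,
# then the release notifies processor 1 through the callback.
interrupts = []
soclc = LockCacheModel(4, lambda cpu, lock: interrupts.append((cpu, lock)))
soclc.acquire(0, cpu=0)
soclc.acquire(0, cpu=1)
soclc.release(0)
```

In this model a failed `acquire` costs a single access to the lock unit, after which the waiting processor generates no further bus traffic until it is interrupted, which is the property the abstract credits for the reduced memory bandwidth consumption.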