Wehmeyer L., Marwedel P. Fast, Efficient and Predictable Memory Accesses. Optimization Algorithms for Memory Architecture Aware Compilation

Format: pdf
Size: 1.63 MB
Added: 7 November 2011

Publisher: Springer, 2006, 262 pp.

The influence of embedded systems is constantly growing. Increasingly powerful and versatile devices are being developed and put on the market at a fast pace. The number of features is increasing, and so are the constraints on the systems concerning size, performance, energy dissipation and timing predictability. Since most systems today use a processor to execute an application program rather than using dedicated hardware, the requirements cannot be fulfilled by hardware architects alone: hardware and software have to work together in order to meet the tight constraints put on modern devices. This work presents approaches that target the software generation process using an energy and memory architecture aware C compiler. The consideration of energy dissipation and of the memory architecture leads to a large optimization potential concerning performance and energy dissipation.
This work first presents an overview of the timing, energy and simulation models used for one processor architecture and for different memory architectures such as caches, scratchpad memories and main memories in SRAM, DRAM and Flash technology. Following an introduction to the compilation framework used, the compiler-based exploitation of partitioned scratchpad memories is presented. A simple formalized Base model is presented that models the consequences of statically allocating instructions and data to several small scratchpad partitions, followed by a number of extensions that treat memory objects and their dependencies at a finer granularity. A method for allocating objects to separate scratchpad memories for instructions and data, as found in the most recent ARM designs, is also presented. Finally, a model that also considers the leakage power of memories is introduced. Results show that significant savings of up to 80% of the total energy can be achieved by using the presented scratchpad allocation algorithms. The flexibility and extensibility of the presented approaches are another benefit.
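The static allocation idea can be illustrated with a minimal sketch. It assumes a single scratchpad partition and treats allocation as a plain 0/1 knapsack problem; it is not the Base model or any of its extensions from the book. The function name, the object names and all sizes and energy figures are hypothetical.

```python
from typing import List, Tuple

def allocate_scratchpad(
    objects: List[Tuple[str, int, float]], capacity: int
) -> Tuple[float, List[str]]:
    """Choose memory objects for one scratchpad of `capacity` bytes.

    Each object is (name, size_in_bytes, energy_saving), where the saving
    is the assumed energy gained by serving the object from the scratchpad
    instead of main memory. Returns (total_saving, chosen_names).
    """
    # best[c] holds the best (saving, chosen names) using at most c bytes.
    best: List[Tuple[float, List[str]]] = [(0.0, [])] * (capacity + 1)
    for name, size, saving in objects:
        # Walk capacities downwards so each object is placed at most once.
        for c in range(capacity, size - 1, -1):
            candidate = best[c - size][0] + saving
            if candidate > best[c][0]:
                best[c] = (candidate, best[c - size][1] + [name])
    return best[capacity]

# Hypothetical profile: functions and arrays with sizes and savings.
profile = [
    ("main_loop", 64, 50.0),
    ("fir_coeffs", 32, 30.0),
    ("init", 48, 5.0),
    ("lookup_table", 40, 28.0),
]
saving, chosen = allocate_scratchpad(profile, capacity=100)
print(saving, chosen)
```

Because the decision is made once, before the program runs, the location of every object is known at compile time, which is the property the abstract relies on when it discusses timing predictability.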
Many embedded systems have to respect timing constraints. Therefore, timing predictability is of increasing importance. Whenever guarantees concerning reaction times have to be given, worst case execution time (WCET) analysis techniques are used during the design of the system in order to provide a guaranteed upper bound on the WCET. The contribution of this work deals with the influence of scratchpad memories on timing predictability. It is shown that scratchpad memories, allocated using the algorithms mentioned above, are inherently predictable, since the positions of all objects in the different memories are fixed at compile time and no dynamic decisions have to be taken at runtime. The results show that the determined WCET values for systems with a scratchpad memory scale with the performance benefit observed during average case simulation, indicating that scratchpad memories lead to improvements concerning both the average case and the worst case. In particular when compared to caches, the WCET analysis for scratchpad based systems is simpler, yet allows the generation of tighter bounds. The effects of allocating instructions and data to the scratchpad using a dynamic allocation algorithm are shown in this work for the first time. This allocation technique both outperforms the cache and leads to better timing predictability, making scratchpad memories a natural choice for timing constrained embedded systems.
Advances in main memory technology include the availability of memory chips with integrated power management. The first optimization targeting main memories exploits these features by allocating memory objects to a scratchpad partition in order to allow the main memory to be put into power down mode whenever instructions and data are being accessed from the scratchpad memory. The allocation problem uses the standby energy of the main memory in SDRAM technology to allocate objects to the scratchpad memory so as to maximize the power down periods of the main memory. Total energy savings of up to 80% were achieved. In the second main memory optimization, suitable Flash memories are being used as instruction memories using eXecute-In-Place (XIP). By considering the tradeoff between the overhead required to copy instructions to the faster SDRAM and the benefits achieved due to the faster execution, the compiler determines an optimal allocation of instructions to Flash and SDRAM memories. The main benefit of this approach is significant savings in the required amount of instruction memory in SDRAM technology, one of the main cost factors for embedded systems.
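The Flash versus SDRAM trade-off can be sketched as a per-function comparison of copy overhead against execution benefit. This is a simplified greedy rule under assumed cost parameters, not the optimal allocation computed by the compiler described in the book; the function name and all cycle counts are hypothetical.

```python
def place_function(
    size_bytes: int,
    fetches: int,
    flash_cycles: int = 4,          # assumed cycles per fetch from Flash (XIP)
    sdram_cycles: int = 1,          # assumed cycles per fetch from SDRAM
    copy_cycles_per_byte: int = 2,  # assumed one-time cost to copy to SDRAM
) -> str:
    """Return "sdram" if copying the function pays off, else "flash"."""
    copy_cost = size_bytes * copy_cycles_per_byte
    gain = fetches * (flash_cycles - sdram_cycles)
    # Copy only when the cycles saved by faster fetches exceed the copy cost.
    return "sdram" if gain > copy_cost else "flash"

# A rarely executed function stays in Flash; a hot one is copied to SDRAM.
print(place_function(size_bytes=1000, fetches=100))
print(place_function(size_bytes=1000, fetches=10000))
```

Every function left in Flash also reduces the amount of SDRAM needed for instructions, which is the cost saving the abstract highlights.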
Finally, the influence of the size of the register file on the quality of the generated code is studied. It is shown that if the register file is too small, a lot of code overhead is generated due to the need to spill register values to memory. Besides presenting results for the spill code overhead, performance and energy dissipation of the generated code, a compiler-guided method to choose an adequate register file size for a given application is presented.

Introduction
Models and Tools
Scratchpad Memory Optimizations
Main Memory Optimizations
Register File Optimization
Summary
Future Work

See also

Aho A.V. et al Compilers: Principles, Techniques, & Tools

Format: pdf
Size: 48.24 MB
Added: 3 January 2012

2nd edition. Addison Wesley, 2007. 1009 p. ISBN-10: 0321547985. This book provides the foundation for understanding the theory and practice of compilers. Revised and updated, it reflects the current state of compilation. Every chapter has been completely revised to reflect developments in software engineering, programming languages, and computer architecture that have occurred since 1986, when the last edition was published. The authors, recognizing th...

Aho A.V., Sethi R., Ullman J.D. Compilers: Principles, Techniques and Tools

Format: djvu
Size: 17.39 MB
Added: 3 January 2012

Addison Wesley, 2002. 811 p. ISBN:7115099162, 0201100886 Principles, Techniques, and Tools is a famous computer science textbook by Alfred V. Aho, Ravi Sethi, and Jeffrey D. Ullman about compiler construction. Although decades have passed since the publication of the first edition, it is widely regarded as the classic definitive compiler technology text. It is known as the Dragon Book because its covers depict a knight and a dragon in battle....

Kakde O.G. Algorithms for Compiler Design

Format: pdf
Size: 6.53 MB
Added: 8 February 2012

Publisher: Charles River Media, 2002, 334 pp. A compiler translates a high-level language program into a functionally equivalent low-level language program that can be understood and executed by the computer. Crucial to any computer system, effective compiler design is also one of the most complex areas of system development. Before any code for a modern compiler is even written, many programmers have difficulty with the high-level algorithms...

Rüthing O. Interacting Code Motion Transformations. Their Impact and Their Complexity

Format: pdf
Size: 1.56 MB
Added: 28 November 2011

Publisher: Springer, 1998, 223 pp. Is computing an experimental science? For the roots of program optimization the answer to this question raised by Robin Milner ten years ago is clearly yes: it all started with Donald Knuth’s extensive empirical study of Fortran programs. This benchmark-driven approach is still popular, but it has in the meantime been complemented by an increasing body of foundational work, based on varying idealizing assump...

Wilhelm R., Seidl H. Compiler Design. Virtual Machines

Format: pdf
Size: 1.12 MB
Added: 7 November 2011

Publisher: Springer, 2010, 196 pp. Compilers for high-level programming languages are software systems which are both large and complex. Nonetheless, they have particular characteristics that differentiate them from the majority of other software systems. Their functionality is (almost) completely well-defined. Ideally, there exist completely formal, or at least rather precise, specifications of the source and target languages. Often addition...