InTech Publishing, 2011, 434 pp.
Brains rule the world, and brain-like computation is increasingly used in computers and electronic devices. Brain-like computation is about processing and interpreting data or directly putting forward and performing actions. Learning is a very important aspect. This book is on reinforcement learning, which involves performing actions to achieve a goal. Two other learning paradigms exist. Supervised learning has initially been successful in prediction and classification tasks, but is not brain-like. Unsupervised learning is about understanding the world by passively mapping or clustering given data according to some ordering principles, and is associated with the cortex in the brain. In reinforcement learning an agent learns by trial and error to perform an action to receive a reward, thereby yielding a powerful method to develop goal-directed action strategies. It is predominantly associated with the basal ganglia in the brain.
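The trial-and-error learning described above can be illustrated with a minimal sketch (our own illustration, not taken from the book): tabular Q-learning on a hypothetical five-state chain, where the agent discovers by exploration that moving right leads to a reward.

```python
import random

# Minimal Q-learning sketch on a 5-state chain (illustrative assumption,
# not from the book): the agent starts at state 0 and earns a reward of 1
# only upon reaching state 4.
N_STATES = 5          # states 0..4; state 4 is terminal and rewarding
ACTIONS = [-1, +1]    # move left or move right
ALPHA, GAMMA, EPS = 0.5, 0.9, 0.1

Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(state, action):
    nxt = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward

random.seed(0)
for episode in range(200):
    s = 0
    while s != N_STATES - 1:
        # epsilon-greedy: explore occasionally, otherwise exploit current values
        if random.random() < EPS:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
        s2, r = step(s, a)
        # update toward the reward plus the discounted best future value
        Q[(s, a)] += ALPHA * (r + GAMMA * max(Q[(s2, b)] for b in ACTIONS) - Q[(s, a)])
        s = s2

# After training, the greedy action in every non-terminal state is "move right"
print(all(Q[(s, +1)] > Q[(s, -1)] for s in range(N_STATES - 1)))
```

The update rule is the standard Q-learning rule; the chain environment, step counts and hyperparameters are arbitrary choices for the sake of a small runnable example.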
The first 11 chapters of this book, Theory, describe and extend the scope of reinforcement learning. The remaining 11 chapters, Applications, show that there is already wide usage in numerous fields. Reinforcement learning can tackle control tasks that are too complex for traditional, hand-designed, non-learning controllers. As learning computers can deal with technical complexities, the task of human operators remains to specify goals at increasingly higher levels.
This book shows that reinforcement learning is a very dynamic area in terms of theory and applications, and it should stimulate and encourage new research in this field. We would like to thank all contributors to this book for their research and effort.
Summary of Theory:
Chapters 1 and 2 create a link to supervised and unsupervised learning, respectively, by regarding reinforcement learning as a prediction problem, and chapter 3 looks at fuzzy control with a reinforcement-based genetic algorithm. Reinforcement algorithms are modified in chapter 4 for future parallel and quantum computing, and in chapter 5 for a more general class of state-action spaces, described by grammars. Then follow biological views: chapter 6 shows how reinforcement learning occurs at the single-neuron level by considering the interaction between a spatio-temporal learning rule and Hebbian learning, and in the global brain view of chapter 7, unsupervised learning is depicted as a means of data pre-processing and arrangement for reinforcement algorithms. A table presents a ready-to-implement description of standard reinforcement learning algorithms. The following chapters consider multi-agent systems, where a single agent has only a partial view of the entire system. Multiple agents can work cooperatively on a common goal, as considered in chapter 8, or rewards can be individual but interdependent, such as in game play, as considered in chapters 9, 10 and 11.
Summary of Applications:
Chapter 12 continues with game applications, where a RoboCup Middle Size League robot learns a strategic soccer move. A dialogue manager for man-machine dialogues in chapter 13 interacts with humans through communication and database queries, dependent on interaction strategies that govern the Markov decision processes. Chapters 14, 15, 16 and 17 tackle control problems that may seem typical for classical methods of control such as PID controllers and hand-set rules. However, traditional methods fail if the systems are too complex or time-varying, if knowledge of the state is imprecise, or if there are multiple objectives. These chapters report examples of applications that are tackled only with reinforcement learning, such as water allocation improvement, building environmental control, chemical processing and industrial process control. The reinforcement-controlled systems may continue learning during operation. The next three chapters involve path optimization. In chapter 18, internet routers explore different links to find more optimal routes to a destination address. Chapter 19 deals with optimizing a travel sequence with respect to both time and distance. Chapter 20 proposes an untypical application of path optimization: a path from a given pattern to a target pattern provides a distance measure. An unclassified medical image can thereby be classified depending on whether a path from it is shorter to an image of healthy or unhealthy tissue, specifically considering lung nodule classification using 3D geometric measures extracted from Computerized Tomography (CT) images of the lung lesions. Chapter 21 presents a physicians' decision support system for diagnosis and treatment, involving a knowledge-base server. In chapter 22 a reinforcement learning sub-module improves the efficiency of message exchange in a decision support system for air traffic management.
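The path-optimization idea running through these chapters can be sketched in miniature (the graph and its costs are our own illustrative assumptions, not data from the book): Q-learning on a small directed graph learns the cheapest route between two nodes, much as a router explores links to a destination.

```python
import random

# Hedged sketch of path optimization by reinforcement learning: an agent
# learns the minimum-cost route from node 'A' to node 'D' on a toy graph.
GRAPH = {                      # directed edges with link costs (assumed example)
    'A': {'B': 1, 'C': 4},
    'B': {'C': 1, 'D': 5},
    'C': {'D': 1},
    'D': {},
}
ALPHA, GAMMA, EPS = 0.5, 1.0, 0.2
Q = {(u, v): 0.0 for u in GRAPH for v in GRAPH[u]}  # expected cost-to-go per edge

random.seed(1)
for _ in range(500):
    node = 'A'
    while node != 'D':
        nbrs = list(GRAPH[node])
        # epsilon-greedy over neighbors; costs are minimized, not maximized
        if random.random() < EPS:
            nxt = random.choice(nbrs)
        else:
            nxt = min(nbrs, key=lambda v: Q[(node, v)])
        cost = GRAPH[node][nxt]
        future = min((Q[(nxt, w)] for w in GRAPH[nxt]), default=0.0)
        # update toward the link cost plus the cheapest onward route
        Q[(node, nxt)] += ALPHA * (cost + GAMMA * future - Q[(node, nxt)])
        node = nxt

# Read off the greedy route after training
route, node = ['A'], 'A'
while node != 'D':
    node = min(GRAPH[node], key=lambda v: Q[(node, v)])
    route.append(node)
print(route)
```

Here the minimum-cost path is A-B-C-D with total cost 3, and the learned greedy route recovers it; the same minimize-cost variant of the Q-learning update underlies adaptive routing schemes.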
Neural Forecasting Systems
Reinforcement learning in system identification
Reinforcement Evolutionary Learning for Neuro-Fuzzy Controller Design
Superposition-Inspired Reinforcement Learning and Quantum Reinforcement Learning
An Extension of Finite-state Markov Decision Process and an Application of Grammatical Inference
Interaction between the Spatio-Temporal Learning Rule (non Hebbian) and Hebbian in Single Cells: A cellular mechanism of reinforcement learning
Reinforcement Learning Embedded in Brains and Robots
Decentralized Reinforcement Learning for the Online Optimization of Distributed System
Multi-Automata Learning
Abstraction for Genetics-based Reinforcement Learning
Dynamics of the Bush-Mosteller learning algorithm in 2x2 games
Modular Learning Systems for Behavior Acquisition in Multi-Agent Environment
Optimising Spoken Dialogue Strategies within the Reinforcement Learning Paradigm
Water Allocation Improvement in River Basin Using Adaptive Neural Fuzzy Reinforcement Learning Approach
Reinforcement Learning for Building Environmental Control
Model-Free Learning Control of Chemical Processes
Reinforcement Learning-Based Supervisory Control Strategy for a Rotary Kiln Process
Inductive Approaches based on Trial/Error Paradigm for Communications Network
The Allocation of Time and Location Information to Activity-Travel Sequence Data by means of Reinforcement Learning
Application on Reinforcement Learning for Diagnosis based on Medical Image
RL based Decision Support System for u-Healthcare Environment
Reinforcement Learning to Support Meta-Level Control in Air Traffic Management