along the solution $X_S(t)$ of the evolution equation $\dot{X} = F(X, u, t)$ but without
solving this equation explicitly. Obviously, the exponent of (7.47) is nothing
but the deterministic performance integral (7.13) with free end conditions,
which means that the open loop stochastic control converges, as expected, to
the deterministic open loop control for vanishing diffusion coefficients. In this
special case, the tree approximation becomes a rigorous result. This statement
also means that the tree approach is a reasonable approximation for small
diffusion coefficients, i.e., for small stochastic perturbations of the system.
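This small-noise behavior can be checked numerically. The sketch below uses a hypothetical scalar system and quadratic cost, neither of which is taken from the text: for a fixed open loop schedule $u(t)$, the expected performance approaches the deterministic performance integral as the diffusion coefficient $D$ tends to zero, consistent with the limit discussed above.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical one-dimensional setup (not from the text):
# dX = (u(t) - X) dt + sqrt(2 D) dW with a fixed open loop
# schedule u(t) and a quadratic running cost.
dt, T = 0.01, 2.0
n = int(T / dt)

def u(t):
    return np.exp(-t)  # fixed open loop control schedule

def mean_cost(D, samples=500):
    """Average performance integral over noisy realizations."""
    total = 0.0
    for _ in range(samples):
        x, cost = 1.0, 0.0
        for i in range(n):
            t = i * dt
            cost += (x**2 + u(t)**2) * dt
            x += (u(t) - x) * dt + np.sqrt(2 * D * dt) * rng.standard_normal()
        total += cost
    return total / samples

# As D decreases, the expected cost approaches the deterministic
# performance integral obtained at D = 0.
for D in (0.1, 0.01, 0.001, 0.0):
    print(D, mean_cost(D))
```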
Finally, we remark that the Hamilton representation, discussed in
Sect. 2.4.2, and Pontryagin's maximum principle, see Sect. 2.4.3, can also
be extended to the tree approximation of the stochastic control problem.
There also exist other approaches to the open loop control problem.
Popular alternatives are the expansion of the optimal performance and
the control functions in powers of small noise terms [7, 8] and the method of
stochastic integration [9].
7.3 Feedback Control
7.3.1 The Control Equation
Now we consider the situation that the controller has information about the
current state and the history, but not about the future evolution of the sto-
chastic terms. This is a typical feature of a feedback control. In the literature,
two general types of feedback control are discussed [11, 12, 13]. We speak
of complete observation if the controller has full information about the
current state $X(t)$ and its history. Otherwise, if the controller has only in-
formation about the history and the current values of a certain ‘observable’
part of the state, we denote this situation as the case of partial observation. In
the following we focus mainly on the complete observation case, while partial
observations will be considered in the subsequent chapters.
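To make this distinction concrete, consider the following sketch; the scalar dynamics, the schedule, and the gain are illustrative assumptions, not taken from the text. An open loop schedule depends on time alone, while a feedback law under complete observation is evaluated on the currently observed state; neither has access to future noise increments.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical scalar dynamics dX = (u - X) dt + sqrt(2 D) dW
# (illustrative only; not the system treated in the text).
D, dt, T = 0.1, 0.01, 5.0
n = int(T / dt)

def simulate(control, x0=1.0):
    """Euler-Maruyama integration under a given control law.

    control(t, x) may use the observed current state x (feedback,
    complete observation) or ignore it (open loop); in either case
    it never sees future noise increments.
    """
    x = x0
    for i in range(n):
        t = i * dt
        u = control(t, x)
        x += (u - x) * dt + np.sqrt(2 * D * dt) * rng.standard_normal()
    return x

# Open loop: the schedule u(t) is fixed before the noise is realized.
x_open = simulate(lambda t, x: np.exp(-t))
# Feedback: the control reacts to the observed state X(t).
x_fb = simulate(lambda t, x: -2.0 * x)
print(x_open, x_fb)
```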
We suppose that the current time is τ. As a criterion to be minimized
we now use the expected value of the future performance with respect to the
current initial state $X(\tau) = Y$. Thus (7.15) now takes the concrete form
\[
J[Y, \tau, u, T]
= \left\langle \int_\tau^T dt\, \phi(t, X(t), u(t)) \right\rangle_{X(\tau)=Y}
= \int_\tau^T dt \int dX\, \phi(t, X, u(t))\, p_u(X, t \mid Y, \tau)\,, \qquad (7.48)
\]
where we have supposed that each feedback control $u$ corresponds to a tran-
sition probability $p_u(X, t \mid Y, \tau)$ from the state $Y$ at time $\tau$ to the state $X$
at time $t$ in the presence of the control law $u$.
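Since the transition probability $p_u(X, t \mid Y, \tau)$ is rarely available in closed form, the expectation in (7.48) can be estimated by sampling: simulate the controlled process repeatedly from $X(\tau) = Y$ and average the accumulated running cost. A minimal Monte Carlo sketch follows, where the linear dynamics, the quadratic cost $\phi$, and the feedback gain are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative assumptions (not from the text): linear dynamics
# dX = (u - X) dt + sqrt(2 D) dW and a quadratic running cost.
D, dt = 0.05, 0.01

def phi(t, x, u):
    return x**2 + u**2  # running cost phi(t, X, u)

def J_estimate(Y, tau, T, control, samples=2000):
    """Monte Carlo estimate of J[Y, tau, u, T] in the sense of (7.48):
    average the performance integral over paths started at X(tau) = Y."""
    n = int(round((T - tau) / dt))
    total = 0.0
    for _ in range(samples):
        x, cost = Y, 0.0
        for i in range(n):
            t = tau + i * dt
            u = control(t, x)
            cost += phi(t, x, u) * dt
            x += (u - x) * dt + np.sqrt(2 * D * dt) * rng.standard_normal()
        total += cost
    return total / samples

# Evaluate a simple linear feedback law u = -0.5 X from Y = 1 at tau = 0.
print(J_estimate(Y=1.0, tau=0.0, T=2.0, control=lambda t, x: -0.5 * x))
```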
The characteristic structure of the feedback control applied at time $t$ is given by