
CHAPTER 1. CALCULUS REVIEW. OPTIONS.
In practice, the partial derivative $\frac{\partial f}{\partial x_i}(x)$ is computed by considering the variables $x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n$ to be fixed, and differentiating $f(x)$ as a function of one variable $x_i$.
A compact formula for (1.35) can be given as follows: Let $e_i$ be the vector with all entries equal to $0$ with the exception of the $i$-th entry, which is equal to $1$, i.e., $e_i(j) = 0$, for $j \neq i$, $1 \leq j \leq n$, and $e_i(i) = 1$. Then,

\[
\frac{\partial f}{\partial x_i}(x) = \lim_{h \to 0} \frac{f(x + h e_i) - f(x)}{h}.
\]
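The limit above can be approximated numerically by choosing a small but nonzero step $h$. The following is a minimal sketch, not a definitive implementation: the helper name `partial_derivative`, the step size `h = 1e-6`, and the test function are illustrative assumptions, and indices are 0-based as is usual in Python.

```python
import math

def partial_derivative(f, x, i, h=1e-6):
    """Approximate the partial derivative of f with respect to x[i] (0-based)."""
    n = len(x)
    # e_i has all entries equal to 0 except the i-th entry, which is 1
    e_i = [1.0 if j == i else 0.0 for j in range(n)]
    # evaluate f at x + h * e_i; all other variables stay fixed
    x_plus = [x[j] + h * e_i[j] for j in range(n)]
    return (f(x_plus) - f(x)) / h

def f(x):
    # illustrative function: f(x1, x2) = x1^2 * x2 + sin(x2)
    return x[0] ** 2 * x[1] + math.sin(x[1])

point = [1.0, 2.0]
df_dx1 = partial_derivative(f, point, 0)  # exact value: 2 * x1 * x2 = 4
df_dx2 = partial_derivative(f, point, 1)  # exact value: x1^2 + cos(x2)
```

The finite step introduces an error proportional to `h`, so the computed values agree with the exact ones only approximately.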
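Partial derivatives of higher order are defined similarly. For example, the second order partial derivative of $f(x)$ first with respect to $x_i$ and then with respect to $x_j$, with $j \neq i$, is denoted by $\frac{\partial^2 f}{\partial x_j \partial x_i}(x)$ and is equal to

\[
\frac{\partial^2 f}{\partial x_j \partial x_i}(x) = \frac{\partial}{\partial x_j} \left( \frac{\partial f}{\partial x_i}(x) \right),
\]

while the second and third partial derivatives of $f(x)$ with respect to $x_i$ are denoted by $\frac{\partial^2 f}{\partial x_i^2}(x)$ and $\frac{\partial^3 f}{\partial x_i^3}(x)$, respectively, and are given by

\[
\frac{\partial^2 f}{\partial x_i^2}(x) = \frac{\partial}{\partial x_i} \left( \frac{\partial f}{\partial x_i}(x) \right), \qquad
\frac{\partial^3 f}{\partial x_i^3}(x) = \frac{\partial}{\partial x_i} \left( \frac{\partial^2 f}{\partial x_i^2}(x) \right).
\]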
The order in which the partial derivatives of a given function are computed might make a difference, i.e., the partial derivative of $f(x)$ first with respect to $x_i$ and then with respect to $x_j$, with $j \neq i$, is not necessarily equal to the partial derivative of $f(x)$ first with respect to $x_j$ and then with respect to $x_i$. However, this is not the case if the function is smooth enough:
Theorem 1.9. If all the partial derivatives of order $k$ of the function $f(x)$ exist and are continuous, then the order in which partial derivatives of $f(x)$ of order at most $k$ are computed does not matter.
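As a brief illustration of Theorem 1.9, consider a polynomial, for which partial derivatives of every order exist and are continuous. The function below is chosen only as an example and does not come from the text.

```latex
\[
f(x_1, x_2) = x_1^2 x_2^3
\]
\[
\frac{\partial f}{\partial x_1}(x) = 2 x_1 x_2^3,
\qquad
\frac{\partial^2 f}{\partial x_2 \partial x_1}(x)
  = \frac{\partial}{\partial x_2} \left( 2 x_1 x_2^3 \right) = 6 x_1 x_2^2
\]
\[
\frac{\partial f}{\partial x_2}(x) = 3 x_1^2 x_2^2,
\qquad
\frac{\partial^2 f}{\partial x_1 \partial x_2}(x)
  = \frac{\partial}{\partial x_1} \left( 3 x_1^2 x_2^2 \right) = 6 x_1 x_2^2
\]
```

Both orders of differentiation give $6 x_1 x_2^2$, as the theorem predicts.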
Definition 1.3. Let $f : \mathbb{R}^n \to \mathbb{R}$ be a function of $n$ variables and assume that $f(x)$ is differentiable with respect to all variables $x_i$, $i = 1 : n$. The gradient $Df(x)$ of the function $f(x)$ is the following row vector of size $n$:

\[
Df(x) = \left( \frac{\partial f}{\partial x_1}(x) \;\; \frac{\partial f}{\partial x_2}(x) \;\; \cdots \;\; \frac{\partial f}{\partial x_n}(x) \right).
\tag{1.36}
\]
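The gradient (1.36) can be approximated entry by entry with finite differences. The sketch below assumes a central difference with step `h = 1e-6`; the helper name `gradient` and the test function are hypothetical and serve only as an illustration.

```python
def gradient(f, x, h=1e-6):
    """Approximate Df(x) as a list of n central-difference partial derivatives."""
    grad = []
    for i in range(len(x)):
        # perturb only the i-th variable, keeping the others fixed
        x_plus = list(x)
        x_plus[i] += h
        x_minus = list(x)
        x_minus[i] -= h
        grad.append((f(x_plus) - f(x_minus)) / (2 * h))
    return grad

def f(x):
    # illustrative function: f(x1, x2, x3) = x1 * x2 + x3^2
    return x[0] * x[1] + x[2] ** 2

# exact gradient is (x2, x1, 2 * x3), i.e. (2, 1, 6) at the point (1, 2, 3)
grad = gradient(f, [1.0, 2.0, 3.0])
```

The result is a plain list of length $n$, mirroring the row vector in the definition.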
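Definition 1.4. Let $f : \mathbb{R}^n \to \mathbb{R}$ be a function of $n$ variables. The Hessian of $f(x)$ is denoted by $D^2 f(x)$ and is defined as the following $n \times n$ matrix:

\[
D^2 f(x) =
\begin{pmatrix}
\frac{\partial^2 f}{\partial x_1^2}(x) & \frac{\partial^2 f}{\partial x_2 \partial x_1}(x) & \cdots & \frac{\partial^2 f}{\partial x_n \partial x_1}(x) \\
\frac{\partial^2 f}{\partial x_1 \partial x_2}(x) & \frac{\partial^2 f}{\partial x_2^2}(x) & \cdots & \frac{\partial^2 f}{\partial x_n \partial x_2}(x) \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial^2 f}{\partial x_1 \partial x_n}(x) & \frac{\partial^2 f}{\partial x_2 \partial x_n}(x) & \cdots & \frac{\partial^2 f}{\partial x_n^2}(x)
\end{pmatrix}.
\tag{1.37}
\]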
Other commonly used notations for the gradient and Hessian of $f(x)$ are $\nabla f(x)$ and $Hf(x)$, respectively. We will use $Df(x)$ and $D^2 f(x)$ for the gradient and Hessian of $f(x)$, respectively, unless otherwise specified.
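The Hessian $D^2 f(x)$ can likewise be approximated numerically. The following sketch assumes a four-point central difference with step `h = 1e-4`; the names `hessian` and `shifted` and the test function are illustrative assumptions, not part of the text.

```python
def hessian(f, x, h=1e-4):
    """Approximate the n x n Hessian of f at x with central differences."""
    n = len(x)

    def shifted(i, si, j, sj):
        # copy of x with x[i] moved by si * h and x[j] moved by sj * h
        y = list(x)
        y[i] += si * h
        y[j] += sj * h
        return y

    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # four-point estimate of the second order partial derivative
            # with respect to x[i] and x[j]; for i == j it reduces to a
            # standard second difference with step 2 * h
            H[i][j] = (f(shifted(i, 1, j, 1)) - f(shifted(i, 1, j, -1))
                       - f(shifted(i, -1, j, 1)) + f(shifted(i, -1, j, -1))) / (4 * h * h)
    return H

def f(x):
    # illustrative function: f(x1, x2) = x1^3 * x2 + x2^2
    return x[0] ** 3 * x[1] + x[1] ** 2

# exact Hessian at (1, 2): [[6 * x1 * x2, 3 * x1^2], [3 * x1^2, 2]] = [[12, 3], [3, 2]]
H = hessian(f, [1.0, 2.0])
```

For a smooth function such as this polynomial, the off-diagonal entries agree up to rounding, consistent with Theorem 1.9.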
Vector Valued Functions
A function that takes values in a multidimensional space is called a vector valued function. Let $F : \mathbb{R}^n \to \mathbb{R}^m$ be a vector valued function given by

\[
F(x) = \begin{pmatrix} f_1(x) \\ f_2(x) \\ \vdots \\ f_m(x) \end{pmatrix}.
\]
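Definition 1.5. Let $F : \mathbb{R}^n \to \mathbb{R}^m$ be given by $F(x) = (f_j(x))_{j=1:m}$, and assume that the functions $f_j(x)$, $j = 1 : m$, are differentiable with respect to all variables $x_i$, $i = 1 : n$. The gradient $DF(x)$ of the function $F(x)$ is the following matrix of size $m \times n$:

\[
DF(x) =
\begin{pmatrix}
\frac{\partial f_1}{\partial x_1}(x) & \frac{\partial f_1}{\partial x_2}(x) & \cdots & \frac{\partial f_1}{\partial x_n}(x) \\
\frac{\partial f_2}{\partial x_1}(x) & \frac{\partial f_2}{\partial x_2}(x) & \cdots & \frac{\partial f_2}{\partial x_n}(x) \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial f_m}{\partial x_1}(x) & \frac{\partial f_m}{\partial x_2}(x) & \cdots & \frac{\partial f_m}{\partial x_n}(x)
\end{pmatrix}.
\tag{1.38}
\]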
If $F : \mathbb{R}^n \to \mathbb{R}^n$, then the gradient $DF(x)$ is a square matrix of size $n$.
The $j$-th row of the gradient matrix $DF(x)$ is equal to the gradient $Df_j(x)$ of the function $f_j(x)$, $j = 1 : m$; cf. (1.36) and (1.38). Therefore,

\[
DF(x) = \begin{pmatrix} Df_1(x) \\ Df_2(x) \\ \vdots \\ Df_m(x) \end{pmatrix}.
\]
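The row-by-row structure of $DF(x)$ suggests a direct way to approximate it: compute the gradient of each component function and stack the results. The sketch below assumes central differences with step `h = 1e-6`; the helper names `gradient` and `jacobian` and the component functions are hypothetical, chosen for illustration.

```python
import math

def gradient(f, x, h=1e-6):
    """Approximate the row vector Df(x) with central differences."""
    grad = []
    for i in range(len(x)):
        x_plus = list(x)
        x_plus[i] += h
        x_minus = list(x)
        x_minus[i] -= h
        grad.append((f(x_plus) - f(x_minus)) / (2 * h))
    return grad

def jacobian(components, x, h=1e-6):
    """Stack the gradients Df_j(x), j = 1 : m, as the rows of an m x n matrix."""
    return [gradient(f_j, x, h) for f_j in components]

# illustrative F(x1, x2) = (x1 * x2, x1 + x2^2, sin(x1)), so m = 3 and n = 2
components = [
    lambda x: x[0] * x[1],
    lambda x: x[0] + x[1] ** 2,
    lambda x: math.sin(x[0]),
]

# exact DF at (1, 2): [[2, 1], [1, 4], [cos(1), 0]]
DF = jacobian(components, [1.0, 2.0])
```

The result has $m = 3$ rows and $n = 2$ columns, matching the $m \times n$ size stated in Definition 1.5.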