Roy K.K. Potential theory in applied geophysics


17.7 Singular Value Decomposition (SVD) 579

$$d_i = \int_{0}^{\infty} G_i(r)\, m(r)\, dr \tag{17.44}$$

where $d_i$ is the data in the data space, $m(r)$ is the model at a radial distance $r$, and $G_i(r)$ is the kernel function. This function is also called the Fréchet kernel or the Green's function. This continuous Fredholm integral can be written in the discrete form as

$$d_i = \sum_{j} G_{ij}(r)\, m_j(r). \tag{17.45}$$

Equation (17.45) can be written in matrix form as

$$d = Gm \tag{17.46}$$

where $d$ is an $n \times 1$ column vector of $n$ data points, $m$ is the $m \times 1$ column vector of model parameters, and $G$ is a rectangular matrix, the linear differential operator or linear operator that connects the data space to the model space. Here

$$d = \{d_1, d_2, \ldots, d_n\}^{T} \quad \text{for } i = 1, \ldots, n \tag{17.47}$$

and

$$m = \{m_1, m_2, \ldots, m_m\}^{T} \quad \text{for } j = 1, \ldots, m \tag{17.48}$$

where $T$ is the transpose, which interchanges row and column vectors.
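The passage from the integral (17.44) to the matrix form (17.46) can be sketched numerically with a quadrature rule. This is only an illustration: the exponential kernel `kernel`, the model `m`, the truncated range 0 to 10, and the grid size are assumptions made for the example and do not come from the text.

```python
import numpy as np

def kernel(i, r):
    """Hypothetical kernel G_i(r) = exp(-(i + 1) r), chosen only for illustration."""
    return np.exp(-(i + 1) * r)

def build_G(n_data, r):
    """Discretise d_i = integral of G_i(r) m(r) dr with the trapezoidal rule.

    Row i holds G_i(r_j) multiplied by the quadrature weight w_j, so that
    G @ m approximates the integral for every datum at once.
    """
    w = np.empty_like(r)
    w[1:-1] = 0.5 * (r[2:] - r[:-2])   # interior trapezoid weights
    w[0] = 0.5 * (r[1] - r[0])
    w[-1] = 0.5 * (r[-1] - r[-2])
    return np.array([kernel(i, r) * w for i in range(n_data)])

r = np.linspace(0.0, 10.0, 2001)   # truncated stand-in for the range 0..infinity
m = np.exp(-r)                     # assumed model m(r)
G = build_G(4, r)                  # 4 data, 2001 model samples
d = G @ m                          # matrix form d = G m
```

For this particular kernel and model the integral has the closed form 1/(i + 2), which gives a quick check on the discretisation.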

Let

$$d = Gm. \tag{17.49}$$

It is taken to another plane where the connecting equation between the data and the model space is given by

$$d' = G'm'. \tag{17.50}$$

The connecting link is established by multiplying both sides of (17.49) by $u^{T}$, where $u$ is the eigenvector matrix in the n-space or data space. Since the eigenvector matrix is an orthogonal matrix, we can write

$$d' = u^{T} d \tag{17.51}$$

$$m' = u^{T} m. \tag{17.52}$$

From (17.51) and (17.52) we can write

$$u d' = G u m' \tag{17.53}$$

$$\Rightarrow u^{T} u d' = u^{T} G u m' \tag{17.54}$$

$$\Rightarrow d' = u^{T} G u m' \tag{17.55}$$

since

$$G u = u \lambda \tag{17.56}$$

where $\lambda$ and $u$ are respectively the eigenvalue and eigenvector matrices. Hence

580 17 Inversion of Potential Field Data

$$u^{T} G u = u^{T} u \lambda = \lambda. \tag{17.57}$$

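Therefore

$$d' = \lambda m'. \tag{17.58}$$

So a square matrix can be changed to a diagonal eigenvalue matrix in the transformed plane. This is the principal axis transformation. We can write

$$m' = \lambda^{-1} d' \tag{17.59}$$

$$\Rightarrow u^{T} m = \lambda^{-1} u^{T} d \;\Rightarrow\; u u^{T} m = u \lambda^{-1} u^{T} d \;\Rightarrow\; m = u \lambda^{-1} u^{T} d. \tag{17.60}$$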
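The principal axis transformation leading to (17.60) can be sketched for a small square system. The 3 × 3 matrix and the model below are assumed values; the matrix is taken symmetric so that its eigenvector matrix is orthogonal, as the derivation requires.

```python
import numpy as np

# Assumed symmetric, well-conditioned 3 x 3 operator and model.
G = np.array([[4.0, 1.0, 0.5],
              [1.0, 3.0, 0.2],
              [0.5, 0.2, 2.0]])
m_true = np.array([1.0, -2.0, 0.5])
d = G @ m_true

lam, u = np.linalg.eigh(G)   # G u = u diag(lam), with u orthogonal
d_prime = u.T @ d            # (17.51): data in the rotated frame
m_prime = d_prime / lam      # (17.59): m' = lambda^-1 d'
m_est = u @ m_prime          # (17.60): m = u lambda^-1 u^T d
```

In the rotated frame the system is diagonal, so the inversion reduces to an element-wise division by the eigenvalues.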

This is the generalized inverse of a square matrix. For a rectangular system ($n \times m$)

$$d = u d', \quad \text{or} \quad d' = u^{T} d \tag{17.61}$$

and

$$m = \nu m', \quad \text{or} \quad m' = \nu^{T} m. \tag{17.62}$$

Therefore,

$$d = Gm \;\Rightarrow\; u d' = G \nu m' \;\Rightarrow\; u^{T} u d' = u^{T} G \nu m' \;\Rightarrow\; d' = u^{T} G \nu m' \tag{17.63}$$

where $\nu$ is the eigenvector matrix for the m-space or parameter space. In an $n \times m$ system, $u$ and $\nu$ are the eigenvector matrices for the n and m spaces respectively.

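Since $G\nu = u\lambda$ in an $n \times m$ system (Lanczos 1941), we can write

$$u^{T} G \nu = u^{T} u \lambda = \lambda \tag{17.64}$$

$$\Rightarrow u u^{T} G \nu = u \lambda \tag{17.65}$$

$$\Rightarrow G \nu \nu^{T} = u \lambda \nu^{T} \;\Rightarrow\; G = u \lambda \nu^{T}. \tag{17.66}$$

Since both $u$ and $\nu$ are orthogonal matrices, the generalized inverse is

$$G^{-1} = \nu \lambda^{-1} u^{T}. \tag{17.67}$$

It is used for inversion of geophysical data.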
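A minimal numerical sketch of the decomposition (17.66) and the generalized inverse (17.67), using NumPy's `numpy.linalg.svd`. The random 6 × 3 matrix and the model vector are assumptions for the example only.

```python
import numpy as np

rng = np.random.default_rng(0)
G = rng.normal(size=(6, 3))            # assumed: n = 6 data, m = 3 parameters
m_true = np.array([2.0, -1.0, 0.5])
d = G @ m_true                         # noise-free synthetic data

# Thin SVD: G = u diag(lam) nu^T, with u of size 6 x 3 and nu of size 3 x 3.
u, lam, nu_T = np.linalg.svd(G, full_matrices=False)

G_inv = nu_T.T @ np.diag(1.0 / lam) @ u.T   # (17.67): nu lambda^-1 u^T
m_est = G_inv @ d
```

When all eigenvalues are nonzero, this matrix agrees with the Moore-Penrose pseudo-inverse returned by `numpy.linalg.pinv`.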


For an arbitrary $n \times m$ system we can have two matrix equations

$$\underset{n \times m}{A}\;\underset{m \times 1}{y} = \underset{n \times 1}{b} \tag{17.68}$$

and

$$\underset{m \times n}{A^{T}}\;\underset{n \times 1}{x} = \underset{m \times 1}{c} \tag{17.69}$$

where (17.69) is the adjoint of (17.68) (Lanczos 1941). In (17.68), $A$ has n rows and m columns and transforms the column vector $y$ of m components into the column vector $b$ of n components. The matrix $A^{T}$ of (17.69) has m rows and n columns. The vectors $x$ and $c$ are in a reciprocity relation to the vectors $y$ and $b$: $x$ and $b$ are vectors in the n-space, and $y$ and $c$ are vectors in the m-space. Lanczos (1941) has shown that combining these two equations and transferring to a larger $(n + m)$ space gives the matrix equation

$$Sz = a. \tag{17.70}$$

Figure 17.8 shows the $(n + m)$ system and the locations of $A$, $A^{T}$, $y$, $b$, $x$ and $c$. The eigenvalue equation of this $(n + m) \times (n + m)$ square matrix is

$$S\omega = \lambda\omega. \tag{17.71}$$

This eigenvalue matrix equation separates into a pair of eigenvalue equations for an $n \times m$ system. These are

$$A\nu = \lambda u, \qquad A^{T} u = \lambda \nu. \tag{17.72}$$

These equations are termed shifted eigenvalue equations, where $u$ and $\nu$ are the eigenvectors in the n and m spaces respectively and $\lambda$ is the diagonal matrix of eigenvalues. Here $u$ and $\nu$ on the right-hand side have shifted their positions. If we premultiply the first equation by $A^{T}$ and the second equation by $A$, we get

$$AA^{T} u = \lambda^{2} u, \qquad A^{T}A\nu = \lambda^{2} \nu. \tag{17.73}$$

Fig. 17.8. Matrix in an (n + m) space; shifted eigenvalue problem (Lanczos 1941)

Equation (17.73) is a regular eigenvalue problem. Here $AA^{T}$ and $A^{T}A$ are square symmetric matrices in the n and m spaces respectively. It is interesting to note that the significant eigenvalues of the $n \times n$ and $m \times m$ matrices are the same. In other words, the $n \times n$ matrix has only m significant eigenvalues, and they are equal to those of the $m \times m$ system. The remaining $(n - m)$ eigenvalues of $AA^{T}$ are trivial, and the sum of the eigenvalues is exactly the same for both matrices.
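The statement about shared eigenvalues can be checked numerically. The sketch below uses an assumed random 5 × 3 matrix (n = 5, m = 3) and compares the eigenvalues of the two symmetric products with the squared singular values of A.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(5, 3))                   # assumed: n = 5, m = 3

eig_n = np.linalg.eigvalsh(A @ A.T)[::-1]     # 5 eigenvalues of A A^T, descending
eig_m = np.linalg.eigvalsh(A.T @ A)[::-1]     # 3 eigenvalues of A^T A, descending
lam = np.linalg.svd(A, compute_uv=False)      # singular values of A, descending

print("A A^T :", np.round(eig_n, 6))
print("A^T A :", np.round(eig_m, 6))
print("lam^2 :", np.round(lam**2, 6))
```

The first three eigenvalues of the 5 × 5 product coincide with those of the 3 × 3 product, the remaining two vanish to rounding error, and both sums equal the trace of either product.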

Most geophysical inverse problems are nonlinear. We linearise the nonlinear problem by truncating the higher-order terms of the Taylor series expansion. Let $G(P, x)$ be written as

$$G(P, x) = G(P_0, x) + \sum_{i=1}^{m} \frac{\partial G}{\partial P_i}\, \Delta P_i + \text{higher order terms} \tag{17.74}$$

$G$ is termed the gross earth functional, $P$ is a vector of unknown values, i.e., the parameters to be determined, and $x$ is a column vector of n known quantities. $x$ varies from subject to subject. For example, in dc resistivity sounding $x$ stands for electrode separation, in magnetotellurics $x$ stands for the periods of the MT signal, and in electromagnetic frequency sounding $x$ stands for frequency. Since $G(P_0, x)$ is computed from an initial choice of the model parameters, or a priori model, we can write (17.74) as

$$\Delta G = A\, \Delta P \tag{17.75}$$

where $\Delta P$ is the difference between the actual and the initial-choice model parameters, i.e.,

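$$\Delta P = \begin{bmatrix} m_1^{\text{True}} - m_1^{\text{Prior}} \\ m_2^{\text{True}} - m_2^{\text{Prior}} \\ \vdots \\ m_m^{\text{True}} - m_m^{\text{Prior}} \end{bmatrix}_{m \times 1}. \tag{17.76}$$

Here $\Delta P$ is an $(m \times 1)$ column vector in the m-space or model space and changes in successive iterations; m is the number of parameters to be modified.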

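The column vector $\Delta G$ is the difference between $d^{\text{Obs}}$ (observed), i.e., the field data, and $d^{\text{Pre}}$ (predicted), i.e., the synthetic data. $d^{\text{Pre}}$ is obtained from computation of the forward model. The column vector $\Delta G$ is

$$\Delta G = \begin{bmatrix} d_1^{\text{Obs}} - d_1^{\text{Pre}} \\ d_2^{\text{Obs}} - d_2^{\text{Pre}} \\ \vdots \\ d_n^{\text{Obs}} - d_n^{\text{Pre}} \end{bmatrix}_{n \times 1}. \tag{17.77}$$

It is an $n \times 1$ column vector in the D-space or data space. The connecting link between the two spaces is obtained from the $n \times m$ rectangular matrix $A_{n \times m}$. $A$ is termed the sensitivity matrix, the derivative matrix, or the linear differential operator in a linearisable problem. Here $A$ is

$$A = \begin{bmatrix} \dfrac{\partial G_1}{\partial P_1} & \cdots & \dfrac{\partial G_1}{\partial P_m} \\ \vdots & & \vdots \\ \dfrac{\partial G_n}{\partial P_1} & \cdots & \dfrac{\partial G_n}{\partial P_m} \end{bmatrix} \tag{17.78}$$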

where the derivatives of the gross earth functionals with respect to all the parameters must exist for the sensitivity matrix, and therefore the inverse problem, to exist. Fortunately, for most geophysical problems the Fréchet derivatives exist. For an overdetermined problem, from (17.75) we can write

$$\Delta P = A^{-g}\, \Delta G \tag{17.79}$$

where $A^{-g}$ is the generalized inverse. Using (17.67), we can write (17.79) as

$$\Delta P = \nu \lambda^{-1} u^{T}\, \Delta G. \tag{17.80}$$
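The linearised update (17.80) can be sketched as a short iteration. Everything specific here is an assumption for illustration: the two-parameter exponential forward model `forward`, the central finite-difference estimate of the sensitivity matrix, the starting model, and the fixed number of iterations.

```python
import numpy as np

def forward(P, x):
    """Hypothetical forward model G(P, x) = P[0] * exp(-P[1] * x)."""
    return P[0] * np.exp(-P[1] * x)

def sensitivity(P, x, h=1e-6):
    """Central finite-difference estimate of A[i, j] = dG_i/dP_j, as in (17.78)."""
    A = np.empty((x.size, P.size))
    for j in range(P.size):
        dP = np.zeros_like(P)
        dP[j] = h
        A[:, j] = (forward(P + dP, x) - forward(P - dP, x)) / (2.0 * h)
    return A

def svd_update(P, x, d_obs):
    """One model update dP = nu lam^-1 u^T dG, as in (17.80)."""
    dG = d_obs - forward(P, x)                 # observed minus predicted data
    u, lam, nu_T = np.linalg.svd(sensitivity(P, x), full_matrices=False)
    return nu_T.T @ ((u.T @ dG) / lam)

x = np.linspace(0.0, 4.0, 20)
P_true = np.array([2.0, 0.7])
d_obs = forward(P_true, x)                     # noise-free synthetic data

P = np.array([1.5, 0.5])                       # a priori model
for _ in range(10):
    P = P + svd_update(P, x, d_obs)            # relinearise and update
```

Because the forward model is nonlinear, the sensitivity matrix is recomputed at every iteration; with noise-free data and a starting model reasonably close to the answer, the updates shrink rapidly.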

Equation (17.80) is the basic equation for inversion of geophysical data using singular value decomposition (Lanczos 1941, Inman et al 1973, Glenn et al 1973). Here $\Delta P$ is the $(m \times 1)$ model modification vector, $\nu$ is the parameter eigenvector matrix in the m-space, $u$ is the data eigenvector matrix in the n-space, and $\lambda$ is the eigenvalue matrix, which remains the same in both data and model space. The matrix $(AA^{T})_{n \times n}$ is a square symmetric matrix with eigenvalues $\lambda^{2}$ and eigenvectors $u$. The matrix $(A^{T}A)_{m \times m}$ is a square symmetric matrix with the same eigenvalues $\lambda^{2}$ and eigenvectors $\nu$. The trace

$$\operatorname{Trace}(AA^{T}) = \operatorname{Trace}(A^{T}A) = \sum_{i=1}^{n} \lambda_i^{2} = \sum_{i=1}^{m} \lambda_i^{2}. \tag{17.81}$$

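Equation (17.80) with all nonzero eigenvalues of the system matrix is given by

$$\underset{m \times 1}{\Delta P} = \underset{m \times m}{\nu}\;\underset{m \times m}{\lambda^{-1}}\;\underset{m \times n}{u^{T}}\;\underset{n \times 1}{\Delta G}. \tag{17.82}$$

If $k < m$, then eliminating the zero eigenvalues we get

$$\underset{m \times 1}{\Delta P} = \underset{m \times k}{\nu}\;\underset{k \times k}{\lambda^{-1}}\;\underset{k \times n}{u^{T}}\;\underset{n \times 1}{\Delta G}. \tag{17.83}$$

Eliminating very small eigenvalues, which bring instability, we get

$$\underset{m \times 1}{\Delta P} = \underset{m \times q}{\nu}\;\underset{q \times q}{\lambda^{-1}}\;\underset{q \times n}{u^{T}}\;\underset{n \times 1}{\Delta G} \tag{17.84}$$

where $q$ $(q < k < m)$ is the number of significant eigenvalues. Although $u$ is an $n \times n$ square matrix, we take only as many rows of $u^{T}$ as there are nonzero eigenvalues. The higher the value of k, the better the quality of the inversion.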
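The effect of discarding very small eigenvalues, as in (17.84), can be sketched with a matrix built to have one nearly vanishing eigenvalue. The relative cut-off `rel_tol`, the synthetic matrix, and the small data error are assumptions for the example; the text leaves the choice of q to the interpreter.

```python
import numpy as np

def truncated_update(A, dG, rel_tol=1e-6):
    """dP = nu_q lam_q^-1 u_q^T dG, keeping eigenvalues above rel_tol * lam_max.

    rel_tol is an assumed cut-off, not a value taken from the text.
    """
    u, lam, nu_T = np.linalg.svd(A, full_matrices=False)
    q = int(np.sum(lam > rel_tol * lam[0]))     # number of significant eigenvalues
    dP = nu_T[:q].T @ ((u[:, :q].T @ dG) / lam[:q])
    return dP, q

# Assumed 5 x 3 sensitivity matrix assembled from a prescribed decomposition
# with eigenvalues 3, 1 and 1e-10.
rng = np.random.default_rng(2)
u_full, _ = np.linalg.qr(rng.normal(size=(5, 3)))    # orthonormal columns
nu_full, _ = np.linalg.qr(rng.normal(size=(3, 3)))   # orthogonal matrix
A = u_full @ np.diag([3.0, 1.0, 1e-10]) @ nu_full.T

dP_true = nu_full[:, 0] + 2.0 * nu_full[:, 1]        # lies in the well-resolved subspace
dG = A @ dP_true + 1e-6 * u_full[:, 2]               # tiny error along the weak direction

dP_trunc, q = truncated_update(A, dG)                # drops the 1e-10 eigenvalue
dP_full, _ = truncated_update(A, dG, rel_tol=0.0)    # keeps all three eigenvalues
```

Keeping the 1e-10 eigenvalue divides the small data error by it, so the untruncated update is dominated by the amplified error, while the truncated update stays close to `dP_true`.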

Both the data and parameter eigenvector matrices $u$ and $\nu$ are orthogonal matrices. Therefore we get

$$u u^{T} = I \tag{17.85}$$

and

$$\nu \nu^{T} = I. \tag{17.86}$$

Equation (17.85) is known as the information density matrix and (17.86) is known as the resolution matrix. Both are theoretically identity matrices. While dealing with field data one may get diagonally dominant matrices. The information density matrix indicates qualitatively the number of parameters that contribute towards the total signal. The resolution matrix gives a qualitative indication of the resolution of the parameters. In other words, when the diagonal values are unity, one gets the best resolution; for proper resolution the diagonal elements should not deviate significantly from unity.
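A short sketch of how the two matrices behave when only q eigenvectors are retained. It assumes the common convention that the matrices are built from the retained columns of u and ν only; the random 6 × 3 sensitivity matrix is also an assumption.

```python
import numpy as np

def density_and_resolution(A, q):
    """Return (u_q u_q^T, nu_q nu_q^T) built from the q largest eigenvalues of A."""
    u, lam, nu_T = np.linalg.svd(A, full_matrices=False)
    u_q, nu_q = u[:, :q], nu_T[:q].T
    return u_q @ u_q.T, nu_q @ nu_q.T

rng = np.random.default_rng(3)
A = rng.normal(size=(6, 3))                  # assumed sensitivity matrix

S_full, R_full = density_and_resolution(A, q=3)     # all eigenvalues retained
S_trunc, R_trunc = density_and_resolution(A, q=2)   # smallest eigenvalue dropped

print(np.round(np.diag(R_full), 3))          # resolution diagonal with q = 3
print(np.round(np.diag(R_trunc), 3))         # resolution diagonal with q = 2
```

With all three eigenvalues retained the resolution matrix is the identity; after truncation its trace equals q, so some diagonal elements necessarily fall below unity.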

17.8 Least Squares Estimator

In the least squares estimator problem, let $x_1, x_2, \ldots, x_n$ be independent variables and $y_i$ be dependent variables such that

$$y_1 = a_0 + a_1 x_{11} + a_2 x_{12} + \cdots + a_n x_{1n} \tag{17.87}$$

$$y_2 = a_0 + a_1 x_{21} + a_2 x_{22} + \cdots + a_n x_{2n}. \tag{17.88}$$

For fitting in a format of the type

$$Y = a_0 + a_1 x_1 + a_2 x_2 + \cdots + a_n x_n \tag{17.89}$$

the method of least squares says that the best representative curve is the one for which the sum of the squares of the residuals is minimum, i.e.,

$$f = \sum_{i=1}^{N} \left( y_i - (a_0 + a_1 x_{i1} + \cdots + a_n x_{in}) \right)^{2} \tag{17.90}$$

is minimum, where N is the number of observations.

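The conditions for $f(a_0, a_1, \ldots, a_n)$ to be minimum are

$$\frac{\partial f}{\partial a_0} = -2 \sum_{i} (y_i - a_0 - a_1 x_{i1} - \cdots - a_n x_{in}) = 0, \tag{17.91}$$

$$\frac{\partial f}{\partial a_1} = -2 \sum_{i} (y_i - a_0 - a_1 x_{i1} - \cdots - a_n x_{in})\, x_{i1} = 0, \tag{17.92}$$

$$\frac{\partial f}{\partial a_2} = -2 \sum_{i} (y_i - a_0 - a_1 x_{i1} - \cdots - a_n x_{in})\, x_{i2} = 0, \tag{17.93}$$

$$\cdots$$

$$\frac{\partial f}{\partial a_n} = -2 \sum_{i} (y_i - a_0 - a_1 x_{i1} - \cdots - a_n x_{in})\, x_{in} = 0. \tag{17.94}$$

There are $n + 1$ linear equations for the $(n + 1)$ unknowns $a_0, a_1, \ldots, a_n$.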

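These sets of equations, (17.91) to (17.94), can be written in matrix form as

$$\begin{bmatrix} N & \sum x_{i1} & \cdots & \sum x_{in} \\ \sum x_{i1} & \sum x_{i1}^{2} & \cdots & \sum x_{i1} x_{in} \\ \vdots & \vdots & & \vdots \\ \sum x_{in} & \sum x_{in} x_{i1} & \cdots & \sum x_{in}^{2} \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ \vdots \\ a_n \end{bmatrix} = \begin{bmatrix} \sum y_i \\ \sum x_{i1} y_i \\ \vdots \\ \sum x_{in} y_i \end{bmatrix} \tag{17.95}$$

where every sum runs over the N observations $i = 1, \ldots, N$. Equation (17.95) can be written in the form

$$\begin{bmatrix} 1 & 1 & \cdots & 1 \\ x_{11} & x_{21} & \cdots & x_{N1} \\ \vdots & \vdots & & \vdots \\ x_{1n} & x_{2n} & \cdots & x_{Nn} \end{bmatrix} \begin{bmatrix} 1 & x_{11} & \cdots & x_{1n} \\ 1 & x_{21} & \cdots & x_{2n} \\ \vdots & \vdots & & \vdots \\ 1 & x_{N1} & \cdots & x_{Nn} \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ \vdots \\ a_n \end{bmatrix} = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ x_{11} & x_{21} & \cdots & x_{N1} \\ \vdots & \vdots & & \vdots \\ x_{1n} & x_{2n} & \cdots & x_{Nn} \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{bmatrix} \tag{17.96}$$

Equation (17.96) can be written in the matrix form

$$X^{T} X A = X^{T} Y \tag{17.97}$$

$$\Rightarrow A = (X^{T} X)^{-1} X^{T} Y. \tag{17.98}$$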

This is the mathematical expression for the least squares estimator.


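Alternatively, we can find the vector $A$ which minimizes the sum of the squared residuals $\varepsilon^{T}\varepsilon$. We write

$$f = \varepsilon^{T}\varepsilon = (XA - Y)^{T}(XA - Y) = \Delta G^{T} \Delta G \quad \text{where } \Delta G = XA - Y \tag{17.99}$$

$$= A^{T}X^{T}XA - A^{T}X^{T}Y - Y^{T}XA + Y^{T}Y. \tag{17.100}$$

Differentiating the expression with respect to $A^{T}$ and setting the result equal to zero, we get

$$(X^{T}X)\,A = X^{T}Y \tag{17.101}$$

$$\Rightarrow A = (X^{T}X)^{-1}X^{T}Y. \tag{17.102}$$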
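The least squares estimator can be sketched on synthetic data. The sample size, the coefficients `a_true`, and the noise level are assumptions for the example; the normal equations are solved with `numpy.linalg.solve` rather than by forming the inverse explicitly, and `numpy.linalg.lstsq` is used as an independent check.

```python
import numpy as np

rng = np.random.default_rng(4)
N, n = 50, 2                                  # assumed: 50 observations, 2 variables
x = rng.uniform(-1.0, 1.0, size=(N, n))
a_true = np.array([0.5, 2.0, -1.0])           # a_0, a_1, a_2
X = np.column_stack([np.ones(N), x])          # rows [1, x_i1, ..., x_in]
Y = X @ a_true + 0.01 * rng.normal(size=N)    # synthetic noisy observations

# Normal equations: solve (X^T X) A = X^T Y for the coefficient vector A.
A_normal = np.linalg.solve(X.T @ X, X.T @ Y)

# Independent check with NumPy's least squares solver.
A_lstsq, *_ = np.linalg.lstsq(X, Y, rcond=None)

residual = X @ A_normal - Y                   # Delta G = X A - Y
```

At the minimum the residual is orthogonal to every column of X, which is exactly the set of conditions (17.91) to (17.94).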

17.9 Ridge Regression Estimator

The expression for the least squares estimator can be obtained from the sum of the squared residuals. $(X^{T}X)^{-1}X^{T}$ is also a generalized inverse of a rectangular matrix. Hoerl and Kennard (1970a, 1970b) show that the rectangular matrix $X$ quite often becomes nearly singular because of the presence of zero and very small eigenvalues in $X^{T}X$. Marquardt (1963, 1970) and Hoerl and Kennard have independently shown that considerable stability in the solution can be obtained by adding a numerical coefficient to the diagonal elements of the $(X^{T}X)$ matrix. This coefficient is known as Marquardt's coefficient or the Marquardt-Levenberg coefficient. In effect, Marquardt's coefficient is added to all the eigenvalues. It considerably reduces the instability due to the presence of zero and very small eigenvalues. So the least squares estimator

$$\Delta P = (X^{T}X)^{-1}X^{T}\,\Delta G \tag{17.103}$$

changes to the form

$$\Delta P^{*} = (X^{T}X + KI)^{-1}X^{T}\,\Delta G \tag{17.104}$$

where $K$ is Marquardt's coefficient, $I$ is the identity matrix, $\Delta P$ is the model modification vector, $X$ is the sensitivity matrix, and $\Delta P^{*}$ is the model modification vector with a reduced rate. Equation (17.104) is known as the ridge regression estimator or damped least squares estimator. It is called damped least squares because the amplitude of the model modification goes down, i.e., $\Delta P^{*} < \Delta P$ (Fig. 17.9). The ridge regression estimator is much more stable than the least squares estimator. It has the qualities of both the Newton-Raphson method and the gradient method. The Newton-Raphson method converges very fast if the starting value is close to the actual answer, but the system diverges when the initial guess is far from the real answer. In the gradient method, however, convergence is possible even if the initial guess is considerably away from the actual answer, but convergence is very slow near the actual answer.

Fig. 17.9. Movements in the least squares and damped least squares iterative process

Ridge regression has the qualities of both approaches, i.e., it converges very fast near the actual answer and its radius of convergence is reasonably high. This means that even if the initial guess is poor, i.e., the distance between $m^{\text{Prior}}$ and $m^{\text{true}}$ is large, ridge regression can drag the model towards the actual answer. The larger the number of parameters, the smaller the radius of convergence. Data inadequacy and data inaccuracy have a direct relation with the radius of convergence. The choice of the value of $K$ depends upon the interpreter. The starting value of $K$ can be any of 10.0, 1.00, 0.01 or 0.001, as suggested by Marquardt (1963). But as the iterative solution converges, the value of $K$ must be successively lowered until it becomes negligible. Many interpreters use variance-covariance values instead of a pure number as Marquardt's coefficient (Tarantola 1987, Menke 1984).
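The damping effect of Marquardt's coefficient can be sketched numerically. The nearly rank-deficient matrix `X`, the misfit vector `dG`, and the trial values of K are assumptions for the example. The sketch also shows the equivalent view through the singular value decomposition, in which each factor 1/λ is replaced by λ/(λ² + K).

```python
import numpy as np

def ridge_update(X, dG, K):
    """Damped least squares step dP* = (X^T X + K I)^-1 X^T dG, as in (17.104)."""
    return np.linalg.solve(X.T @ X + K * np.eye(X.shape[1]), X.T @ dG)

rng = np.random.default_rng(5)
X = rng.normal(size=(8, 3))                      # assumed sensitivity matrix
X[:, 2] = X[:, 0] + 1e-4 * rng.normal(size=8)    # nearly dependent columns
dG = rng.normal(size=8)                          # assumed data misfit vector

# Same step through the SVD: each 1/lam becomes lam / (lam^2 + K).
u, lam, nu_T = np.linalg.svd(X, full_matrices=False)
K = 0.01
dP_svd = nu_T.T @ (lam / (lam**2 + K) * (u.T @ dG))

# The step length shrinks as K grows; K = 0 is the undamped least squares step.
K_values = (0.0, 0.001, 0.01, 1.0, 10.0)
step_norms = [np.linalg.norm(ridge_update(X, dG, K_try)) for K_try in K_values]
for K_try, norm in zip(K_values, step_norms):
    print(f"K = {K_try:6.3f}   |dP*| = {norm:.4g}")
```

A large K gives short, gradient-like steps, while K close to zero recovers the undamped least squares step, which is why K is lowered as the iteration approaches the answer.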

17.10 Weighted Ridge Regression

In most scientific work, some of the experimental data in any experiment are less reliable than others. This is quite common in geophysical field data analysis. It means that the data variances are not all equal. In other words, the matrix Var($\varepsilon$) (the variance of $\varepsilon$) is not of the form $I\sigma^{2}$, where $I$ is the identity matrix and $\sigma^{2}$ is the variance (square of the standard deviation) of the data. Instead, Var($\varepsilon$) is a diagonally dominant matrix with unequal diagonal elements. In some problems the off-diagonal elements of Var($\varepsilon$) are not zero, i.e., the observations are correlated. When either or both of these occur, the general least squares estimator (17.104) is not valid and it is necessary to change the procedure for obtaining the estimator. Draper and Smith (1968) suggested that one has to transform the observation

$$Y = X\beta + \varepsilon \tag{17.105}$$

to another variable $Z$ in a different plane which does satisfy the basic conditions of linear regression, and one can write


$$Z = Q\beta + f \tag{17.106}$$

where $E(f) = 0$ and $\operatorname{Var}(f) = I\sigma^{2}$. The variables $Y$ and $X$ in the original plane change to another set of variables $Z$ and $Q$ such that $E(\varepsilon) = 0$ and $\operatorname{Var}(\varepsilon) = W\sigma^{2}$ change to $E(f) = 0$ and $\operatorname{Var}(f) = I\sigma^{2}$. Here $W$ is the weight attached to each datum. For transformation from one plane to the other, the following procedure is adopted. It is possible to find a unique nonsingular symmetric matrix $P$ such that

$$P^{T}P = PP = P^{2} = W. \tag{17.107}$$

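For transformation from one plane to the other we premultiply both sides of the regression equation by $P^{-1}$. Let $f = P^{-1}\varepsilon$ such that $E(f) = 0$; then $E(ff^{T}) = \operatorname{var}(f)$, where the mathematical expectations are taken separately for every term in the square $n \times n$ matrix $ff^{T}$. We get

$$\operatorname{var}(f) = E(ff^{T}) = E(P^{-1}\varepsilon\varepsilon^{T}P^{-1}). \tag{17.108}$$

Since $(P^{-1})^{T} = P^{-1}$, we can then write

$$\operatorname{var}(f) = P^{-1}E(\varepsilon\varepsilon^{T})P^{-1} = P^{-1}\operatorname{var}(\varepsilon)P^{-1} = P^{-1}W\sigma^{2}P^{-1} = P^{-1}PPP^{-1}\sigma^{2} \tag{17.109}$$

$$= I\sigma^{2}. \tag{17.110}$$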

Thus if we premultiply (17.105) by $P^{-1}$, we obtain a new model in a new plane, i.e.,

$$P^{-1}Y = P^{-1}X\beta + P^{-1}\varepsilon. \tag{17.111}$$

Equation (17.111) is written as

$$Z = Q\beta + f. \tag{17.112}$$
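The transformation to the new plane can be sketched numerically. The straight-line design matrix, the diagonal weight matrix `W` (second half of the data 25 times larger in variance), and the noise level are assumptions for the example. The sketch builds the symmetric square root P of W, forms Z and Q, and compares ordinary least squares on the transformed variables with the closed form $(X^{T}W^{-1}X)^{-1}X^{T}W^{-1}Y$ obtained by substituting $Q = P^{-1}X$ and $Z = P^{-1}Y$ into the normal equations.

```python
import numpy as np

rng = np.random.default_rng(6)
N = 40
X = np.column_stack([np.ones(N), np.linspace(0.0, 1.0, N)])
beta_true = np.array([1.0, 3.0])

# Assumed diagonal W: the second half of the data has 25 times the variance.
w = np.where(np.arange(N) < N // 2, 1.0, 25.0)
W = np.diag(w)
sigma = 0.05
Y = X @ beta_true + sigma * np.sqrt(w) * rng.normal(size=N)

# Symmetric square root P with P P = W, via the eigen-decomposition of W.
vals, vecs = np.linalg.eigh(W)
P = vecs @ np.diag(np.sqrt(vals)) @ vecs.T
P_inv = np.linalg.inv(P)

Z, Q = P_inv @ Y, P_inv @ X                          # transformed variables, as in (17.111)
b_whitened, *_ = np.linalg.lstsq(Q, Z, rcond=None)   # ordinary least squares on (Q, Z)

# Closed form in the original variables.
W_inv = np.linalg.inv(W)
b_closed = np.linalg.solve(X.T @ W_inv @ X, X.T @ W_inv @ Y)
```

Both routes give the same estimate; the transformation simply rescales each observation so that the noisier data carry less weight in the fit.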

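It is now clear that if we apply the basic least squares theory to (17.112), since $E(f) = 0$ and $\operatorname{var}(f) = I\sigma^{2}$, we get the normal equation as

$$Q^{T}Q\,b = Q^{T}Z \tag{17.113}$$

$$\Rightarrow X^{T}W^{-1}X\,b = X^{T}W^{-1}Y \tag{17.114}$$

$$\Rightarrow b = (X^{T}W^{-1}X)^{-1}X^{T}W^{-1}Y. \tag{17.115}$$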

This is the basic formulation of the weighted least squares inverse. The variance-covariance matrix is

$$\operatorname{var}(b) = (Q^{T}Q)^{-1}\sigma^{2} = (X^{T}W^{-1}X)^{-1}\sigma^{2} \tag{17.116}$$

and the sum of the squared residual is