Catchment Basin Analysis of Stream Sediment Anomalies 117
ESTIMATION OF LOCAL UNI-ELEMENT BACKGROUND DUE TO LITHOLOGY
Two techniques are explained here: (1) multiple regression analysis; and (2) analysis
of weighted mean uni-element concentrations. The former is demonstrated by Bonham-
Carter and Goodfellow (1984, 1986), Bonham-Carter et al. (1987) and Carranza and
Hale (1997), whilst the latter is demonstrated by Bonham-Carter et al. (1987).
Multiple regression analysis
In multiple regression analysis, measured stream sediment uni-element
concentrations ($Y_i$) and areal proportions ($X_{ij}$) of $j$ (=1,2,…,$m$) lithologic units in
sample catchment basin $i$ (=1,2,…,$n$) are used, respectively, as dependent and independent
variables in order to estimate for every sample catchment basin local background uni-
element concentrations ($Y'_i$) due to lithology in sample catchment basin $i$, thus:

$$ Y'_i = b_o + \sum_{j=1}^{m} b_j X_{ij} \tag{5.4} $$

where $\sum_{j=1}^{m} X_{ij} = 1$, and $b_o$ and $b_j$ are the regression coefficients determined by the
least-squares method to minimise the quantity $\sum_{i=1}^{n} (Y_i - Y'_i)^2$. The multiple regression equation
implies that estimates of background uni-element concentrations are a result of additive
mixing of weathering products of lithologic units in sample catchment basins.
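
As a minimal sketch of equation (5.4), the following Python/NumPy fragment fits the
regression by least squares and returns the estimated local background and the residual
for each sample catchment basin. The function name `local_background`, the five-basin
data set and the unit mean values are hypothetical and serve only as illustration; they
are not taken from the cited studies.

```python
import numpy as np

def local_background(X, y):
    """Least-squares fit of Y'_i = b_o + sum_j b_j * X_ij (hypothetical helper).

    X : (n, m) areal proportions of m lithologic units in n catchment basins.
    y : (n,) measured uni-element concentrations.
    Returns the fitted background values and the residuals (y - fitted).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # The leading column of ones carries the constant term b_o.
    design = np.column_stack([np.ones(len(y)), X])
    # lstsq minimises sum_i (Y_i - Y'_i)^2. If the design matrix is
    # rank-deficient it returns the minimum-norm coefficients; the fitted
    # values are the same for every least-squares solution.
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    fitted = design @ coef
    return fitted, y - fitted

# Illustrative data only: 5 catchment basins, 3 lithologic units.
X = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.5, 0.5, 0.0],
    [0.2, 0.3, 0.5],
])
unit_means = np.array([20.0, 50.0, 80.0])  # assumed mean content of each unit
y = X @ unit_means                         # additive mixing of the units
y[3] += 30.0                               # extra, non-lithologic contribution

fitted, residuals = local_background(X, y)
```

In this synthetic case the basin that received the extra contribution has the largest
positive residual, which is the behaviour the additive mixing interpretation leads one
to expect.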
The coefficient $b_o$ can be interpreted as regional average uni-element content,
whereas the coefficient $b_j$ can be interpreted as average uni-element content of lithologic
unit $j$ (=1,2,…,$m$) in any sample catchment basin $i$ (=1,2,…,$n$). However, by inclusion of
$b_o$, equation (5.4) is indeterminate because the regression matrix is singular, unless one
independent variable is discarded (Bonham-Carter et al., 1987). The singularity arises
because the areal proportions sum to 1 in every sample catchment basin, so the constant
term is an exact linear combination of the independent variables. This problem can be
overcome by allowing round-off errors (e.g., using two decimals) in calculating areal
proportions of lithologic units so that $\sum_{j=1}^{m} X_{ij} \cong 1.00$. The multiple regression modeling
can also be forced through the origin (i.e., setting $b_o = 0$) so that the singularity problem is
avoided and equation (5.4) is determinate (Bonham-Carter and Goodfellow, 1984, 1986).
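
The singularity and the through-origin remedy can be checked numerically. The sketch
below assumes a small hypothetical data set in which the areal proportions of three
lithologic units sum exactly to 1 in each of five catchment basins; the variable names
and values are illustrative assumptions.

```python
import numpy as np

# Illustrative areal proportions: every row sums to 1.
X = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.5, 0.5, 0.0],
    [0.2, 0.3, 0.5],
])
# Concentrations generated by pure additive mixing of assumed unit means.
y = X @ np.array([20.0, 50.0, 80.0])

# With a constant term the design matrix has 4 columns but only rank 3,
# because the column of ones equals the sum of the proportion columns.
with_intercept = np.column_stack([np.ones(len(X)), X])
rank_with = np.linalg.matrix_rank(with_intercept)
rank_without = np.linalg.matrix_rank(X)

# Regression through the origin (b_o = 0): the normal equations
# (X^T X) b = X^T y now have a unique solution.
b_origin = np.linalg.solve(X.T @ X, X.T @ y)
```

Here `rank_with` is smaller than the number of columns of `with_intercept`, whereas `X`
alone has full column rank, so the coefficients `b_origin` are uniquely determined and,
for these noise-free data, recover the assumed unit means.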
In order to determine relative contributions of the independent variables and their
ability to account for total variation in $Y_i$, the multiple regression analysis is performed
via forward and forced simultaneous inclusion of independent variables. That is to say,
the most significant independent variables are not searched for and selected for the final
regression equation according to a statistical criterion; rather, all independent variables
are included in the final regression model.
The ability of the independent variables to account for the variation of the dependent
variable can be characterised using $R^2$ (usually expressed as a percentage), the ratio of
the sum of squares explained by regression to the total sum of squares, which indicates
goodness-of-fit of the multiple regression model. Invariably, regression models have
poor fit to uni-element concentration data that are significantly positively skewed.
Logarithmic transformation of uni-element concentration data invariably results in