Chapter 15
Relating a Measurement Variable to Another Measurement Variable
Looking at the Broad Picture
Linear Relationships
The Best-Fit Straight Line
Prediction
How Good Is the Best Fit?
Significance and Confidence
Analysis of Residuals
Assumptions and Robust Methods
Practice
In Chapters 12 and 13 we investigated the relationship between a measurement
variable and a categorical variable. We took two approaches to this task. The first
was to estimate population means for the measurement variable in each of the
categories of the categorical variable and attach error ranges to those estimates.
The second approach was to use either a two-sample t test (if only two categories
were involved) or an analysis of variance (if more than two categories were
involved). In Chapter 14 we investigated the relationship between two categorical
variables. Once again we took two approaches. The first was to estimate population
proportions for one of the variables in each of the categories of the other variable
and attach error ranges to those estimates. The second approach was to use a
chi-square test to evaluate significance and Cramer’s V to evaluate strength of
association. There remains only to investigate the relationship between two
measurement variables to complete all the possible combinations, and that is the
subject of this chapter. We will see that one approach here is so powerful that we
will not really consider alternative approaches.
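The pairing of variable types with significance approaches recapped above can be sketched as a small lookup. This is only an illustrative summary of the text, not a procedure from the book: the function name `suggested_approach`, its parameters, and the labels "measurement" and "categorical" are hypothetical names chosen for this sketch.

```python
def suggested_approach(var_a: str, var_b: str, n_categories: int = 2) -> str:
    """Name the approach the text associates with a pair of variable types.

    var_a, var_b: "measurement" or "categorical" (hypothetical labels).
    n_categories: number of categories in the categorical variable; it is
    consulted only when one variable is a measurement and the other is
    categorical.
    """
    kinds = sorted([var_a, var_b])
    if kinds == ["categorical", "measurement"]:
        if n_categories < 2:
            raise ValueError("a categorical variable needs at least two categories")
        # Two categories: two-sample t test; more than two: analysis of variance.
        return "two-sample t test" if n_categories == 2 else "analysis of variance"
    if kinds == ["categorical", "categorical"]:
        # Chi-square for significance, Cramer's V for strength of association.
        return "chi-square test and Cramer's V"
    if kinds == ["measurement", "measurement"]:
        return "the subject of this chapter"
    raise ValueError(f"unknown variable types: {var_a!r}, {var_b!r}")


if __name__ == "__main__":
    print(suggested_approach("measurement", "categorical", n_categories=3))
    print(suggested_approach("categorical", "categorical"))
    print(suggested_approach("measurement", "measurement"))
```

The order of the two arguments does not matter, since the labels are sorted before they are compared.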
Table 15.1 provides an example set of data consisting of observations on 14
known sites of the Oasis phase in the Río Seco valley. At each of the sites a
systematic program of surface collection was undertaken to produce a sample of exactly
100 artifacts. After careful consideration of sources of bias, we decide that we are
willing to work with this sample of sites as if it were a random sample. Similarly,
having considered sources of bias for the artifact collections, we decide we are
willing to treat each as if it were a random sample of artifacts on the surface. Since each
collection consists of 100 artifacts, the number of hoes in each is the percentage of
hoes in the collection, and simultaneously our best approximation of the percentage
R.D. Drennan, Statistics for Archaeologists, Interdisciplinary Contributions
to Archaeology, DOI 10.1007/978-1-4419-0413-3_15,
© Springer Science+Business Media, LLC 2004, 2009