As the diagnostic takes on higher values, the scatter of points migrates upward.
Those scoring in the middle 40s on the diagnostic, for example, have the highest
exam performance, on average. This upward trend is nicely represented by the
fitted line drawn through the center of the points. For now, this line can be
interpreted simply as the best-fitting straight line through the scatter of points.
(The meaning of best fitting is
explained in more detail below.)
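To make the idea of a fitted line concrete, here is a minimal sketch. It assumes that "best fitting" means the ordinary least squares criterion, which is the usual one in simple linear regression but is not yet defined at this point in the text. The data pairs and the helper name fit_line are hypothetical and invented for illustration only; they are not the diagnostic and exam scores from the dataset.

```python
# Hypothetical (diagnostic score, exam score) pairs, invented for illustration.
# They are NOT taken from the dataset discussed in the text.
pairs = [(30, 55), (35, 62), (38, 60), (41, 70), (44, 74), (46, 79)]


def fit_line(points):
    """Return (intercept, slope) of the least squares line through points.

    Assumption: "best fitting" is taken to mean ordinary least squares.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Sum of cross-products and sum of squares of X about its mean.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    # The least squares line always passes through (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_line(pairs)
print(f"fitted line: Y = {intercept:.2f} + {slope:.2f} X")
```

A positive slope here corresponds to the upward migration of the points described above.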
As another example, consider the couples dataset. Partners’ attitudes about gender
roles can sometimes be a source of friction for a couple, particularly when it
comes to sharing housework or other necessary tasks. Family sociologists have there-
fore studied the factors that affect whether partners’ attitudes are more traditional as
opposed to more modern. One such factor, of course, is education. One would expect
that couples with more years of schooling would tend to have more modern, or egal-
itarian, attitudes toward gender roles compared to others. A series of items asked in
the NSFH allows us to tap gender-role attitudes. Four were included in the couples
dataset and were asked of each partner of the couple. An example item asked for the
extent of agreement with the statement: “It’s much better for everyone if the man
earns the main living and the woman takes care of the home and family.” Response
choices ranged from “strongly agree,” coded 1, to “strongly disagree,” coded 5. The
other items were all similarly coded so that the high value represented the most modern,
or egalitarian, response. To create a couple modernism score, I summed the eight responses
(four items for each of the two partners). The resulting scale ranged from 8 to 40, with the
highest score representing couples with the most egalitarian attitudes. Figure 2.2 shows a
scatterplot of couple gender-role modernism (Y) plotted against male partner’s years of
schooling completed (X) for the 416 couples in the dataset. The linear trend is again evident
in the scatter of points, as highlighted by the fitted line.
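The construction of the couple modernism score can be sketched as follows. This is only an illustration of the summing rule described above: the function name couple_modernism and the example responses are hypothetical, and the sketch assumes each response has already been coded so that 5 is the most egalitarian answer.

```python
ITEMS_PER_PARTNER = 4        # four gender-role items asked of each partner
MIN_CODE, MAX_CODE = 1, 5    # response codes after coding high = egalitarian


def couple_modernism(partner_a, partner_b):
    """Sum the eight item responses (four per partner) into one couple score.

    Hypothetical helper: assumes responses are already coded 1-5 with
    5 representing the most modern, or egalitarian, answer.
    """
    for responses in (partner_a, partner_b):
        if len(responses) != ITEMS_PER_PARTNER:
            raise ValueError("each partner must answer exactly four items")
        if any(r < MIN_CODE or r > MAX_CODE for r in responses):
            raise ValueError("responses must be coded from 1 to 5")
    return sum(partner_a) + sum(partner_b)


# Invented responses for one couple, for illustration only.
score = couple_modernism([4, 5, 3, 4], [2, 3, 3, 4])
print(score)

# The extremes reproduce the stated range of the scale.
lowest = couple_modernism([1] * 4, [1] * 4)
highest = couple_modernism([5] * 4, [5] * 4)
print(lowest, highest)
```

With eight items each coded 1 to 5, the smallest possible sum is 8 and the largest is 40, which matches the range reported for the scale.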
As a third example, I draw on the GSS98 dataset. Among the questions asked of
1515 adult respondents in this survey was one about sexual activity. In particular,
the question was: “About how often did you have sex during the last 12 months?”
Response choices were coded 0 for “not at all,” 1 for “once or twice,” 2 for “about once
a month,” 3 for “2 or 3 times a month,” 4 for “about once a week,” 5 for “2 or 3 times
a week,” and 6 for “more than 3 times a week.” Although this variable is not truly con-
tinuous, it is ordinal, and has enough levels—five or more is enough—to be treated as
“approximately” continuous. This approximation is especially tenable if n is large, as
it is for these data, and if the distribution of the variable is not too skewed. Regarding
the latter condition, the percentage of people giving each response (0, 1, 2, 3, 4, 5, 6)
is 22.2, 7.5, 11.3, 15.4, 18.3, 19.7, and 5.4, respectively. This represents an acceptable
level of skew.
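One way to put a number on the degree of skew is sketched below, using the response percentages just reported. The text does not say which measure of skew is intended; the moment coefficient of skewness used here is an assumption made for illustration. Because the reported percentages sum to 99.8 rather than 100 (rounding), the sketch rescales them to proportions first.

```python
codes = [0, 1, 2, 3, 4, 5, 6]
percents = [22.2, 7.5, 11.3, 15.4, 18.3, 19.7, 5.4]  # as reported in the text

# Rescale to proportions, since the rounded percentages do not sum to 100.
total = sum(percents)
props = [p / total for p in percents]

# Mean and standard deviation of the coded responses.
mean = sum(k * p for k, p in zip(codes, props))
variance = sum(p * (k - mean) ** 2 for k, p in zip(codes, props))
sd = variance ** 0.5

# Moment coefficient of skewness: third central moment divided by sd cubed.
# Assumption: this is one common skew measure, not necessarily the author's.
third_moment = sum(p * (k - mean) ** 3 for k, p in zip(codes, props))
skewness = third_moment / sd ** 3

print(f"mean = {mean:.2f}, sd = {sd:.2f}, skewness = {skewness:.2f}")
```

For these percentages the coefficient is slightly negative and small in magnitude, which is consistent with describing the skew as acceptable.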
What predicts sexual activity? Several obvious determinants come to mind, such
as age and health. But what about frequenting bars? There are several reasons why
those who frequent bars might be more sexually active than others. Some reasons
have to do with the selectivity of bar clientele and do not implicate bars as a cause of
sexual activity per se. For example, sexually active couples often go to bars or nightclubs
before engaging in intimate activity. Also, those who go to bars more often
are most likely younger and possibly looking for sexual partners to begin with.