Applying Ridge Regression to Admissions Data by Race and Sex
Russell D. Miars
Given these times of diminishing enrollments, it is increasingly important for colleges and universities to select students who have the greatest likelihood of succeeding. The standard method of predicting success at any institution is a regression equation based on past academic achievement, typically high school grades, and present academic ability, usually measured by a standardized test such as the SAT or the ACT. The problem with this procedure is that because these measures are so highly intercorrelated (high multicollinearity), the resulting least squares regression equations suffer from a relatively high degree of error variance (Darlington, 1978). Cross-validations of these equations typically yield low levels of prediction.
A recent alternative to least squares
regression is ridge regression (Darlington, 1978; Dempster, Schatzoff &
Wermuth, 1977; Hoerl & Kennard, 1970; Price, 1977). Ridge regression was
developed expressly for the purpose of circumventing the weakness of least
squares regression with regard to highly overlapping predictors. The typical
measures used in collegiate admissions are highly interrelated, and as such,
applying ridge regression would appear to be very appropriate.
Ridge regression is similar to least
squares regression except that a small constant value is added to the main
diagonal of the variance-covariance matrix prior to the determination of the
regression equation. Once this constant has been added to the
variance-covariance matrix, the regression equations are determined in exactly
the same way in the two procedures. In effect, this alteration produces a new
set of data with a lower degree of multicollinearity, and a better fit of the
regression equation to the actual data is achieved as the mean square error is
reduced.
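The procedure described above can be illustrated with a minimal sketch. It is not the computation used in this study: it assumes standardized variables, so that the constant is added to the diagonal of the predictor correlation matrix, and it uses simulated data. The function name `ridge_coefficients` and all data values are hypothetical.

```python
import numpy as np

def ridge_coefficients(X, y, delta):
    """Standardized regression weights with delta added to the diagonal.

    delta = 0 reproduces least squares on the standardized variables.
    """
    # Standardize predictors and criterion so that the cross-product
    # matrix divided by n is the predictor correlation matrix.
    Xz = (X - X.mean(axis=0)) / X.std(axis=0)
    yz = (y - y.mean()) / y.std()
    n, p = Xz.shape
    R = Xz.T @ Xz / n   # predictor intercorrelations
    r = Xz.T @ yz / n   # predictor-criterion correlations
    # The ridge step: add the constant to the main diagonal, then solve
    # the normal equations exactly as in least squares.
    return np.linalg.solve(R + delta * np.eye(p), r)

# Simulated (not real) data: three highly intercorrelated predictors.
rng = np.random.default_rng(0)
common = rng.normal(size=(25, 1))
X = common + 0.3 * rng.normal(size=(25, 3))
y = X.sum(axis=1) + rng.normal(size=25)

b_ols = ridge_coefficients(X, y, delta=0.0)     # least squares weights
b_ridge = ridge_coefficients(X, y, delta=0.15)  # ridge weights
```

With a positive delta the ridge weights are pulled toward zero relative to the least squares weights, which is the source of both the bias and the reduced variance discussed in this paper.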
Obviously, a key aspect of ridge
regression is determining the value of the constant added to the main
diagonal of the variance-covariance matrix that maximizes prediction.
The typical way of determining this optimal constant value (delta) is by using
an iterative procedure: adding in a series of possible delta values and seeing
the effect on the regression equation. The best delta value is the one
associated with the lowest mean square error for the equation. Using
successively higher delta levels typically results in decreasing mean square
errors to a certain point where higher delta levels increase mean square
errors. The specific delta value that yielded the lowest mean square error is
the delta value that is to be used. The regression equation associated with
this value yields the maximal predictive power.
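The iterative search for delta might be sketched as follows. The text does not specify how the mean square error is estimated, so this sketch assumes it is computed on a separate checking sample; the candidate delta values, the function names `fit_ridge` and `best_delta`, and the simulated data are all hypothetical.

```python
import numpy as np

def fit_ridge(X, y, delta):
    """Return (intercept, raw-score weights) with delta added to the
    diagonal of the predictor correlation matrix; delta = 0 gives
    least squares."""
    mx, sx = X.mean(axis=0), X.std(axis=0)
    my, sy = y.mean(), y.std()
    Xz, yz = (X - mx) / sx, (y - my) / sy
    n, p = Xz.shape
    beta = np.linalg.solve(Xz.T @ Xz / n + delta * np.eye(p), Xz.T @ yz / n)
    w = beta * sy / sx  # convert standardized weights to raw-score weights
    return my - mx @ w, w

def best_delta(X_fit, y_fit, X_check, y_check, deltas):
    """Try each candidate delta and keep the one with the lowest mean
    square error on the checking sample (an assumed criterion)."""
    errors = []
    for d in deltas:
        b0, w = fit_ridge(X_fit, y_fit, d)
        errors.append(np.mean((y_check - (b0 + X_check @ w)) ** 2))
    return deltas[int(np.argmin(errors))], errors

# Simulated (not real) data with highly intercorrelated predictors.
rng = np.random.default_rng(1)

def simulate(n):
    common = rng.normal(size=(n, 1))
    X = common + 0.3 * rng.normal(size=(n, 3))
    return X, X.sum(axis=1) + 2.0 * rng.normal(size=n)

X_fit, y_fit = simulate(25)
X_check, y_check = simulate(100)
deltas = [0.0, 0.05, 0.10, 0.15, 0.25, 0.50, 0.75]
delta, errors = best_delta(X_fit, y_fit, X_check, y_check, deltas)
```

A checking sample is used here because the error of the fitted equation in its own sample can only grow as delta increases, so some estimate of error outside the fitting sample is needed for the search to have an interior minimum.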
It is important to realize that the
resulting ridge regression equation is a biased estimate and not reflective of
population parameters. As such, ridge regression is of little use in
theoretical modeling (Darlington, 1978). The main advantage of ridge regression
is in prediction, and as this is the specific purpose of using regression
equations in selection, ridge regression would appear to be a particularly
useful tool in admissions.
The purpose of this study was to examine
the improvement in prediction of the ridge regression procedure over the least
squares regression procedure when applied to admissions data. Improvement of
prediction was defined in terms of the shrinkage of the multiple correlation
coefficients obtained when the regression equations were cross-validated on
another sample. Further, as it has been demonstrated that separate regression
equations for each race and sex are desirable in selecting students (Farver,
Sedlacek & Brooks, 1975), the above research question was examined
separately for each race/sex group. It was expected that the ridge regression
procedure would result in less coefficient shrinkage than the least squares
procedure when cross-validated.
The sample and data used in this study were
the same as those used by Farver, Sedlacek and Brooks (1975). All black freshman
students entering a large state university in the fall of 1968 (N = 126) and the
fall of 1969 (N = 133) who had complete data (high school GPA and SAT scores)
and who completed the freshman year were included in this study. Samples of
white students were randomly drawn as a comparison group.
It was decided to examine the
predictability of freshman grades from high school GPA, SAT Math and SAT Verbal
scores separately for each of the race/sex groups (black males, black females,
white males and white females). As ridge regression is purported to be most
useful given small sample sizes, random subsamples of 25 were drawn from each
of the above four groups for each of the sample years (1968 and 1969). Thus a
total of eight subsamples of 25 were drawn, and least squares and ridge
regression equations were computed for each. The resulting regression equations
were then cross-validated on the full, corresponding sex/race sample from the
other sample year to get a predicted freshman cumulative GPA for each
individual. These predicted cumulative GPAs were then correlated with the
actual freshman year cumulative GPA. The difference between the correlation of
the original equation and the correlation of the cross-validation was the
shrinkage examined.
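The cross-validation procedure described above can be sketched in a few lines. This is an illustration only: the data are simulated rather than the admissions data, the delta value is arbitrary, and the function name `shrinkage` and the labels "year A" and "year B" are hypothetical.

```python
import numpy as np

def shrinkage(X_fit, y_fit, X_other, y_other, delta=0.0):
    """Fit on one sample, predict in another, and return
    (R in the fitting sample, cross-validated r, shrinkage)."""
    mx, sx = X_fit.mean(axis=0), X_fit.std(axis=0)
    my, sy = y_fit.mean(), y_fit.std()
    Xz, yz = (X_fit - mx) / sx, (y_fit - my) / sy
    n, p = Xz.shape
    # delta = 0 gives least squares; delta > 0 gives ridge regression.
    beta = np.linalg.solve(Xz.T @ Xz / n + delta * np.eye(p), Xz.T @ yz / n)
    w = beta * sy / sx
    b0 = my - mx @ w
    # Correlate predicted with actual criterion scores in each sample.
    r_fit = np.corrcoef(b0 + X_fit @ w, y_fit)[0, 1]
    r_cross = np.corrcoef(b0 + X_other @ w, y_other)[0, 1]
    return r_fit, r_cross, r_fit - r_cross

# Simulated (not real) samples standing in for two entering classes.
rng = np.random.default_rng(2)

def simulate_year(n):
    common = rng.normal(size=(n, 1))
    X = common + 0.3 * rng.normal(size=(n, 3))
    return X, X.sum(axis=1) + 2.0 * rng.normal(size=n)

X_a, y_a = simulate_year(60)  # "year A": source of the subsample of 25
X_b, y_b = simulate_year(60)  # "year B": full cross-validation sample
sub = rng.choice(60, size=25, replace=False)

ols = shrinkage(X_a[sub], y_a[sub], X_b, y_b, delta=0.0)
ridge = shrinkage(X_a[sub], y_a[sub], X_b, y_b, delta=0.15)
```

Each returned tuple holds the correlation in the fitting subsample, the cross-validated correlation, and their difference, so the third elements of `ols` and `ridge` correspond to the two shrinkage values being compared.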
It was hypothesized that the
cross-validation shrinkage for the ridge regression would be less than the
shrinkage for the least squares regression.
The summary of the multiple correlations
based on the subsample (N = 25) regressions (least squares and ridge) and the
cross-validated correlations are presented in Table I. As can be seen from
Table I, ridge regression was as good as, if not better than, least squares regression
in reducing shrinkage. The shrinkage associated with the least squares
equations was virtually identical to the shrinkage associated with the
ridge equations in six of the eight validations. The two exceptions to this
were the use of the 1968 white male equations on the 1969 white males and the
1969 black male equations on the 1968 black males. In each of these cases there
was less shrinkage associated with the ridge regression equations.
The results of this study were not as strong as hypothesized. There was relatively little difference in the prediction of college grades between the least squares and ridge regression techniques. Ridge regression yielded similar or slightly better results compared to the least squares regression. Given the greater effort required to apply ridge regression, it may be of limited value in helping to select students if it is applied only to the predictors used in this study: high school GPA and SAT scores.
A possible reason for the lack of results
in this study could have been the relatively small size of the ratio of the
number of predictors (p) to the sample (n) used in obtaining the regression
equations. In this study, the p/n ratio was 3/25. It has been found that when
the p/n ratio is small, there is essentially no difference between the least
squares and ridge regression equations. But where this p/n ratio is large,
i.e., many predictors with a small sample, ridge regression has been
demonstrated to be more accurate than least squares regression (Darlington,
1978; Dempster, Schatzoff & Wermuth, 1977; Faden, 1978). So ridge
regression might be a valuable tool if more predictors of collegiate success
were included in the regression equations. Some particularly valuable predictors
to include in predicting collegiate success are non-cognitive predictors, such
as those suggested by Sedlacek and Brooks (1976). These seven non-cognitive
dimensions have some variance overlap with the above cognitive predictors, but
also contribute some unique variance with collegiate success (Tracey and
Sedlacek, 1982). If these variables were used along with HS GPA and SAT scores,
the ridge regression should yield less cross-validated shrinkage than least
squares solutions.
If schools use only the traditional
cognitive predictors of success, as used here, ridge regression is only a
minimal improvement over the usual least squares regression in terms of
predicting success. Non-cognitive measures have also been proposed as
predictors of success (Sedlacek & Brooks, 1976; Tracey & Sedlacek, 1980),
and if these measures are included as predictors, ridge regression appears to
be a viable alternative to least squares regression. At worst, ridge regression
yields similar results; at best, it is a vast improvement in predictive power
over least squares regression.
Finally, ridge regression may be valuable
to use as it typically does not require the large sample sizes that least
squares regression does. The process of collecting data for a large sample is
time consuming. Ridge regression may enable schools to gather a smaller
sample of data without sacrificing predictive power.
Darlington, R.B. “Reduced-Variance Regression,” Psychological Bulletin, 1978, 85, 1238-1255.
Dempster, A.P., Schatzoff, M., & Wermuth, N. “A Simulation Study of Alternatives to Ordinary Least Squares,” Journal of the American Statistical Association, 1977, 72, 77-91.
Faden, V.B. Shrinkage in Ridge Regression and Ordinary Least Squares Multiple Regression Estimators. Unpublished doctoral dissertation, University of Maryland, 1978.
Farver, A.S., Sedlacek, W.E., & Brooks, G.C., Jr. “Longitudinal Predictions of University Grades for Blacks and Whites,” Measurement and Evaluation in Guidance, 1975, 7, 243-250.
Hoerl, A.E., & Kennard, R.W. “Ridge Regression: Applications to Nonorthogonal Problems,” Technometrics, 1970, 12, 69-82.
Price, B. “Ridge Regression: Application to Non-Experimental Data,” Psychological Bulletin, 1977, 84, 759-766.
Sedlacek, W.E., & Brooks, G.C., Jr. Racism in American Education: A Model for Change. Chicago: Nelson-Hall, 1976.
Tracey, T.J., & Sedlacek, W.E. “Conducting Student Retention Research,” NASPA (National Association of Student Personnel Administrators) Journal Field Report, 1981, 5, 5-6.
Tracey, T.J., & Sedlacek, W.E. Noncognitive Variables in Predicting Academic Success by Race. Counseling Center Research Report # 1-82. University of Maryland, College Park, 1982.
Table I. Multiple Correlation Coefficients Obtained Using Least Squares and Ridge Regression by Race and Sex Equations

| Equation | Least Squares R | Ridge R | Delta | Least Squares Cross-Validated R | Ridge Cross-Validated R | Cross-Validation N |
|---|---|---|---|---|---|---|
| 1968 Black Males | .74 | .75 | .15 | .42 | .41 | 58 |
| 1968 Black Females | .50 | .50 | .75 | .55 | .56 | 75 |
| 1968 White Males | .71 | .70 | .10 | .45 | .51 | 70 |
| 1968 White Females | .84 | .84 | .19 | .49 | .49 | 52 |
| 1969 Black Males | .56 | .56 | .25 | .48 | .53 | 64 |
| 1969 White Males | .59 | .59 | .50 | .62 | .61 | 78 |
| 1969 White Females | .68 | .67 | .30 | .68 | .68 | 66 |