Aug 8, 2012

R-squared for linear multilevel models

Explained variance in multilevel modeling differs substantially from explained variance in single-level OLS regression. Whereas OLS regression comes with the long-established coefficient of determination R2, matters are more complicated in multilevel models.

Hierarchical models partition the total variance into several components, which makes matters more complex. Several pseudo-R2 coefficients have been proposed; the two most popular are probably (a) one that builds on proportional reductions in the variance components and (b) one that builds on proportional reductions in prediction error.

The straightforward approach that treats reductions in the variance components (e.g. `sigma_(u0)^2` and `sigma_(e0)^2`) as increases in explained variance has been suggested by Raudenbush and Bryk (2002) and by Singer and Willett (2003). The random intercept model without any explanatory variables (the "null model") provides two variance components, which are interpreted as unexplained variance at the individual (`sigma_(e0)^2`) and the higher-order (`sigma_(u0)^2`) level. Once explanatory variables are introduced to the model, the variance components change in value, and a simple calculation along the lines of

`R^2 = (sigma_(e0|b)^2 - sigma_(e0|m)^2) / sigma_(e0|b)^2`

Here, `sigma_(e0|b)^2` refers to the lower-level residual variance of the baseline model, and `sigma_(e0|m)^2` denotes the same parameter in the model that includes explanatory variables; an analogous formula applies to the higher-level component `sigma_(u0)^2`. (This notation is taken from Hox 2010, p. 71.)
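As a numeric sketch of this proportional-reduction idea (the variance components below are invented for illustration, not taken from any fitted model), the calculation per component is simply:

```python
def rb_r2(var_baseline, var_model):
    """Raudenbush/Bryk pseudo-R2 for one variance component:
    proportional reduction relative to the null model."""
    return (var_baseline - var_model) / var_baseline

# invented variance components from a null model and a covariate model
r2_level1 = rb_r2(260.0, 220.0)  # sigma_e0^2: individual level
r2_level2 = rb_r2(50.0, 25.0)    # sigma_u0^2: group level
print(round(r2_level1, 4), round(r2_level2, 4))  # 0.1538 0.5
```

Note that each variance component gets its own coefficient, which is precisely the first complication discussed below.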

While this seems to be a straightforward approach, several complications and differences from OLS regression arise. Firstly, we obtain more than one R2-value; instead, we have one value per level/variance component. Secondly, adding predictor variables can lead to reductions in explained variance and even to negative R2-values; this is often observed when a null model is compared to a model that includes only group-mean-centered predictor variables. Snijders and Bosker (1994) give a detailed account of situations in which higher-level variance components go up in value when lower-level predictors are added to the equation. Thirdly, the situation gets even more complicated when random slopes are added to the model. Hox (2010, p. 73) gives an example of this case and stresses the importance of centering in multilevel modeling.

Snijders and Bosker (1994) have suggested a way to calculate a pseudo-R2 via proportionate reductions in prediction error that overcomes the second shortcoming: explained variance should not decrease when predictors are added to the model. To achieve this, they focus on the total estimated variance `sigma_(u0)^2 + sigma_(e0)^2` when calculating their pseudo-R2's. Snijders and Bosker calculate the level-1 coefficient as follows:

`R_1^2 = 1 - (sigma_(u0|m)^2 + sigma_(e0|m)^2)/(sigma_(u0|b)^2 + sigma_(e0|b)^2)`
and the level-2 coefficient in the following fashion:
`R_2^2 = 1 - (sigma_(u0|m)^2 + (sigma_(e0|m)^2)/(n_j))/(sigma_(u0|b)^2 + (sigma_(e0|b)^2)/(n_j))`

where `n_j` stands for the average group size (Snijders and Bosker recommend using the harmonic mean).

Alexander Schmidt and Katja Möhring have released the mlt package, which comprises several multilevel tools and enables users to calculate both the Snijders & Bosker (1994) and the Raudenbush & Bryk pseudo-R2's after xtmixed via their command mltrsq. The package cannot be found on SSC, but needs to be downloaded here.

A brief test:

webuse pisa2000.dta, clear
xtmixed isei female high_school college one_for both_for test_lang pass_read ///
             || id_school:, var
mltrsq, full

gives the following results:

Level 2 variable is id_school 
 
Calculating R-squared for the parameters of
 female high_school college one_for both_for test_lang pass_read and _cons
   
Level 2 variable is id_school
   
Number of macro-units:   148
   
Harmonic mean of the level-2 group sizes:  8.29
   
Random-effects Parameters of complete model:  
   Residual-varianz level 1: 222.3588
   Residual-varianz level 2: 24.7574
   
Random-effects Parameters of Null-model:  
   Residual-varianz level 1: 259.0281
   Residual-varianz level 2: 51.7011
   
   
Snijders/Bosker R-squared Level 1:  0.2047
Snijders/Bosker R-squared Level 2:  0.3782
   
Bryk/Raudenbush R-squared Level 1:  0.1416
Bryk/Raudenbush R-squared Level 2:  0.5211
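As a cross-check (done here in Python rather than Stata), the four coefficients follow directly from the variance components and the harmonic mean reported above:

```python
# variance components reported by mltrsq (null and full model)
e0_b, u0_b = 259.0281, 51.7011   # null model: level 1, level 2
e0_m, u0_m = 222.3588, 24.7574   # full model: level 1, level 2
n_j = 8.29                       # harmonic mean of the group sizes

rb_1 = (e0_b - e0_m) / e0_b                           # Bryk/Raudenbush level 1
rb_2 = (u0_b - u0_m) / u0_b                           # Bryk/Raudenbush level 2
sb_1 = 1 - (u0_m + e0_m) / (u0_b + e0_b)              # Snijders/Bosker level 1
sb_2 = 1 - (u0_m + e0_m / n_j) / (u0_b + e0_b / n_j)  # Snijders/Bosker level 2

print(f"{sb_1:.4f} {sb_2:.4f} {rb_1:.4f} {rb_2:.4f}")  # 0.2047 0.3782 0.1416 0.5211
```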

Differences between the two approaches seem sizable; however, given that our example draws only on level-1 predictors, this is a rather extreme case. In general, though, I have a feeling that Kreft and De Leeuw (1998, p. 119) are correct in suggesting not to use any pseudo-R2's in multilevel modeling at all.

Roberts et al. (2011), however, suggest three new measures of explained variance in multilevel models, which are not yet readily implemented in any Stata command.

References


Hox, Joop J. 2010. Multilevel Analysis. Techniques and Applications, 2nd ed. Routledge.

Kreft, Ita, and Jan de Leeuw. 1998. Introducing Multilevel Modeling. Sage.

Raudenbush, Stephen W., and Anthony S. Bryk. 2002. Hierarchical Linear Models. Applications and Data Analysis Methods, 2nd ed. Sage.

Roberts, J. Kyle, James P. Monaco, Holly Stovall, and Virginia Foster. 2011. "Explained Variance in Multilevel Models." Pp. 219-230 in Handbook of Advanced Multilevel Analysis, edited by Joop J. Hox and J. Kyle Roberts. Routledge.

Singer, Judith D., and John B. Willett. 2003. Applied Longitudinal Data Analysis. Modeling Change and Event Occurrence. Oxford University Press. doi: 10.1093/acprof:oso/9780195152968.001.0001

Snijders, Tom, and Roel Bosker. 1994. "Modeled Variance in Two-Level Models." Sociological Methods and Research 22(3):342-363. doi: 10.1177/0049124194022003004

Snijders, Tom, and Roel Bosker. 1999. Multilevel Analysis. An Introduction to Basic and Advanced Multilevel Modeling. Sage.