Apr 14, 2014
Grand-mean centering variables on a higher level
It is good practice to center predictor variables in multilevel modeling, and the user-written -center- command is extremely useful for this. Grand-mean centering in particular is often necessary. However, when the variables to be centered are located on the group level and the groups are of unequal sizes, centering can become inaccurate: each group-level value is repeated on every case in its group, so bigger groups get a higher weight when the grand mean is calculated across all cases. A way to deal with this problem is as follows:

egen pickone = tag(group)
// Create a tag variable "pickone" that has a value of 1 for exactly one
// case per group and 0 for all other cases

center grouplevelpredictor if pickone
// Because of the if-condition, this centers the variable
// "grouplevelpredictor" around the grand mean at the group level (one case
// per group) and creates a new variable "c_grouplevelpredictor". The only
// remaining problem is that this variable has missing values for all cases
// where "pickone" does not equal 1

bysort group (c_grouplevelpredictor): ///
    replace c_grouplevelpredictor = c_grouplevelpredictor[1]
// This command then replaces the missing values within each group with the
// centered, non-missing value. This works because the data are sorted by
// "c_grouplevelpredictor" within each group. The centered, non-missing value
// comes first because Stata treats missing values as larger than any
// non-missing number
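To make the weighting problem concrete, here is a minimal sketch on a made-up dataset (three groups of sizes 4, 1, and 1; the values are purely illustrative). It compares the mean over all cases with the mean over one case per group. The last two lines show an alternative that uses only built-in commands and assumes the predictor is constant within each group; the variable name c_alt is hypothetical and not part of the approach described in this post.

```stata
clear
input group grouplevelpredictor
1 10
1 10
1 10
1 10
2 20
3 30
end

// Mean over all cases: group 1 is counted four times
// (4*10 + 20 + 30) / 6 = 15
summarize grouplevelpredictor

// Mean over one case per group: every group is counted once
// (10 + 20 + 30) / 3 = 20
egen pickone = tag(group)
summarize grouplevelpredictor if pickone

// Alternative sketch with built-in commands only: subtract the
// group-level grand mean stored in r(mean) from every case
generate c_alt = grouplevelpredictor - r(mean)
list group grouplevelpredictor c_alt
```

Because the subtraction is applied to every case directly, this variant leaves no missing values to fill in afterwards.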
Labels: bysort, center, Multilevel modeling, Subscripting