Nov 24, 2021

Decomposing the difference between two means

// GSS 1990-2004
use year race sibs reg16 educ maeduc if inrange(year, 1990, 2004) ///
  & year != 2002 ///  // 2002 measures race in a non-standard way
    using "gss7221_r1.dta", clear

// In Table 7.8 Treiman states that he's got 17,090 cases (14,985 non-black + 
// 2,105 black). There seems to be no way I can get there with the data
// because maeduc, mother's education, only has 15,996 valid observations.

// The reason for this seems to be that in his do-file, he starts with
// the 1990 GSS file and then appends all other GSS year files -- including
// the 1990 file, thus including the 1990 file twice. Running this do-file with
// the following line uncommented (almost) replicates the numbers reported in
// the book:

// expand 2 if year == 1990

drop year // Not needed

// Race variables
generate black    = (race == 2)
generate nonblack = !black
drop race

// Truncate number of siblings at 15
replace sibs = 15 if sibs > 15  & !missing(sibs)
label var sibs "Sibsize"

// South
gen south = (inrange(reg16, 5, 7))
label var south "Southern origin"
drop reg16

// Education
label var educ "Education"
label var maeduc "Mother's education"

// Listwise deletion
mark touse if !missing(educ, maeduc, sibs)
keep if touse

// Table 7.8a
eststo clear
local vlist educ maeduc sibs south
local upper
local lower `vlist'
foreach v of local vlist {
    estpost correlate `v' `lower' if !black
    local nnonblack = e(N)
    foreach m in b rho p count {
        matrix `m' = e(`m')
    }
    if "`upper'"!="" {
        estpost correlate `v' `upper' if black
        local nblack = e(N)
        foreach m in b rho p count {
            matrix `m' = e(`m'), `m'
        }
    }
    ereturn post b
    foreach m in rho p count {
        quietly estadd matrix `m' = `m'
    }
    eststo `v'
    local lower: list lower - v
    local upper `upper' `v'
}

// Table 7.8b
esttab using table2.tex, replace nonumbers noobs mtitles not booktabs label nostar ///
    title(Correlations between study variables: Blacks (\emph{N} = `nblack') above, non-blacks (\emph{N} = `nnonblack') below the diagonal) ///
    mtitle("Education" "Mother's education" "Sibsize" "Southern origin")

eststo clear
eststo: estpost sum educ maeduc sibs south if black
eststo: estpost sum educ maeduc sibs south if !black
esttab using table1.tex, booktabs replace ///
    cells("mean(label(Mean) fmt(2))" "sd(label(SD) fmt(2) par)") ///
    label mtitle("Blacks" "Non-Blacks") ///
    title("Means and standard deviations of study variables")

// Table 7.9
eststo clear
eststo: regress educ maeduc sibs south if touse & black
eststo: regress educ maeduc sibs south if touse & !black
esttab using table3.tex, booktabs replace ///
    b(2) se(2) r2(2) ///
    label mtitle("Blacks" "Non-Blacks") ///
    title("Coefficients of a model of years of schooling, for blacks and non-blacks, US adults, 1990--2004")

// Table 7.10
eststo clear
eststo: oaxaca educ maeduc sibs south if touse, by(black) detail noisily
eststo: oaxaca educ maeduc sibs south if touse, by(nonblack) detail noisily
esttab using table4.tex, booktabs replace ///
    b(2) nose not label mtitle("Blacks as reference" "Non-Blacks as reference") ///
    eqlabel("\emph{Overall}" ///
            "\emph{Differences in assets}" ///
            "\emph{Differences in returns to assets}" ///
            "\emph{Interactions}") ///
    coeflabels(overall:difference   "Difference in years of schooling" ///
               overall:endowments   "Total due to difference in assets" ///
               overall:coefficients "Total due to difference in returns" ///
               overall:interaction  "Total due to interactions") ///
    drop(overall:group*) ///
    varwidth(30) alignment(D{.}{.}{-1}) ///
    title("Decomposition of the difference in the mean years of schooling by blacks and non-blacks, US adults, 1990--2004")
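
// Note on reading Table 7.10. What follows is my understanding of oaxaca's
// default threefold decomposition, not something taken from the book, so
// check it against the oaxaca help file. Group 1 is the group with the lower
// value of the by() variable, group 2 is the other group, and the
// decomposition is taken from the viewpoint of group 2. With Xbar the vector
// of covariate means (constant included) and b the coefficient vector from
// each group's regression:
//
//   difference   = Ybar1 - Ybar2
//   endowments   = (Xbar1 - Xbar2)' b2
//   coefficients = Xbar2' (b1 - b2)
//   interaction  = (Xbar1 - Xbar2)' (b1 - b2)
//
// The three components sum to the difference. Under this reading, by(black)
// makes blacks group 2, so differences in assets are weighted with the black
// coefficients ("Blacks as reference"); by(nonblack) swaps the groups, which
// also flips the sign of the overall difference.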