Jul 7, 2017

Analyzing censored data with the Tobit model

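This post replicates Table 2.2 in Breen (1996) using simulated data. It compares four estimators applied to an outcome censored at zero (OLS on all observations, OLS on the positive observations only, the Heckman two-step estimator, and the Tobit model) with OLS on the uncensored latent variable.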
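
For orientation, the data-generating process implied by the simulation code can be written out as follows. The symbols alpha, beta, and sigma are labels chosen here to match the table's row names; the parameter values are read off the `generate` statements, where `rnormal(0, 2)` draws errors with standard deviation 2.

```latex
\begin{aligned}
y_i^{*} &= \alpha + \beta x_i + u_i, \qquad x_i \sim N(0, 1), \quad u_i \sim N(0, \sigma^2), \\
y_i &= \begin{cases} y_i^{*} & \text{if } y_i^{*} > 0, \\ 0 & \text{if } y_i^{*} \le 0, \end{cases} \\
\alpha &= 1, \quad \beta = 2, \quad \sigma = 2 .
\end{aligned}
```

An estimator that handles the censoring correctly should therefore recover values close to alpha = 1, beta = 2, and sigma = 2.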

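// Requires the estout package (eststo, estadd, esttab): ssc install estout
clear
eststo clear

// Simulate data
set seed 12345  // arbitrary seed (not in the original), for reproducibility
set obs 2000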
generate ui = rnormal(0, 2)
generate xi = rnormal()
generate yi_star = 1 + 2*xi + ui
drop ui

// Fit OLS model
regress yi_star xi
eststo ols1
estadd scalar sigma = e(rmse)

// Censor variable at 0 (yi_h sets the censored values to missing)
generate yi = yi_star if yi_star > 0
replace  yi = 0       if yi_star <= 0
recode   yi (0 = .), gen(yi_h) 
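
// Sanity check (a sketch, not part of Breen's table): share of censored
// observations. Under the simulated model, yi_star ~ N(1, 8), because
// Var(2*xi) + Var(ui) = 4 + 4, so the expected share is normal(-1/sqrt(8)).
count if yi == 0
display "Observed share censored: " r(N)/_N
display "Expected share censored: " normal(-1/sqrt(8))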

// (A) OLS (using all observations on yi,
//     including yi = 0)
regress yi xi
eststo ols2
estadd scalar sigma = e(rmse)

// (B) OLS (yi > 0)
regress yi xi if yi > 0
eststo ols3
estadd scalar sigma = e(rmse)

// (C) Heckman 2-step
heckman yi_h xi, select(xi) twostep 
eststo heckman

// (D) Tobit
tobit yi xi, ll(0)
eststo tobit
estadd scalar sigma = _b[sigma:_cons]

// Table 2.2
esttab ols2 ols3 heckman tobit ols1, b(3) se(3) nostar stat(sigma) ///
    mtitles("(A) OLS incl. yi = 0" ///
            "(B) OLS yi > 0"       ///
            "(C) Heckman 2-step"   ///
            "(D) Tobit"            ///
            "OLS yi_star")         ///
    coeflabel(_cons "alpha" xi "beta") ///
    collabels()                    ///
    drop(mills:lambda sigma:_cons) ///
    order(_cons xi)                ///
    unstack                        ///
    modelwidth(20) nonumber

Reference

Breen, Richard. 1996. Regression Models: Censored, Sample Selected, or Truncated Data. Sage. doi: 10.4135/9781412985611