* FRSTEX.D0 * A sample program to illustrate the Censored Quantile Regression * Program qcenreg.ado by Rob Vigfusson * Normally one would read in the data using the infile command * infile y x z using dat2.raw * For demonstration purposes, I am going to generate some data instead. * See the program scndex.do for another example that reads in data. clear set obs 500 /* Sample Size */ set seed 12 /* Random Number Generator Seed */ * I first generate some data * The explanatory variable is x * The dependent variable is y generate x = 20+50*uniform() generate y = -15 + x*0.45+400*(1/x)*invnorm(uniform()) *The errors are heteroscedastic. The variance decreases with x. * I now create censored data where all observations that are negative * are dropped generate ycen = cond(y<0,0,y) * I add some labels to y and ycen label variable ycen "Censored Data" label variable y "Original Data" * Here is a graph of the original and censored data. * Note how the variance decreases with x and that the censoring * is mostly associated with small values of x. graph y ycen x * So for high quantile values, the censoring does not matter. qreg y x , quantile(0.75) predict pred75 qreg ycen x , quantile(0.75) predict pred75c label variable pred75 "True 75%" label variable pred75c "Censored 75%" graph pred75 pred75c x * But for lower quantile values the censoring does matter. qreg y x , quantile(0.35) predict pred35 qreg ycen x , quantile(0.35) predict pred35c label variable pred35 "True 35%" label variable pred35c "Censored 35%" graph pred35 pred35c x * So this is where the censored regression program comes in qcenreg ycen x , quantile(0.35) predict pred35cc label variable pred35cc "Corrected Censored 35%" graph pred35 pred35c pred35cc x * The Corrected Value is much closer to the truth than the uncorrected. * The reported standard errors however are too small as they don't take * account of the exclusion of censored values. * Therefore we bootstap to get correct standard errors. bs "qcenreg ycen x , quantile(0.35)" "_b[x] _b[_cons]" * If you want to do a large number of replications, I recommend * reducing the max iterations variable to something manageable. Otherwise this takes * way too long for a large number of replications. bs "qcenreg ycen x , quantile(0.35) maxiter(30)" "_b[x] _b[_cons]", dots reps(500) saving(bsdat1) replace /* The bootstrapped values of the coefficients are saved in the dataset bsdat1. You can examine that dataset to examine the values. (Make histograms etc.) We'll do that in just a minute. */ /* First. Note that the program can not work well for very low quantile values As there are too few observations to make a estimate. Therefore it just estimates uncorrected estimate. The procedure does beep and give you a red warning message. */ qcenreg ycen x , quantile(0.05) /* You would also have the same problem if the censoring is a greater percentage of the data. */ generate ycen2 = cond(y<10,10,y) qcenreg ycen2 x , quantile(0.25) *If you bootstrap at such a dataset you can see that there are mass points at 0 for beta(x) * bs "qcenreg ycen2 x z , quantile(0.15)" "_b[x] _b[_cons]", dots reps(50) saving(bsdat2) * You would also hear a large number of beeps as you are warned about your sample. * Looking at the distribution * Our results were use bsdat1, clear sum * We can look at the histogram. graph bs1