The Significance of Sampling Design on Inference: An Analysis of a Binary Outcome Model of Children’s Schooling Using Indonesian Large Multi-stage Sampling Data
This paper explores a rather recent topic in applied microeconometrics: the effect of sampling design on statistical inference, particularly in binary outcome models. Much theoretical research in econometrics has shown that it is inappropriate to apply statistical analysis that assumes i.i.d. observations to non-i.i.d. data. This research has provided proofs that applying i.i.d.-based analysis to non-i.i.d. observations results in misestimated standard errors, which could make the estimated coefficients inefficient, if not biased. Consequently, policy-relevant quantitative research could reach incorrect conclusions, usually in the form of type-1 errors. Using a dataset from the third cycle of the Indonesia Family Life Survey (IFLS), whose sampling design involved multi-stage clustering and stratification, this paper shows discrepancies in the estimation results of probit regressions of a child attending school, depending on whether or not the estimated standard errors are adjusted for the sampling design. The computation also shows a considerable change in the level of confidence in not rejecting the null hypothesis for the explanatory variables. This paper provides further evidence that statistical analysis should always take into account the sampling design used to collect the data.