The impact of cleansing procedures for overlaps on estimation results
"Process-generated and administrative datasets have become increasingly important for labor market research over the past ten years. Major advantages of these data are large sample sizes as well as absence of retrospective gaps and unit non-responses. Nevertheless, the quality and validity of the information remains unclear and a lot of preparation and data cleansing is necessary before the data are analyzable. Unfortunately, only few researchers provide access to their cleansing procedures and therefore, also the impact of them on the results of the analyses is unidentified. This paper contributes to this subject and focuses on the variation of research results due to alternative data cleansing procedures. In particular, the paper uses the framework for data preparation suggested in an evaluation study by Wunsch and Lechner (2008) as a benchmark and then induces variation by developing different cleansing procedures for overlapping and parallel observations. The descriptive results show that the differences between the data sets (based on the different procedures) show various magnitudes on some attributes concerning time and personal characteristics. Similar results appear for the subsequent analysis of the treatment effects, which do not vary in the overall shape but in the magnitude especially during the lock-in effect. In sum the results of the analysis indicate that the empirical findings of the evaluation method are fairly robust to variations in the underlying cleansing procedure." (Author's abstract, IAB-Doku)
Scioch, Patrycja (2010): The impact of cleansing procedures for overlaps on estimation results: evidence for German administrative data. (FDZ-Methodenreport, 04/2010 (en)), Nürnberg, 32 p.