Evaluating the Pseudo Likelihood Approach for Synthesizing Surveys Under Informative Sampling

Abstract

"In recent years, national statistical organizations have increasingly relied on synthetic data when releasing microdata containing sensitive personal or establishment information. This paper deals with the challenges of using synthetic data to protect the privacy of survey respondents. For this type of data it is often important to consider the survey design information when creating the synthesis models. The paper discusses two techniques that can be used for generating survey microdata under informative sampling. Specifically, it examines an approach that combines design-based and model-based methods through the use of the pseudo-likelihood approach within the sequential regression framework. As far as we are aware, the pseudo-likelihood method has not been used in the context of sequential regression synthesis before. This method is compared with another approach in which design variables are included as predictors in the regression models. In the latter approach, the survey weights have to be synthesized and included in the final data product, while the former generates synthetic simple random samples that are representative of the original population without weights." (Author's abstract, IAB-Doku, © Springer)

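The pseudo-likelihood idea described in the abstract can be illustrated with a small simulation. The following is a minimal sketch and not the authors' implementation: the population model, the Poisson sampling scheme, the function name `weighted_normal_fit`, and all parameter values are assumptions made for illustration. Each synthesis model is a normal linear regression fitted by maximizing the weighted log-likelihood, sum over sampled units of w_i * log f(y_i | x_i), and synthetic values are drawn from the fitted models with plug-in parameter estimates (a full synthesizer would typically also draw the parameters from their posterior distribution).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical finite population; all values are illustrative assumptions.
N = 100_000
x = rng.normal(size=N)
y = 1.0 + 2.0 * x + rng.normal(size=N)

# Informative Poisson sampling: the inclusion probability rises with y,
# so the unweighted sample over-represents units with large y.
pi = 0.02 + 0.08 / (1.0 + np.exp(-(y - 1.0)))
sampled = rng.random(N) < pi
xs, ys = x[sampled], y[sampled]
w = 1.0 / pi[sampled]          # survey weights = inverse inclusion probabilities
n = xs.size


def weighted_normal_fit(X, t, w):
    """Maximize the weighted Gaussian log-likelihood sum_i w_i * log f(t_i | X_i)."""
    XtW = X.T * w                              # scale each unit's column by its weight
    beta = np.linalg.solve(XtW @ X, XtW @ t)   # weighted normal equations
    resid = t - X @ beta
    sigma = np.sqrt(np.sum(w * resid**2) / np.sum(w))
    return beta, sigma


# Sequential regression, step 1: model for x (intercept only), then draw x.
bx, sx = weighted_normal_fit(np.ones((n, 1)), xs, w)
x_syn = bx[0] + sx * rng.normal(size=n)

# Step 2: model for y given x, then draw y conditional on the synthetic x.
by, sy = weighted_normal_fit(np.column_stack([np.ones(n), xs]), ys, w)
y_syn = by[0] + by[1] * x_syn + sy * rng.normal(size=n)

# The synthetic data are analysed WITHOUT weights, like a simple random sample.
err_unweighted = abs(ys.mean() - y.mean())
err_synthetic = abs(y_syn.mean() - y.mean())
print(f"error of unweighted sample mean: {err_unweighted:.3f}")
print(f"error of synthetic sample mean:  {err_synthetic:.3f}")
```

In this sketch the weights enter only at the model-fitting stage, so the released synthetic records need no weight variable. The alternative mentioned in the abstract would instead add the design variables as predictors and synthesize the weights alongside the other variables.
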
Cite article

Oganian, A., Drechsler, J. & Iqbal, M. (2024): Evaluating the Pseudo Likelihood Approach for Synthesizing Surveys Under Informative Sampling. In: J. Domingo-Ferrer & M. Önen (Eds.): Privacy in Statistical Databases 2024, pp. 129-143. DOI:10.1007/978-3-031-69651-0_9

DOI

doi.org/10.1007/978-3-031-69651-0_9