
Publication

Data linkage

Abstract

"Data linkage (also known as data matching, record linkage, or entity resolution) is the process of bringing together records that belong to the same unit across multiple databases in order to produce a single merged dataset. A unit may refer to, among other entities, a person, family, household, establishment, or company. Data linkage is often used in the context of merging records from large databases compiled from data collected by organizations in the public or private sector. Government agencies, health care providers, and private companies collect large amounts of data, including tax and financial records, health care records, consumer transactions, census records, and survey data. Joining these data types can be beneficial to researchers, policy makers, and businesses by allowing deeper analyses and greater insights to be drawn from the merged datasets than would be possible from a single data source. Specific purposes of data linkage include, for example, identifying duplicates in a record system (e.g., population register, census counts) and creating disease registries and health surveillance systems. Data linkage also allows enriching survey data with administrative data and other data sources (e.g., to make the linked data available to the research community). Furthermore, data linkage can facilitate follow-up and tracing efforts in surveys to obtain information on individuals about their residential status, cause of death, or other important outcome information relevant to the study. This entry elaborates on methods used in linking records belonging to the same entity. Other methods exist for matching records belonging to different entities, known as statistical matching or data fusion, the details of which can be found elsewhere (see, e.g., D’Orazio et al., 2006). The entry begins by offering a brief history of data linkage. This is followed by a thorough examination of the steps involved in the extended data linkage process. 
The entry concludes by highlighting some other data linkage topics, including an overview of privacy-preserving record linkage methods, of software options and of data linkage centers available to assist researchers in conducting data linkage." (Text excerpt, IAB-Doku)

Cite article

Antoni, M. & Sakshaug, J. (2020): Data linkage. In: P. A. Atkinson, A. Cernat, S. Delamont, J. Sakshaug & R. A. Williams (Eds.) (2020): SAGE Research Methods Foundations, p. 1-18. DOI:10.4135/9781526421036931838