Record linkage is a technique for integrating data from sources or providers where direct access to the data is not possible due to security and privacy considerations. This is a very common scenario for medical data, as patient privacy is a significant concern. To avoid privacy leakage, researchers have adopted k-anonymity to protect raw data from re-identification however they cannot avoid associated information loss, e.g. due to generalisation. Given that individual-level data is often not disclosed in the linkage cases, but yet remains potentially re-discoverable, we propose semantic-based linkage k-anonymity to de-identify record linkage with fewer generalisations and eliminate inference disclosure through semantic reasoning.
Keywords: Medical record linkage; de-identification; k-anonymity; semantic reasoning.