Privacy preserving publishing of set-values

Manolis Terrovitis
Privacy Observatory Magazine, 2, 2011

Set-values are a natural representation of data that appears in many application areas, ranging from retail sales logs to medical records. Datasets with set-values are sparse, multidimensional data that usually contain unique combinations of values. Publishing such datasets poses serious risks to the privacy of the individuals associated with the records: an adversary who knows only part of an individual's record can use that background knowledge to retrieve the whole record.
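The re-identification risk described above can be illustrated with a minimal sketch. The toy dataset, the item names, and the helper `matching_records` are all hypothetical and chosen only for illustration; they are not taken from the article. The sketch assumes the adversary knows a few items that appear in the target's record and simply looks for published records containing all of them.

```python
from typing import FrozenSet, List

Record = FrozenSet[str]


def matching_records(dataset: List[Record], known_items: Record) -> List[Record]:
    """Return every published record that contains all items the adversary knows."""
    return [record for record in dataset if known_items <= record]


# Hypothetical published set-valued dataset (one set of items per individual).
dataset: List[Record] = [
    frozenset({"bread", "milk", "insulin"}),
    frozenset({"bread", "milk", "beer"}),
    frozenset({"milk", "beer", "diapers"}),
]

# Assumed background knowledge: the target bought bread and beer.
known = frozenset({"bread", "beer"})
matches = matching_records(dataset, known)

if len(matches) == 1:
    # A single match means the partial knowledge pins down the full record.
    leaked = matches[0] - known
    print("Unique match; additional items revealed:", sorted(leaked))
else:
    print("Number of candidate records:", len(matches))
```

In this toy dataset the combination of two known items matches exactly one record, so the remaining item in that record is disclosed. Because set-valued data is sparse, such unique combinations are common, which is what anonymization methods aim to prevent.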

The aim of this article is to provide an overview of the basic anonymization methods for set-valued data. Anonymization methods sanitize set-valued datasets in such a way that the privacy of the individuals associated with the data is guaranteed. The article describes the basic privacy guarantees used in the research literature and briefly presents the state-of-the-art anonymization methods.