The goal of data anonymization is to derive as much information as possible from data about individuals without learning about the individuals themselves.
The goal of the Open GDA Score Project is to help answer two questions about any anonymization method. Utility: how much analytic value can one get from the anonymized data? Defense: how well does the anonymization protect individual privacy?
The defense criteria are those defined by the EU Article 29 Data Protection Working Party: singling-out, linkability, and inference.
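To make the first of these criteria concrete: singling-out occurs when some combination of attribute values isolates exactly one individual in a dataset. The following sketch is purely illustrative (the function and data here are hypothetical, not part of any GDA Score tooling):

```python
# Illustrative sketch of the singling-out criterion: a set of attribute
# conditions "singles out" an individual if it matches exactly one record.
# (Hypothetical example data and function names, not from the GDA Score project.)

def singles_out(records, conditions):
    """Return True if `conditions` (attr -> value) isolates exactly one record."""
    matches = [r for r in records
               if all(r.get(attr) == val for attr, val in conditions.items())]
    return len(matches) == 1

people = [
    {"zip": "67663", "age": 34, "sex": "F"},
    {"zip": "67663", "age": 34, "sex": "M"},
    {"zip": "67661", "age": 51, "sex": "F"},
]

# Two people share this zip and age, so no one is singled out:
print(singles_out(people, {"zip": "67663", "age": 34}))              # False
# Adding sex narrows the match to a single record:
print(singles_out(people, {"zip": "67663", "age": 34, "sex": "F"}))  # True
```

Linkability and inference are analogous: instead of isolating one record, the attacker connects records across datasets, or deduces an attribute value with high confidence.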
The Open GDA Score Project is technical in nature, and aims to improve the state of the art in evaluating and comparing data anonymization technologies. The GDA Score is unique in that it works for any anonymization scheme, thereby enabling an apples-to-apples comparison. The methodology is empirical: real attacks are launched against real anonymization systems holding real data, and the success of the attacks is measured. The resulting GDA Score is therefore a measure of the effectiveness of known attacks — it unfortunately tells us nothing about unknown attacks. Thus a goal of the Open GDA Score Project is to build up, over time, as complete a library of attacks as possible.
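The empirical idea behind measuring defense can be sketched as follows. This is a simplified hypothetical illustration, not the actual GDA Score formula (which is more elaborate, accounting for attacker confidence and prior knowledge): an attack produces claims about individuals, and we score the fraction of claims that turn out to be correct against the raw (ground-truth) data.

```python
# Simplified, hypothetical sketch of empirical attack measurement:
# run an attack against anonymized data, collect its claims about
# individuals, and score them against the ground truth.
# (Function names and data are illustrative, not the real GDA Score API.)

def attack_success_rate(claims, ground_truth):
    """claims: list of (person_id, attribute, guessed_value) tuples."""
    correct = sum(1 for pid, attr, guess in claims
                  if ground_truth[pid][attr] == guess)
    return correct / len(claims)

# Ground truth known to the evaluator (never to the attacker):
ground_truth = {1: {"salary": "high"}, 2: {"salary": "low"}, 3: {"salary": "high"}}

# Claims an attack produced from the anonymized data:
claims = [(1, "salary", "high"), (2, "salary", "high"), (3, "salary", "high")]

print(attack_success_rate(claims, ground_truth))  # 2 of 3 correct -> 0.666...
```

A real evaluation would also compare this rate against a statistical baseline (what an attacker could guess without the anonymized data at all), so that the score reflects the improvement the attack gains from the released data.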
We envision many benefits from the project.
The GDA Score as currently defined is only a first attempt, and we envision many improvements to it. Likewise, we have so far measured only a small number of anonymization technologies — pseudonymization, data masking, K-anonymization, Diffix, and a couple of differential privacy variants — with a small number of attacks. We need to build out our library of anonymization schemes, attacks, data sets, and utility measures. We encourage participation, and invite interested parties to contact us to learn how they can help.
Please contact us if you have questions, comments, or would like to get involved. Note that many of the articles and resources allow comment posts.