1. The act of minimally perturbing individual-level data to decrease the probability of discovering an individual's identity. It involves masking direct identifiers (e.g., name, phone number, address) as well as transforming indirect identifiers that could be used alone or in combination to identify an individual (e.g., birth dates, geographic details, dates of key events). If done correctly, de-identification is a defensible, repeatable, and auditable process that consistently provides assurance, based on generally accepted statistical methodologies, that there is a very small risk of re-identification of any data that are released.

2. The use of one or more techniques designed to make it impossible, or at least more difficult, to identify a particular individual from stored data related to them. The purpose of data anonymization is to protect the privacy of the individual and to make it legal for governments and businesses to share their data without obtaining permission. Such data have proven to be very valuable for researchers, particularly in health care. Data anonymization methods include removing personally identifiable information (e.g., names, addresses, social insurance numbers, Medicare numbers), or using obfuscation methods such as encryption, hashing, generalization, pseudonymization, and perturbation. As governments move forward with open government initiatives, more data are becoming publicly available over the Internet. Many of these data have been scrubbed to create “limited datasets”.

SYNONYM. Anonymization
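The methods named above can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not a complete de-identification process: the field names (`name`, `phone`, `address`, `patient_id`, `birth_date`, `postal_code`, `diagnosis`), the helper functions, and the demo key are all hypothetical. The sketch drops direct identifiers, replaces a record key with a keyed hash (one possible form of pseudonymization), and generalizes two indirect identifiers. It does not measure re-identification risk, which the first definition treats as an essential part of the process.

```python
import hashlib
import hmac

# Hypothetical field names, chosen only for this illustration.
DIRECT_IDENTIFIERS = {"name", "phone", "address"}


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Replace a value with a keyed hash (HMAC-SHA256), truncated to 16 hex chars.

    The same input and key always give the same token, so records can still
    be linked without exposing the original value.
    """
    digest = hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def generalize_birth_date(iso_date: str) -> str:
    """Generalize a YYYY-MM-DD date to its year only."""
    return iso_date[:4]


def generalize_postal_code(code: str, keep: int = 3) -> str:
    """Keep the first `keep` characters and mask the rest with '*'."""
    return code[:keep] + "*" * (len(code) - keep)


def de_identify(record: dict, secret_key: bytes) -> dict:
    """Return a copy of `record` with identifiers removed or transformed."""
    out = {}
    for field, value in record.items():
        if field in DIRECT_IDENTIFIERS:
            continue  # mask direct identifiers by dropping them
        if field == "patient_id":
            out[field] = pseudonymize(value, secret_key)
        elif field == "birth_date":
            out["birth_year"] = generalize_birth_date(value)
        elif field == "postal_code":
            out[field] = generalize_postal_code(value)
        else:
            out[field] = value  # non-identifying fields pass through unchanged
    return out


# Made-up example record; the key is a placeholder, not a real secret.
record = {
    "name": "Jane Doe",
    "phone": "555-0100",
    "address": "1 Main St",
    "patient_id": "P-001",
    "birth_date": "1980-04-17",
    "postal_code": "K1A0B1",
    "diagnosis": "asthma",
}
safe = de_identify(record, secret_key=b"demo-key-not-for-production")
print(safe)
```

A keyed hash is used here rather than a plain hash because identifiers drawn from a small, guessable set can otherwise be recovered by hashing every candidate value. Whether the output is safe to release still depends on how many individuals share each combination of the remaining indirect identifiers.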