What is the difference between anonymization and pseudonymization of personal data?


Both anonymization and pseudonymization are meant to keep personal data confidential and secure. The GDPR does not apply to anonymized data, but it does apply to pseudonymized personal data. So what distinguishes the two?

Personal data anonymization

Anonymization is the permanent and irreversible removal of the links between personal data and the data subjects. Its purpose is to prevent a person from being identified. Anonymized data is not personal data, because it is no longer associated with a specific person and cannot be used to identify them.

Examples of anonymization:

  • Replacing the entire phrase with a single initial - e.g. replacing "Wrocław" with W., "Jacek" with J.

  • Replacing the entire phrase with its first word followed by (...) - e.g. "Ford (...)" instead of "Ford Fiesta, license plate DDD 9001200"

  • Replacing the entire phrase with a pair of initials - e.g. J.N. instead of Jacek Nowak, provided that at least two people in the given collection have the initials J.N.

  • Replacing single words or longer strings of words with (...).
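The replacement rules listed above can be sketched in a few lines of Python. This is only an illustration, not a complete anonymization procedure: the function names (initials, anonymize_names, mask_after_first_word) are hypothetical, and the sketch assumes that every entry in the list describes a different person.

```python
from collections import Counter

ELLIPSIS = "(...)"


def initials(full_name: str) -> str:
    """Return the initials of a phrase, e.g. 'J.N.' for 'Jacek Nowak'."""
    return "".join(part[0].upper() + "." for part in full_name.split())


def anonymize_names(names: list[str]) -> list[str]:
    """Replace each name with its initials, but only when at least two
    entries in the collection share those initials; otherwise use '(...)'.

    Assumption: each entry in `names` refers to a different person.
    """
    counts = Counter(initials(name) for name in names)
    return [
        initials(name) if counts[initials(name)] >= 2 else ELLIPSIS
        for name in names
    ]


def mask_after_first_word(phrase: str) -> str:
    """Keep only the first word of a phrase and replace the rest with '(...)'."""
    parts = phrase.split()
    if not parts:
        return ELLIPSIS
    return f"{parts[0]} {ELLIPSIS}"


people = ["Jacek Nowak", "Joanna Nowicka", "Adam Kowalski"]
print(anonymize_names(people))                           # ['J.N.', 'J.N.', '(...)']
print(initials("Wrocław"))                               # W.
print(mask_after_first_word("Ford Fiesta DDD 9001200"))  # Ford (...)
```

The original values are not kept anywhere, so the replacement cannot be undone from the output alone. Whether the result is truly anonymous still depends on what other information remains in the document.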

The GDPR does not apply to anonymized data because the regulation protects the personal data of identifiable natural persons, and after anonymization the person concerned can no longer be identified.

Anonymization is irreversible. Once it has been carried out, it is no longer possible to determine which natural person the information concerns, so anonymized data is not subject to data protection rules. For this reason, anonymization is used on a large scale for statistical, scientific or other analytical purposes.

Anonymization of personal data is often used in court judgments and tax interpretations, in which we can read, for example: "Interested party who is a party to the proceedings: Mr. W.B., PESEL ..." or "On June 19, 2012, S.J., the president of the Housing Cooperative (...), called R.M. from the cooperative's office, informing him that the materials of the Supervisory Board for the general meeting were to be collected. On that day, he was on weekly duty from 6 p.m. to 8 p.m."

Pseudonymization of personal data

Pseudonymization is a temporary "hiding" of personal data. The process is reversible, because a tool (a key) still exists that allows the data to be read. Pseudonymized personal data remains personal data because, once the appropriate key is applied, it can again be used to identify a specific person.

Pseudonymization "hides" the information that allows the data subject to be identified only for a time. For pseudonymization to be effective, the key that allows the data to be read must be stored separately, i.e. in a place other than the data itself. In addition, the key must be protected by appropriate technical and organizational measures.

Examples of pseudonymization:

  • secret key encryption,

  • tokenization - replacing data fragments with a sequence of random numbers,

  • truncation - shortening selected values so that their actual meaning can no longer be read.
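As an illustration of tokenization, the sketch below replaces a value with a random sequence of digits and keeps the mapping in a separate object. The class name TokenVault, the 12-digit token length and the sample record are assumptions made for this example. In a real system the mapping, which plays the role of the key, would be stored and secured separately from the pseudonymized records.

```python
import secrets

TOKEN_LENGTH = 12  # assumed length, chosen only for this example


class TokenVault:
    """Hypothetical in-memory store that maps tokens to original values.

    Whoever holds this mapping can reverse the pseudonymization, so it
    must be kept apart from the pseudonymized data.
    """

    def __init__(self) -> None:
        self._token_by_value: dict[str, str] = {}
        self._value_by_token: dict[str, str] = {}

    def _new_token(self) -> str:
        """Generate a random sequence of digits that is not yet in use."""
        while True:
            token = "".join(secrets.choice("0123456789") for _ in range(TOKEN_LENGTH))
            if token not in self._value_by_token:
                return token

    def tokenize(self, value: str) -> str:
        """Return the token for a value, creating one on first use."""
        if value not in self._token_by_value:
            token = self._new_token()
            self._token_by_value[value] = token
            self._value_by_token[token] = value
        return self._token_by_value[value]

    def detokenize(self, token: str) -> str:
        """Recover the original value; only possible with access to the vault."""
        return self._value_by_token[token]


vault = TokenVault()
record = {"name": "Jacek Nowak", "city": "Wrocław"}

# The pseudonymized record can be processed without exposing the name.
pseudonymized = {**record, "name": vault.tokenize(record["name"])}
print(pseudonymized["name"])                    # a random 12-digit string
print(vault.detokenize(pseudonymized["name"]))  # Jacek Nowak
```

Because the vault still allows the original name to be recovered, the pseudonymized record remains personal data.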

Pseudonymization may be one of the data security measures referred to in the GDPR. It is worth bearing in mind that it does not exclude other methods of data protection; it is only one of the possible options.

In everyday life, we can encounter pseudonymization, for example, at universities, where students are identified by index numbers, or in competitions, where the winner's data announced to the public is replaced by a sequence of numbers.

How to distinguish between anonymization and pseudonymization?

To distinguish anonymization of personal data from pseudonymization, check whether a natural person can still be identified after the operation has been carried out.

To determine whether a person is identifiable, account should be taken of all the means reasonably likely to be used by the controller or by another person to identify the natural person directly or indirectly.

To determine whether a given means is reasonably likely to be used to identify an individual, all objective factors should be taken into account, such as the cost and time needed for identification, the technology available at the time of processing, and technological developments.

To easily distinguish anonymization from pseudonymization, we present a tabular comparison:

Anonymization                                  | Pseudonymization
-----------------------------------------------|---------------------------------------------------------------
there is no key to decrypt the classified data | there is a key that decrypts the classified data
permanent data deletion                        | the "deleted" data can be restored
does not allow the person to be identified     | after applying the decryption key the person can be identified
anonymized data is not subject to the GDPR     | pseudonymized data is subject to the GDPR

Because anonymization permanently removes the information that identifies a natural person, a business will not always be able to use it. Pseudonymization is the alternative in cases where the personal data must not be available to everyone, but a specific group of people still needs access to it in order to properly fulfill their obligations towards the data subject.