Pseudonymization, Column Suppression

The column-suppressed pseudonymized tables are generated by deleting the columns that contain Personally Identifiable Information (PII).

Each of the “raw” tables (raw_taxi, raw_banking, raw_census, and raw_scihub) has a pseudonymized version. These are called pseudo_taxi, pseudo_banking, pseudo_census, and pseudo_scihub respectively.

All of the tables have certain common columns. In each pseudonymized table, the following common columns were deleted:

lastname, firstname, birthdate, ssn, email, and street.

In addition, the account_id and birth_number columns were deleted from all of the tables in raw_banking.
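
The column suppression described above can be sketched in a few lines of Python using the standard-library sqlite3 module. This is only an illustration under stated assumptions, not the procedure actually used to build the tables: the in-memory database, the helper name suppress_columns, and the non-PII columns fare and pickup_zone are all hypothetical. Only the table names and the list of dropped columns come from the text.

```python
import sqlite3

# Common PII columns named in the text above.
PII_COLUMNS = {"lastname", "firstname", "birthdate", "ssn", "email", "street"}


def suppress_columns(conn, raw_table, pseudo_table, drop=PII_COLUMNS):
    """Create pseudo_table as a copy of raw_table without the columns in `drop`.

    Hypothetical helper; returns the list of columns that were kept.
    """
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    info = conn.execute(f'PRAGMA table_info("{raw_table}")').fetchall()
    kept = [row[1] for row in info if row[1].lower() not in drop]
    if not kept:
        raise ValueError(f"no columns left in {raw_table} after suppression")
    column_list = ", ".join(f'"{name}"' for name in kept)
    conn.execute(
        f'CREATE TABLE "{pseudo_table}" AS SELECT {column_list} FROM "{raw_table}"'
    )
    return kept


# Toy stand-in for raw_taxi; fare and pickup_zone are made-up columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE raw_taxi (lastname TEXT, firstname TEXT, birthdate TEXT, "
    "ssn TEXT, email TEXT, street TEXT, fare REAL, pickup_zone TEXT)"
)
conn.execute(
    "INSERT INTO raw_taxi VALUES ('Doe', 'Jane', '1980-01-01', '000-00-0000', "
    "'jane@example.com', '1 Main St', 12.5, 'A')"
)

kept_columns = suppress_columns(conn, "raw_taxi", "pseudo_taxi")
rows = conn.execute("SELECT * FROM pseudo_taxi").fetchall()
```

For the banking data, the same helper could be called with drop=PII_COLUMNS | {"account_id", "birth_number"} to remove the two additional columns.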

The data can be explored via a SQL client at

The K-anonymized tables are generated by applying K-anonymity to all columns.
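
The text does not say how K-anonymity was applied, so the following is only a minimal sketch of what the property means when every column is treated as part of the quasi-identifier: each distinct combination of values must occur in at least K rows. The function name is_k_anonymous and the sample rows are hypothetical.

```python
from collections import Counter


def is_k_anonymous(rows, k):
    """Return True if every distinct row (over all columns) occurs at least k times."""
    counts = Counter(tuple(row) for row in rows)
    return all(n >= k for n in counts.values())


# Made-up, already-generalized rows: (age_band, zone).
sample = [
    ("30-39", "A"),
    ("30-39", "A"),
    ("40-49", "B"),
    ("40-49", "B"),
]

two_anonymous = is_k_anonymous(sample, 2)
three_anonymous = is_k_anonymous(sample, 3)
```

A check like this only verifies the property on a finished table; it does not perform the generalization or suppression needed to reach it.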

