PostgreSQL Anonymizer

PostgreSQL Anonymizer
is an extension of the mask or replacement
personally identifiable information (PII) or commercially sensitive data from a Postgres database.
The project has a declarative method of anonymization. This means you can
declare the masking rules using PostgreSQL Data Definition Language (DDL) and specify your anonymization policy within the table definition itself.
The main purpose of this extension is to offer anonymization by design. We strongly believe that data masking rules should be written by the people developing the application because they have the best knowledge of how the data model works. Therefore masking rules must be implemented directly within the database schema.
Once the masking rules are defined, you can apply them using 5 different ones
masking techniques :
Each method has advantages and disadvantages. Different masking techniques can be used in different contexts. In any case, storing data directly within the PostgreSQL instance without using an external tool is important to limit exposure and the risks of data leakage.
Also, different Masking Functions available: randomization, faking, partial scrambling, shuffling, noise or even your own custom function!
Finally, the extension offers a panel of detection functions that try to guess which columns should be anonymous.
Quick Start
Step 0. Launch the project’s docker image
ANON_IMG=registry.gitlab.com/dalibo/postgresql_anonymizer
docker run --name anon_quickstart --detach -e POSTGRES_PASSWORD=x $ANON_IMG
docker exec -it anon_quickstart psql -U postgres
Step 1. Create a database and load its extension
CREATE DATABASE demo;
ALTER DATABASE demo SET session_preload_libraries="anon"
\connect demo
You are now connected to database "demo" as user "postgres".
Step 2. Create a table
CREATE TABLE people AS
SELECT 153478 AS id,
'Sarah' AS firstname,
'Conor' AS lastname,
'0609110911' AS phone
;
SELECT * FROM people;
id | firstname | lastname | phone
--------+-----------+----------+------------
153478 | Sarah | Conor | 0609110911
Step 3. Create the extension and activate the masking engine
CREATE EXTENSION anon;
ALTER DATABASE demo SET anon.transparent_dynamic_masking TO true;
Step 4. Specify a user mask
CREATE ROLE skynet LOGIN;
SECURITY LABEL FOR anon ON ROLE skynet IS 'MASKED';
GRANT pg_read_all_data to skynet;
Step 5. Define the masking rules
SECURITY LABEL FOR anon ON COLUMN people.lastname
IS 'MASKED WITH FUNCTION anon.dummy_last_name()';
SECURITY LABEL FOR anon ON COLUMN people.phone
IS 'MASKED WITH FUNCTION anon.partial(phone,2,$$******$$,2)';
Step 6. Connect the masked user
\connect - skynet
You are now connected to database "demo" as user "skynet"
SELECT * FROM people;
id | firstname | lastname | phone
--------+-----------+-----------+------------
153478 | Sarah | Stranahan | 06******11
Success Stories
With the PostgreSQL Anonymizer we have combined, from the design of the database, the principle that outside of production the data must be anonymous. Thus we can strengthen the GDPR rules, without affecting the quality of the tests during the version upgrade for example.
— Thierry Aimé, Office of Architecture and Standards at the French Public Finances Directorate General (DGFiP)
Thanks to the PostgreSQL Anonymizer we were able to define complex masking rules to implement full pseudonymization of our databases without losing functionality. Testing realistic data while guaranteeing the confidentiality of patient data is an important point to improve the stability of our functions and the quality of our customer service.
— Julien Biaggi, Product Owner at bioMérieux
I recently discovered your postgresql_anonymizer extension and my company uses it for anonymizing our users for local development. Great job!
— Max Metcalfe
If this extension is useful to you, please let us know!
‘s support
We need your feedback and ideas! Let us know what you think of this tool, how it fits your needs and what features are missing.
You can open an issue or send a message to contact@dalibo.com.
2025-01-14 10:36:00