With the General Data Protection Regulation (GDPR) enforced by the European Union, logs have to be cleaned regularly to delete IP addresses and other information about visitors. This can be seen as a way to protect an emerging and much-discussed right, the right to be forgotten.
This new regulation affects every automated logging system out there. Since Sentry is a good open source error monitoring tool* and is widely used, this guide shows how to clean Sentry logs on Linux systems in line with GDPR using the sentry cleanup command line utility.
Set a time limit for logs
Before starting, find out the maximum time a log can be kept according to the policy of the service you're working on.
In the examples below, the maximum time a log can be kept is 26 months, one of the retention periods offered by Google Analytics in its cleanup settings.
A 26-month limit for stored logs in Sentry is set like this:
env SENTRY_CONF='/usr/local/etc/sentry' sentry cleanup --days 749
where /usr/local/etc/sentry is the directory where config.yml and sentry.conf.py are located, or, to clean a single project:
env SENTRY_CONF='/usr/local/etc/sentry' sentry cleanup --days 749 --project 5
where 5 is the ID of the project, which you can find in Project settings > Client Keys (DSN) as the very last part of the DSN path (always an integer).
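As a small illustration of where the project ID sits, the sketch below strips everything up to the last slash of a DSN. The DSN value is a made-up placeholder (host and key are assumptions, not real values); substitute the one shown in your own project settings.

```shell
#!/bin/sh
# Hypothetical DSN, for illustration only; copy the real one from
# Project settings > Client Keys (DSN).
dsn='https://examplepublickey@sentry.example.com/5'

# Remove the longest prefix ending in "/" to keep only the last path segment.
project_id="${dsn##*/}"

echo "$project_id"   # prints 5
```

The resulting value is what goes after --project in the cleanup command.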
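The 749 days are calculated like this:
30 days × 26 months = 780 days; 780 days - 31 days = 749 days
The 31 days are a safety margin: the cleanup runs only once a month, on the same day each month, so logs keep ageing for up to 31 days between two runs.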
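The calculation above can be reproduced with plain shell arithmetic, which makes it easy to derive the --days value for a different retention period. This is only a sketch: the variable names are arbitrary, and the 30-day month and 31-day margin are the approximations used in this guide.

```shell
#!/bin/sh
# Retention period in months, taken from the service policy (26 in this guide).
months=26
# Margin covering the gap between two monthly cron runs.
margin=31

# POSIX shell arithmetic: 30 days per month, minus the safety margin.
days=$(( months * 30 - margin ))

echo "$days"   # prints 749
```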
Apparently, sentry cleanup has to run as root in order to access the postgres user, and thus all the Sentry database tables, so we have to put it in root's crontab.
Schedule the cleanup
- Log in as root with su - or sudo bash
- Run crontab -e
- Add a command line like this:
. /usr/local/etc/virtualenvs/sentry/bin/activate && env SENTRY_CONF='/usr/local/etc/sentry' sentry cleanup --days 749 --project 5 && deactivate
The leading dot . is an alternative to source that is available in /bin/sh (the shell cron uses), not only in /bin/bash. This avoids having to set the environment variable SHELL='/bin/bash' in the crontab.
The resulting cron entry would be:
20 3 28 * * . /usr/local/etc/virtualenvs/sentry/bin/activate && env SENTRY_CONF='/usr/local/etc/sentry' sentry cleanup --days 749 --project 5 && deactivate
It isn't a bad idea to add a fallback cleanup command the day after, without --project, so if you forget to clean up logs for a specific project it will be done automatically:
20 3 29 * * . /usr/local/etc/virtualenvs/sentry/bin/activate && env SENTRY_CONF='/usr/local/etc/sentry' sentry cleanup --days 749 && deactivate
Now even your Sentry logs are GDPR compliant. The strength of this method is that you can set a different cleanup limit for every project, according to its policies. And you don't need any proprietary software to do this, just free/libre open source software.
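Since each project can have its own limit, it may help to generate the crontab entries instead of typing them by hand. The sketch below is one possible way to do that. The cron_line helper is hypothetical (it is not part of Sentry), it only prints text, and project 7 with its 329-day limit (12 months × 30 days - 31 days) is an invented example. Review the output before pasting it into crontab -e.

```shell
#!/bin/sh
# Paths used throughout this guide.
VENV='/usr/local/etc/virtualenvs/sentry/bin/activate'
CONF='/usr/local/etc/sentry'

# cron_line DAY_OF_MONTH PROJECT_ID DAYS
# Hypothetical helper: prints one crontab entry that runs at 03:20
# on the given day of each month. It does not run sentry itself.
cron_line() {
    printf "20 3 %s * * . %s && env SENTRY_CONF='%s' sentry cleanup --days %s --project %s && deactivate\n" \
        "$1" "$VENV" "$CONF" "$3" "$2"
}

# Project 5 with the 26-month limit from this guide.
cron_line 28 5 749
# Invented project 7 with a 12-month limit (12 * 30 - 31 = 329 days).
cron_line 28 7 329
```

The first line printed is the same entry shown earlier for project 5; the second differs only in the project ID and the number of days.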
If you are in a hurry to publish privacy policies and you have dedicated hosting, give JournaKit legalazy on GitHub a try.
* Plus it’s written on top of Django.