Automate log cleanup for GDPR: the Sentry case

With the General Data Protection Regulation (GDPR) enforced by the European Union, logs have to be cleaned regularly to delete IP addresses and other information about visitors. This can be seen as a way to protect an emerging and much-debated right, the right to be forgotten.

This regulation affects every automated logging system out there. Since Sentry is a good open source error monitoring tool* and is widely used, this guide shows how to clean Sentry logs on Linux systems in line with GDPR using the sentry cleanup command line utility.

Set a time limit for logs

Before starting, find out the maximum time a log can be kept according to the policy of the service you’re working on.

In the examples below, the maximum time a log can be kept is 26 months, one of the options offered by Google Analytics in its cleanup settings.

A 26-month limit for logs stored in Sentry is set like this:

env SENTRY_CONF='/usr/local/etc/sentry' sentry cleanup --days 749

where /usr/local/etc/sentry is the directory where config.yml and sentry.conf.py are located or

env SENTRY_CONF='/usr/local/etc/sentry' sentry cleanup --days 749 --project 5

where 5 is the id of the project, which you can find in Project settings > Client Keys (DSN) as the very last part of the DSN path (always an integer).

749 days are calculated like this:

30 days × 26 months = 780 days; 780 days - 31 days = 749 days

The 31 days are a safety margin: the cleanup runs once a month, on the same day, so a log can age up to one more month between two runs.
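The same arithmetic can be reproduced in the shell, which is handy when a project uses a different retention period. This is only a sketch: the variable names are made up for illustration, and the 30-day month and 31-day margin are the same approximations used above.

```shell
#!/bin/sh
# Retention policy expressed in months (26 is the example used in this post).
months=26
days_per_month=30   # approximation used in this post
margin_days=31      # safety margin for a cleanup that runs once a month

retention_days=$(( months * days_per_month - margin_days ))

# Print the command instead of running it, so the value can be reviewed first.
echo "sentry cleanup --days ${retention_days}"
# prints: sentry cleanup --days 749
```

Changing months to another value gives the --days argument for a different policy.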

Apparently, sentry cleanup has to run as root to get access to the postgres user, and therefore to all Sentry database tables, so we have to put it in root’s crontab.

Schedule the cleanup

  1. Log in as root with su - or sudo bash
  2. crontab -e
  3. Add a command line like this:
. /usr/local/etc/virtualenvs/sentry/bin/activate && env SENTRY_CONF='/usr/local/etc/sentry' sentry cleanup --days 749 --project 5 && deactivate

The leading dot (.) is an alternative to source that is available in /bin/sh (the shell cron uses) and not only in /bin/bash. This avoids having to set SHELL=/bin/bash in the crontab.

The resulting cron entry would be:

20 3 28 * * . /usr/local/etc/virtualenvs/sentry/bin/activate && env SENTRY_CONF='/usr/local/etc/sentry' sentry cleanup --days 749 --project 5 && deactivate

It isn’t a bad idea to add a fallback cleanup command the day after, without the --project option, so that if you forget to schedule the cleanup for a specific project it will still be done automatically:

20 3 29 * * . /usr/local/etc/virtualenvs/sentry/bin/activate && env SENTRY_CONF='/usr/local/etc/sentry' sentry cleanup --days 749 && deactivate
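When several projects need different limits, typing each cron entry by hand gets error-prone. The sketch below generates the entries shown above from a day of the month, a retention in days and an optional project id. The function name cron_line and the two variables are hypothetical helpers, not part of Sentry; the paths are the ones used in this post.

```shell
#!/bin/sh
# Paths used throughout this post; adjust them to your installation.
VENV_ACTIVATE='/usr/local/etc/virtualenvs/sentry/bin/activate'
SENTRY_CONF_DIR='/usr/local/etc/sentry'

# cron_line DAY_OF_MONTH RETENTION_DAYS [PROJECT_ID]
# Prints one crontab entry that runs at 03:20 on the given day of the month.
cron_line() {
    day_of_month=$1
    retention_days=$2
    project_id=${3:-}
    project_flag=''
    if [ -n "$project_id" ]; then
        project_flag=" --project $project_id"
    fi
    printf "20 3 %s * * . %s && env SENTRY_CONF='%s' sentry cleanup --days %s%s && deactivate\n" \
        "$day_of_month" "$VENV_ACTIVATE" "$SENTRY_CONF_DIR" "$retention_days" "$project_flag"
}

cron_line 28 749 5   # per-project cleanup
cron_line 29 749     # fallback for every project, one day later
```

The output is meant to be reviewed and pasted into crontab -e; the script itself does not modify the crontab.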

Now even your Sentry logs are GDPR compliant. The strength of this method is that you can set a different cleanup limit for every project, according to its policies. And you don’t need any proprietary software to do this, just free/libre open source software.

If you are in a hurry to publish privacy policies and you have dedicated hosting, give JournaKit legalazy on GitHub a try.

* Plus it’s written on top of Django.

Cannot connect to wired connection on Ubuntu (SOLVED)

When your wireless interface works but Ethernet doesn’t on Ubuntu, here’s a quick how-to to check and fix a misconfiguration. It doesn’t solve every Ethernet issue, but it’s worth a try: on an Asus laptop (with a JMicron chipset) I worked on, it got the job done.

Tested on Ubuntu 16.04 LTS

First steps

To detect the Ethernet interface:

ifconfig

To check and configure the connection, install ethtool:

apt-get install ethtool

To save the current status of the network interface (ens5f5 here; use the name reported by ifconfig):

ethtool ens5f5 > ethernet_before.txt

Make the Ethernet interface work

ethtool -s ens5f5 speed 1000 duplex full autoneg on

or:

ethtool -s ens5f5 speed 100 duplex full autoneg on

Then, to see the difference between the old non-working configuration and the one that works:

ethtool ens5f5 > ethernet_after.txt
diff ethernet_before.txt ethernet_after.txt
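The full diff can be noisy. As a rough sketch, the helper below prints only the fields that normally change when speed and duplex are forced. It assumes the usual ethtool field labels (Speed, Duplex, Auto-negotiation, Link detected); link_summary is a hypothetical name, not an ethtool feature.

```shell
#!/bin/sh
# link_summary FILE
# Print only the link-related fields from a saved ethtool report.
link_summary() {
    grep -E '^[[:space:]]*(Speed|Duplex|Auto-negotiation|Link detected):' "$1"
}

# Usage, with the files saved in the steps above:
#   link_summary ethernet_before.txt
#   link_summary ethernet_after.txt
```

Comparing the two short summaries side by side usually makes the changed setting obvious.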

If it doesn’t work, try other approaches, e.g. looking for issues specific to your Ethernet driver. To identify your Ethernet controller and look up its driver:

lspci | grep -i ethernet

If the issue reappears after a reboot, make the command run at startup:

sudo bash
crontab -e

And add:

@reboot /sbin/ethtool -s ens5f5 speed 100 duplex full autoneg on

Now reboot to check whether the change takes effect.

Delete git files from public GitHub history

To delete files accidentally pushed to GitHub (or any other public repository) from the git history, follow these steps:

  1. Download BFG from https://rtyley.github.io/bfg-repo-cleaner/ as suggested by GitHub
  2. git clone --mirror GIT_REPOSITORY_URL
  3. cd path/to/cloned/repository
  4. java -jar /path/to/download/dir/bfg-VERSION.jar --delete-files filename.ext
  5. Run the command specified by BFG (usually git reflog expire --expire=now --all && git gc --prune=now --aggressive)
  6. git push

If you get an error on pull, you probably haven’t cloned the repository as described in step 2.

Browsing the public history, any reference to the filename.ext file should have disappeared.
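Before pushing, the same check can be done locally on the cleaned clone. The sketch below uses only plain git log; file_in_history is a hypothetical helper name, and it only inspects the repository on disk, not what the hosting site shows.

```shell
#!/bin/sh
# file_in_history REPO_PATH FILE_PATH
# Succeeds (exit status 0) if any commit reachable from any ref touches FILE_PATH.
file_in_history() {
    [ -n "$(git -C "$1" log --all --format=%H -- "$2")" ]
}

# Usage:
#   if file_in_history path/to/cloned/repository filename.ext; then
#       echo "filename.ext is still referenced in the history"
#   fi
```

Note that the path is matched as given, so a file stored in a subdirectory has to be passed with its full path inside the repository.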

Read more about BFG and the --mirror option on this discussion.

Mass delete old email on Gmail preserving Special and Tagged ones

To mass delete old emails on Gmail type this search query in the search box of mail.google.com (or Gmail for Business):

after:2017/01/01 before:2017/12/31 -has:userlabels -is:starred

You can use these filters in any language, but remember to use the YYYY/MM/DD format (Year/Month/Day) for the dates in the after and before filters.

This search shows all emails between January 1st and December 31st, 2017 that:

  • Have no user label
  • Aren’t starred

Change the dates according to the time period you want to cover, then tick the select-all checkbox in the header of the message list to select all the items shown.

Optionally, you can then select every message matching the search using the dedicated link that appears after the step above.

These two criteria are usually enough to avoid deleting important e-mails, but you can add more exclusion criteria by putting a minus sign before any new filter, e.g. -is:unread. However, if you don’t use stars and labels, you have to double-check the emails in the list before deleting them, to avoid losing useful data.

This approach is very useful in these two scenarios:

  • To free space on the Gmail mailbox when it’s almost full.
  • To delete old emails to comply with regulations like GDPR at the end of their usable life.

Happy housekeeping!

How to shrink a scanned PDF on Linux

When you want to reduce the file size of a PDF document, this quick command using ImageMagick’s convert will shrink the original PDF file.

convert -density 150x150 -quality 60 -compress jpeg -colorspace Gray original.pdf new.pdf

This command is particularly useful for scanned documents: pages are rendered at 150 dpi, converted to greyscale and compressed as JPEG at quality 60.

Converting an original 300dpi colour PDF to a 150dpi greyscale PDF can reduce the file size by up to 50%. There will be some quality loss, but this way you can reduce the file size enough to send scanned documents of dozens of pages via e-mail without using third-party services.
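To see how much a given document actually shrinks, the two file sizes can be compared after the conversion. This is a small sketch using only wc and shell arithmetic; size_reduction_percent is a hypothetical helper name and the result is rounded down to a whole percent.

```shell
#!/bin/sh
# size_reduction_percent ORIGINAL_FILE NEW_FILE
# Prints how much smaller NEW_FILE is than ORIGINAL_FILE, as an integer percentage.
size_reduction_percent() {
    original_bytes=$(wc -c < "$1")
    new_bytes=$(wc -c < "$2")
    if [ "$original_bytes" -eq 0 ]; then
        echo "original file is empty" >&2
        return 1
    fi
    echo $(( (original_bytes - new_bytes) * 100 / original_bytes ))
}

# Usage, with the file names from the command above:
#   size_reduction_percent original.pdf new.pdf
```

If the result is too small for your needs, lowering the -density or -quality values is the obvious next thing to try, at the cost of more quality loss.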

How to import .ovpn files on Ubuntu Linux network manager

On many Linux distributions OpenVPN is already installed, so you usually don’t need to install it yourself. However, configuration, especially via the network manager, can be tricky.

Install this additional package on your distro to display a new OpenVPN option in the network manager:

sudo apt-get install network-manager-openvpn-gnome

If you’re migrating from Windows and already have a Windows installation of OpenVPN, you can copy the .key, .crt, .conf and .ovpn files from the OpenVPN location. Copy these files to your Linux home (e.g. ~/openvpn/) and change their permissions so that only the owner can access them.
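Restricting access to the owner can be done with chmod. The sketch below assumes the files were copied to a single directory such as ~/openvpn/; restrict_to_owner is a hypothetical helper name.

```shell
#!/bin/sh
# restrict_to_owner DIRECTORY
# Make the directory and every regular file inside it accessible to the owner only.
restrict_to_owner() {
    chmod 700 "$1"                            # owner: read, write, enter
    find "$1" -type f -exec chmod 600 {} +    # owner: read, write
}

# Usage:
#   restrict_to_owner ~/openvpn
```

Subdirectories, if any, keep their existing permissions; the top-level 700 already prevents other users from reaching them.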

Once you have the .ovpn, .crt and .key files locally, you can test the connection with these commands:

cd ~/openvpn/
sudo openvpn my-openvpn-file.ovpn

Type the sudo password and wait: the connection should be established. Press Ctrl+C to stop the VPN from the command line.

Now you can configure the Network Manager to accept the .ovpn file.

Click the network icon in the top right corner of the screen, click the current connection, select the settings and look for the VPN section in the window that opens.

Click the + icon next to the VPN title and select Import from file…

Select the my-openvpn-file.ovpn you tested before. A form containing the user certificate, CA, private key and gateway will be filled in automatically. Enter the password in the last field if needed.

It’s very important to select the .ovpn file and not the .conf one, since the latter will not work.

If the private key is password protected you can also type its password; under Advanced you can do some fine tuning, but it’s usually unnecessary.

On the Details tab, uncheck the automatic connection option if you don’t want to start the VPN at every login, and choose whether you want to allow other users to access the connection.

On the IPv4 and IPv6 tabs you can disable a specific protocol or limit the connection with “Use this connection only for resources on its network”. This last step is particularly important because using a VPN can otherwise limit your network connection.

Press Apply and you should be able to connect by clicking the network icon in the top right corner > VPN > your VPN name.

To list saved connections:

nmcli c

Programmatically connect / disconnect to VPN

If you need to write a script that uses this imported connection, you can use the openvpn command, but you have to set all the parameters manually.

To reuse the saved connection instead, you can simply use nmcli to connect:

nmcli con up id my-connection-name

And disconnect:

nmcli con down id my-connection-name
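The two commands can be combined into a small toggle for scripts. This is a sketch under a few assumptions: it relies on nmcli's terse output (-t -f NAME) to list active connections, the helper names are hypothetical, and connection names containing a colon or backslash may not match because terse mode escapes them.

```shell
#!/bin/sh
# connection_is_listed NAME
# Succeeds if NAME appears on standard input as a whole line.
connection_is_listed() {
    grep -Fxq -- "$1"
}

# vpn_toggle NAME
# Bring the saved connection down if it is active, up otherwise.
vpn_toggle() {
    if nmcli -t -f NAME connection show --active | connection_is_listed "$1"; then
        nmcli con down id "$1"
    else
        nmcli con up id "$1"
    fi
}

# Usage:
#   vpn_toggle my-connection-name
```

Keeping the name matching in its own function makes it easy to try out with sample input before touching a real connection.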

 

Tested on Ubuntu 17 and 18.