Create nice unicode PDF using Python

Today I started one of the less motivating activities in Python 2.x: encoding.

In Python 3 Unicode is everywhere, but one of my servers still runs version 2.6, so I have to endure it.

Objective: get data from a UTF-8 encoded json and print a nice PDF.

Tools: json, urllib2, fpdf, cgi

What you need:
pyfpdf: https://code.google.com/p/pyfpdf/downloads/list

  • Download fpdf-1.7.hg.zip or more recent
  • Unzip, enter the directory and run python setup.py install
  • Run locate fpdf to find where the package was installed
  • cd /usr/lib/python2.6/site-packages/fpdf (or the directory name you got with locate)
  • Download unicode fonts for fpdf
  • Unzip and copy the fonts folder into the fpdf directory

Now you have a working FPDF with Unicode support and Unicode fonts, so you can start writing your script. I assume you are using Python 2.6; if not, change python2.6 in the first line of the script to your Python version (e.g. python2.7) or drop the version number (just python). As of now, FPDF works with Python 2.5 to 2.7.

Here I write a simple cgi-bin script, so you have to put it in the /var/www/cgi-bin directory (CentOS) or in /usr/lib/cgi-bin (Debian).

#!/usr/bin/env python2.6
# -*- coding: utf-8 -*-
from fpdf import FPDF
import json
import urllib2
import os
import cgi
import sys
# set the default string encoding to UTF-8 (Python 2 only)
reload(sys)
sys.setdefaultencoding("utf-8")
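
The setdefaultencoding call above is a blunt Python 2 workaround. As a hedged alternative, text can be decoded explicitly where bytes enter the program and encoded again where they leave. The sketch below uses a made-up JSON payload (an assumption, not the real service's response) and should run unchanged on Python 2.6+ and Python 3:

```python
# -*- coding: utf-8 -*-
import json

# Hypothetical UTF-8 payload standing in for the bytes read from the JSON service.
raw = u'{"title": "Citt\u00e0 e caff\u00e8", "user": {"name": "Zo\u00eb"}}'.encode("utf-8")

# Decode at the input boundary: UTF-8 bytes -> unicode text, then parse.
data = json.loads(raw.decode("utf-8"))
title = data["title"]

# Encode at the output boundary: unicode text -> UTF-8 bytes.
encoded_title = title.encode("utf-8")
```

Inside the program every string is then unicode text, and the byte form only exists at the edges.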

Now read some arguments from the URL's query string. They are used to build a request to an external JSON service.

# e.g. http://example.com/cgi-bin/myscript.py?lang=en&sid=2
# parse the query string of the current request
arguments = cgi.FieldStorage()
sid = arguments.getlist('sid')[0]
lang = arguments.getlist('lang')[0]
# build a request to get a particular element from the external json
dataurl = "http://example.com/external-json-source?lang=%s&sid=%s" % (lang, sid)
# load json from dataurl and convert it into python objects
data = json.load(urllib2.urlopen(dataurl))
# the json has a user attribute: the user attribute has name and surname attributes as strings
user = data['user']
# title is a simple string
title = data['title']
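
The snippet above pastes the raw arguments straight into the URL. A slightly more defensive variant is sketched here, assuming a hypothetical helper named build_dataurl and the same placeholder host as above. It parses the query string with the standard library, falls back to a default language, and URL-encodes the values:

```python
try:  # Python 2
    from urllib import urlencode
    from urlparse import parse_qs
except ImportError:  # Python 3
    from urllib.parse import urlencode, parse_qs

# Placeholder endpoint, same as in the article; replace with the real service.
BASE_URL = "http://example.com/external-json-source"


def build_dataurl(query_string, default_lang="en"):
    """Build the JSON service URL from a raw CGI query string.

    Returns None when the required 'sid' parameter is missing.
    """
    params = parse_qs(query_string)
    sid = params.get("sid", [None])[0]
    lang = params.get("lang", [default_lang])[0]
    if sid is None:
        return None
    # urlencode escapes the values, so odd characters cannot break the URL.
    return BASE_URL + "?" + urlencode([("lang", lang), ("sid", sid)])
```

In a CGI script the query string is available as os.environ.get("QUERY_STRING", ""), so the call would be build_dataurl(os.environ.get("QUERY_STRING", "")).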

At this point the JSON from the external source is loaded; it must be encoded in UTF-8. Next, build the PDF:

# sides of an A4 sheet in mm
lato_lungo = 297  # long side
lato_corto = 210  # short side
pdf = FPDF('L','mm','A4')
# add unicode font
pdf.add_font('DejaVu','','DejaVuSansCondensed.ttf',uni=True)
pdf.add_page()
# select the font before writing any text
pdf.set_font('DejaVu','',12)
pdf.cell(w=lato_lungo,h=9,txt=title,border=0,ln=1,align='L',fill=0)
# paragraphs rendered as MultiCell
# @see https://code.google.com/p/pyfpdf/wiki/MultiCell
# print "key: value" for each attribute of the user dictionary
for val in user.iteritems():
    pdf.multi_cell(w=0,h=5,txt="%s: %s" % val)
# a CGI response needs a header and a blank line before the body
print "Content-Type: application/pdf\n"
# finally print pdf
print pdf.output(dest='S')
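
A CGI script has to send an HTTP header block, followed by an empty line, before the body. The sketch below shows one way to assemble the whole response as bytes. The helper name cgi_pdf_response and the default file name are assumptions for illustration, not part of FPDF:

```python
def cgi_pdf_response(pdf_bytes, filename="report.pdf"):
    """Return the full CGI response (headers + blank line + body) as bytes."""
    headers = (
        "Content-Type: application/pdf\r\n"
        "Content-Disposition: inline; filename=\"%s\"\r\n"
        "Content-Length: %d\r\n"
        "\r\n" % (filename, len(pdf_bytes))
    )
    # Headers are plain ASCII; the PDF body is appended untouched.
    return headers.encode("ascii") + pdf_bytes
```

On Python 2 the result can be sent with sys.stdout.write(cgi_pdf_response(pdf.output(dest='S'))), which also avoids the trailing newline that print appends to the PDF data.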

Now:

  1. Open your browser and visit http://example.com/cgi-bin/myscript.py?lang=en&sid=2
  2. The external source http://example.com/external-json-source?lang=en&sid=2 is fetched and converted into a Python data structure. Both the source and the destination are encoded in UTF-8.
  3. The data from the external source is used to create the PDF.

You can use as many fonts as you have in the fpdf/font directory, just add those using pdf.add_font().

