Rob van der Linde

Open Source Software Developer

LaTeX and Django

Posted on August 11, 2011 by: Rob van der Linde
Filed under: Django, Python
Tags: latex, pdf, python, django

Over the last week I have been learning LaTeX, a markup language for producing professionally typeset documents. I have been using Texmaker on Linux, an excellent cross-platform LaTeX editor. I thought it would be pretty cool to generate PDF documents from a Django view by calling the pdflatex command-line tool, using the Django template engine to add dynamic content to .tex files before rendering them to PDF.

The full code is further down. It first renders the LaTeX template to a string. It then calls tempfile.mkstemp() to create a uniquely named temporary file for the generated PDF. mkstemp returns both an open file descriptor and the filename. When we load the contents of the PDF into a string, we should wrap that descriptor with os.fdopen() rather than call open() on the path again: the descriptor is already open, so reusing it means it gets closed properly instead of leaking.
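To show the mkstemp() and os.fdopen() pairing on its own, here is a minimal sketch. The function name read_back_via_descriptor is made up for illustration, and a plain open() on the path stands in for pdflatex writing the file. It is not part of the view itself.

```python
import os
from tempfile import mkstemp


def read_back_via_descriptor(payload):
    """Create a temp file, let another writer fill it by path, then read
    the result back through the descriptor that mkstemp already opened."""
    fd, path = mkstemp(prefix="latex_", suffix=".pdf")
    try:
        # Stand-in for pdflatex (an assumption for this sketch): some other
        # code writes to the file using only its path.
        with open(path, "wb") as writer:
            writer.write(payload)
        # Wrap the existing descriptor instead of opening the path again;
        # closing the file object also closes the descriptor.
        with os.fdopen(fd, "rb") as reader:
            return reader.read()
    finally:
        os.remove(path)
```

Calling read_back_via_descriptor(b"some bytes") hands back the same bytes and leaves no temporary file behind.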

When we run pdflatex, we pipe the rendered .tex template to stdin, so we don't have to save it to disk. Unfortunately, pdflatex cannot return a generated PDF over stdout the same way and it only saves a PDF to disk, which is why we have to use the temporary file method.
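The stdin piping can be tried without a TeX installation. In this rough sketch, pdflatex_command and pipe_to are hypothetical helper names, and a small Python child process stands in for pdflatex so the example runs anywhere. The pdflatex options are the same two used in the view code.

```python
import sys
from subprocess import Popen, PIPE


def pdflatex_command(folder, jobname):
    """Argument list for pdflatex, using the same two options as the view."""
    return ["pdflatex", "-output-directory", folder, "-jobname", jobname]


def pipe_to(command, text):
    """Send text to the command's stdin; return (exit code, captured stdout)."""
    process = Popen(command, stdin=PIPE, stdout=PIPE)
    # communicate() writes the bytes, closes stdin and waits for the exit.
    stdout, _ = process.communicate(text.encode("utf-8"))
    return process.returncode, stdout


# Stand-in child process (an assumption for this sketch): it reads stdin
# and prints how many characters arrived.
echo_length = [sys.executable, "-c", "import sys; print(len(sys.stdin.read()))"]
code, out = pipe_to(echo_length, r"\documentclass{article}")
```

With pdflatex installed, pipe_to(pdflatex_command(folder, jobname), latex) would be the equivalent of one pass of the loop in the view.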

After the PDF is rendered, we load the PDF file and return it as an attachment in the response. We then delete any temporary files created, including the PDF file.
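The cleanup step can also be looked at in isolation. This is only a sketch: remove_byproducts is a made-up helper name, and the empty placeholder files in a scratch directory stand in for what pdflatex would leave behind.

```python
import os
import tempfile


def remove_byproducts(folder, jobname, extensions):
    """Delete folder/jobname + ext for each extension; return the paths removed."""
    removed = []
    for ext in extensions:
        path = os.path.join(folder, jobname) + ext
        try:
            os.remove(path)
            removed.append(path)
        except OSError:
            # pdflatex only writes some of these files, so a missing one is fine.
            pass
    return removed


# Demonstration with empty placeholder files (hypothetical names).
scratch = tempfile.mkdtemp()
for ext in (".pdf", ".aux", ".log"):
    open(os.path.join(scratch, "latex_demo") + ext, "w").close()

removed = remove_byproducts(scratch, "latex_demo", (".pdf", ".aux", ".log", ".toc"))
os.rmdir(scratch)  # succeeds only because the directory is now empty
```

Ignoring OSError for each file keeps the cleanup going when, for example, no .toc was produced because the document has no table of contents.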

import os
from subprocess import Popen, PIPE
from tempfile import mkstemp
 
from django.http import HttpResponse, Http404
from django.template.loader import render_to_string
from django.template import RequestContext
 
def render_latex(request, template, dictionary, filename):
    # render latex template and vars to a string
    latex = render_to_string(template, dictionary)
 
    # create a unique temporary filename
    fd, path = mkstemp(prefix="latex_", suffix=".pdf")
    folder, fname = os.path.split(path)
    jobname, ext = os.path.splitext(fname)  # jobname is just the filename without .pdf, it's what pdflatex uses
 
    # for the TOC to be built, pdflatex must be run twice: the first run writes the .toc file and the second run reads it
    for i in range(2):
        # start pdflatex, we can send the tex source over stdin, but the output file can only be saved to disk, not piped to stdout unfortunately
        process = Popen(["pdflatex", "-output-directory", folder, "-jobname", jobname], stdin=PIPE, stdout=PIPE)  # piping stdout suppresses output messages
        process.communicate(latex.encode("utf-8"))  # stdin is a byte stream, so encode the rendered template
 
    # read the generated pdf back through the descriptor returned by mkstemp
    try:
        with os.fdopen(fd, "rb") as pdf:
            output = pdf.read()
    except OSError:
        # a failure here is a server-side problem, so answer with a 500 rather than a 404
        return HttpResponse("Error generating PDF file", status=500)
 
    # generate the response with pdf attachment
    response = HttpResponse(content_type="application/pdf")  # content_type works on old and new Django, mimetype was removed in 1.7
    response["Content-Disposition"] = "attachment; filename=" + filename
    response.write(output)
 
    # delete the pdf from the temp directory, along with the other files pdflatex generates (.aux, .log and so on)
    for ext in (".pdf", ".aux", ".log", ".toc", ".lof", ".lot", ".synctex.gz"):
        try:
            os.remove(os.path.join(folder, jobname) + ext)
        except OSError:
            pass
 
    # return the response
    return response
 
#### Actual Usage ####
 
def someview(request):
    return render_latex(request, "reports/test.tex", {"foo": "bar"}, filename="latex_test.pdf")