8
votes

I have a reportlab SimpleDocTemplate that I return as a dynamic PDF. I am generating its content based on some Django model metadata. Here's my template setup:

buff = StringIO()
doc = SimpleDocTemplate(buff, pagesize=letter,
                        rightMargin=72,leftMargin=72,
                        topMargin=72,bottomMargin=18)
Story = []

I can easily add textual metadata from the Entry model into the Story list to be built later:

    ptext = '<font size=20>%s</font>' % entry.title.title()
    paragraph = Paragraph(ptext, custom_styles["Custom"])
    Story.append(paragraph)

And then generate the PDF to be returned in the response by calling build on the SimpleDocTemplate:

doc.build(Story, onFirstPage=entry_page_template, onLaterPages=entry_page_template)

pdf = buff.getvalue()
resp = HttpResponse(content_type='application/x-download')  # the mimetype keyword was removed in Django 1.7
resp['Content-Disposition'] = 'attachment;filename=logbook.pdf'
resp.write(pdf)
return resp

One metadata field on the model is a file attachment. When those file attachments are PDFs, I'd like to merge them into the Story that I am generating, i.e. to add each PDF as a reportlab "flowable".

I'm attempting to do so using pdfrw, but haven't had any luck. Ideally I'd love to just call:

from pdfrw import PdfReader
pdf = PdfReader(entry.document.file.path)
Story.append(pdf)

and append the pdf to the existing Story list to be included in the generation of the final document, as noted above.

Anyone have any ideas? I tried something similar using pagexobj to create the pdf, trying to follow this example:

http://code.google.com/p/pdfrw/source/browse/trunk/examples/rl1/subset.py

from pdfrw.buildxobj import pagexobj
from pdfrw.toreportlab import makerl

pdf = pagexobj(PdfReader(entry.document.file.path))

But didn't have any luck either. Can someone explain to me the best way to merge an existing PDF file into a reportlab flowable? I'm no good with this stuff and have been banging my head on pdf-generation for days now. :) Any direction greatly appreciated!

4
I think you can do this with the paid version of ReportLab. - G Gordon Worley III
Ugh, I don't think the paid version of ReportLab is an option for me, unfortunately. :( Anyone have any alternatives? - kyleturner

4 Answers

3
votes

I just had a similar task in a project. I used reportlab (open source version) to generate pdf files and pyPDF to facilitate the merge. My requirements were slightly different in that I just needed one page from each attachment, but I'm sure this is probably close enough for you to get the general idea.

from datetime import datetime

from django.conf import settings
from pyPdf import PdfFileReader, PdfFileWriter

def create_merged_pdf(user):
    basepath = settings.MEDIA_ROOT + "/"
    # following block calls the function that uses reportlab to generate a pdf
    # (create_cover_sheet is not shown here)
    coversheet_path = basepath + "%s_%s_cover_%s.pdf" % (user.first_name, user.last_name, datetime.now().strftime("%f"))
    create_cover_sheet(coversheet_path, user, user.performancereview_set.all())

    # now use the cover sheet and all of the performance reviews to create a merged pdf
    merged_path = basepath + "%s_%s_merged_%s.pdf" % (user.first_name, user.last_name, datetime.now().strftime("%f"))

    # for merged file result
    output = PdfFileWriter()

    # for each pdf file to add, open in a PdfFileReader object and add page to output
    cover_pdf = PdfFileReader(open(coversheet_path, "rb"))
    output.addPage(cover_pdf.getPage(0))

    # iterate through attached files and merge.  I only needed the first page, YMMV
    for review in user.performancereview_set.all():
        review_pdf = PdfFileReader(open(review.pdf_file.file.name, "rb"))
        output.addPage(review_pdf.getPage(0))  # only first page of attachment

    # write out the merged file
    with open(merged_path, "wb") as output_stream:
        output.write(output_stream)
2
votes

I used the following class to solve my issue. It inserts the PDFs as vector PDF images. It works great because I needed to have a table of contents. The flowable object allowed the built in TOC functionality to work like a charm.

Linked question: "Is there a matplotlib flowable for ReportLab?"

Note: If you have multiple pages in the file, you have to modify the class slightly. The sample class is designed to just read the first page of the PDF.

1
votes

I know the question is a bit old but I'd like to provide a new solution using the latest PyPDF2.

You now have access to the PdfFileMerger, which can do exactly what you want: append PDFs to an existing file. You can even merge them in at different positions and choose a subset or all of the pages!

The official docs are here: https://pythonhosted.org/PyPDF2/PdfFileMerger.html

An example from the code in your question:

import tempfile
import PyPDF2
from django.core.files import File

# Using a temporary file rather than a buffer in memory is probably better
temp_base = tempfile.TemporaryFile()
temp_final = tempfile.TemporaryFile()

# Create document, add what you want to the story, then build
doc = SimpleDocTemplate(temp_base, pagesize=letter, ...)
...
doc.build(...)

# Now, this is the fancy part. Create merger, add extra pages and save
merger = PyPDF2.PdfFileMerger()
merger.append(temp_base)
# Add any extra document, you can choose a subset of pages and add bookmarks
merger.append(entry.document.file, bookmark='Attachment')
merger.write(temp_final)

# Write the final file in the HTTP response
django_file = File(temp_final)
resp = HttpResponse(django_file, content_type='application/pdf')
resp['Content-Disposition'] = 'attachment;filename=logbook.pdf'
if django_file.size is not None:
    resp['Content-Length'] = django_file.size
return resp
0
votes

Use this custom flowable:

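from pdfrw.toreportlab import makerl
from reportlab.platypus import Flowable

class PDF_Flowable(Flowable):
    """Draws one page of an existing PDF (a pdfrw form XObject) on the canvas."""

    def __init__(self, P, page_no, x=0, y=0):
        Flowable.__init__(self)
        self.P = P              # list of pages already converted with pagexobj()
        self.page_no = page_no  # index of the page to draw
        self.x, self.y = x, y   # drawing offset in points (assumed default: no offset)

    def draw(self):
        """Draw the selected PDF page at the (x, y) offset."""
        canv = self.canv
        canv.saveState()  # paired with restoreState() below
        canv.translate(self.x, self.y)
        canv.doForm(makerl(canv, self.P[self.page_no]))
        canv.restoreState()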

Then open the existing PDF, convert each page with pagexobj, and append one flowable per page (each followed by a page break) to elements:

    from pdfrw import PdfReader
    from pdfrw.buildxobj import pagexobj
    from reportlab.platypus import PageBreak

    pages = PdfReader(BASE_DIR + "/out3.pdf").pages
    pages = [pagexobj(x) for x in pages]
    for i in range(len(pages)):
        elements.append(PDF_Flowable(pages, i))
        elements.append(PageBreak())