I have been trying to use mergePage in PyPDF2 to apply the same foreground to multiple pages in multiple documents, using the following loop:
for item in file_list:  # loops through 16 pdf files
    print("Processing " + item)
    if item.endswith(".pdf"):
        output_to_file = "/Users/" + getuser() + "/Target/" + item
        background = PdfFileReader(open(source_files + item, "rb"))
        page_count = background.getNumPages()
        for n in range(page_count):
            x, y, w, h = background.getPage(n).mediaBox  # get size of mediaBox
            if w > h:
                foreground = PdfFileReader(open("b_landscape.pdf", "rb"))
            else:
                foreground = PdfFileReader(open("b_portrait.pdf", "rb"))
            input_file = background.getPage(n)
            input_file.mergePage(foreground.getPage(0))
            output.addPage(input_file)
        with open(output_to_file, "wb") as outputStream:
            output.write(outputStream)
The result is a series of PDF files of increasing size: the first file is about 6 MB, and by the 16th iteration the resulting file is about 70 MB. What seems to be happening is that the foreground image is being carried over into the next iteration. I have tried reinitialising the PageObject (input_file) with
input_file = None
to no avail. If anyone has a suggestion, it would be most appreciated.
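One possible explanation, offered as a guess because the snippet does not show where `output` is created: if `output` is a single writer object built once before the outer loop, every page added for earlier files is still in it when the next file is written, so each output would contain all the pages processed so far. The sketch below reproduces that pattern without PyPDF2. `StubWriter`, the `files` dictionary and both functions are hypothetical stand-ins for illustration only.

```python
# Stand-in for a PDF writer: it only records the pages added to it.
# Hypothetical class for illustration; it is not part of PyPDF2.
class StubWriter:
    def __init__(self):
        self.pages = []

    def add_page(self, page):
        self.pages.append(page)


# Hypothetical inputs: two "files" with 2 and 3 pages respectively.
files = {"a.pdf": ["a1", "a2"], "b.pdf": ["b1", "b2", "b3"]}


def pages_written_shared(files):
    """One writer created before the loop (the suspected pattern)."""
    writer = StubWriter()
    written = {}
    for name, pages in files.items():
        for page in pages:
            writer.add_page(page)
        # Snapshot of what would be written out for this file.
        written[name] = list(writer.pages)
    return written


def pages_written_fresh(files):
    """A new writer created for every file."""
    written = {}
    for name, pages in files.items():
        writer = StubWriter()  # fresh writer, so no pages carry over
        for page in pages:
            writer.add_page(page)
        written[name] = list(writer.pages)
    return written


shared = pages_written_shared(files)
fresh = pages_written_fresh(files)
print({name: len(p) for name, p in shared.items()})  # {'a.pdf': 2, 'b.pdf': 5}
print({name: len(p) for name, p in fresh.items()})   # {'a.pdf': 2, 'b.pdf': 3}
```

If that assumption holds for the loop above, the equivalent change would be to create the writer with `output = PdfFileWriter()` just inside `if item.endswith(".pdf"):`. It would also explain why setting `input_file = None` makes no difference: the writer keeps its own references to the pages already added.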