
I am using PDFBox to generate a batch of invoices in a loop. This works in general, but from time to time I get the following exception inside the loop. Re-running the generation once or twice for the failed invoices eventually creates all of them.

java.io.IOException: COSStream has been closed and cannot be read. Perhaps its enclosing PDDocument has been closed?
at org.apache.pdfbox.cos.COSStream.checkClosed(COSStream.java:83)
at org.apache.pdfbox.cos.COSStream.createRawInputStream(COSStream.java:133)
at org.apache.pdfbox.pdfwriter.COSWriter.visitFromStream(COSWriter.java:1202)
at org.apache.pdfbox.cos.COSStream.accept(COSStream.java:400)
at org.apache.pdfbox.pdfwriter.COSWriter.doWriteObject(COSWriter.java:521)
at org.apache.pdfbox.pdfwriter.COSWriter.doWriteObjects(COSWriter.java:459)
at org.apache.pdfbox.pdfwriter.COSWriter.doWriteBody(COSWriter.java:443)
at org.apache.pdfbox.pdfwriter.COSWriter.visitFromDocument(COSWriter.java:1096)
at org.apache.pdfbox.cos.COSDocument.accept(COSDocument.java:417)
at org.apache.pdfbox.pdfwriter.COSWriter.write(COSWriter.java:1369)
at org.apache.pdfbox.pdfwriter.COSWriter.write(COSWriter.java:1256)
at org.apache.pdfbox.pdmodel.PDDocument.save(PDDocument.java:1279)
at org.apache.pdfbox.pdmodel.PDDocument.save(PDDocument.java:1250)
at org.apache.pdfbox.pdmodel.PDDocument.save(PDDocument.java:1238)
at de.xx.xxx.CreateLandscapePDF.createPdf(CreateLandscapePDF.java:37)
at de.xx.xxx.CreateInvoiceAsPDF.createPdf(CreateInvoiceAsPDF.java:172)
...

I have already looked at similar questions, such as "PDFbox saying PDDocument closed when its not", and I suspect it has something to do with objects being freed by the garbage collector, but I do not see the fault in my code.

To create the PDF itself I mostly follow the Apache PDFBox Cookbook at https://pdfbox.apache.org/1.8/cookbook/documentcreation.html. I essentially just add more content: an image, some text blocks, a table and so on.

public class CreateLandscapePDF {

    private ArrayList<ContentBlock> content;
    private PDRectangle pageDIN;
    private PDDocument doc;

    public CreateLandscapePDF(ArrayList<ContentBlock> content, PDRectangle pageDIN) {
        this.content = content;
        this.pageDIN = pageDIN;
    }

    public void createPdf(String pdfFileName) throws IOException
    {
        doc = new PDDocument();

        PDPage page = new PDPage(pageDIN);
        doc.addPage(page);
        PDPageContentStream contentStream = new PDPageContentStream(doc, page, PDPageContentStream.AppendMode.OVERWRITE, false);

        for (ContentBlock contentBlock : content) {
            contentBlock.getContentHelper().writeContentToPDF(contentStream);
            contentStream.moveTo(0, 0);
        }
        contentStream.close();
        doc.save( pdfFileName );
        doc.close();
    }

}

The loop is in the CreateInvoiceAsPDF.createPdf method. Each iteration creates a new CreateLandscapePDF object.

CreateLandscapePDF pdf = new CreateLandscapePDF(contentList, PDRectangle.A4);
pdf.createPdf(TEMP_FILEPATH_NAME + pdfFileName);

The writeContentToPDF method only places the individual content items (text, images, lines) at defined positions, given in pixel units, on the page. As an example, here is the code from my TextContentHelper:

public void writeContentToPDF(PDPageContentStream contentStream) throws IOException {
    float maxTextWidth = 1;
    contentStream.beginText();
    float fontSize = content.getFontSize();
    PDFont font = content.getFont();
    contentStream.setFont(font, fontSize);
    contentStream.setLeading(content.getLineSpace() * fontSize);
    float xPos =0;
    for (Object text : content.getContent()) {
        if (text instanceof String) {
            float textWidth = UnitTranslator.getPixUnitFromTextLength(font, fontSize, (String) text);
            switch (content.getAlignment()) {
            case CENTER:
                xPos = 0.5f*(content.getXEndPosition()+content.getXPosition()-textWidth);
                contentStream.newLineAtOffset(xPos, content.getYPosition());
                break;
            case RIGHT:
                xPos = content.getXEndPosition()-textWidth;
                contentStream.newLineAtOffset(xPos, content.getYPosition());
                break;
            default:
                xPos = content.getXPosition();
                contentStream.newLineAtOffset(xPos, content.getYPosition());
                break;
            }
            contentStream.showText((String) text);
            contentStream.newLine();
            contentStream.newLineAtOffset(-xPos, -content.getYPosition());
            if (textWidth > maxTextWidth) {
                maxTextWidth = textWidth;
            }
        }
    }
    contentStream.endText();
    if (content.isBorder()) {
        createTextBlockBorder(contentStream, maxTextWidth, fontSize);

    }
}

I appreciate any hint to solve this annoying problem!

The exception usually comes if you've closed the COSStream before, e.g. because it was part of another PDDocument. So I wonder what else is done in writeContentToPDF. Please do also make sure you're using the latest PDFBox version (2.0.13) and the latest java. That is 1.8.202 (or 201) or 11.0.2. - Tilman Hausherr
Tilman, thanks for your reply! At the moment I am using pdfbox 2.0.9. I already tried version 2.0.13, but saw no difference. I have no chance of using the latest Java, because my code runs in a Lotus Notes environment, which uses Java 1.6.0 and cannot be updated :-( The writeContentToPDF method doesn't do any magic. It just takes the content (text, images, lines) and places it at a specific pixel unit. I will add an example from my TextContentHelper above. - Roland
Where does content.getFont() come from? Was the font object generated for THAT PDDocument? Or is it global for all PDFs or for a group of PDFs? (Which won't work) - Tilman Hausherr
Tip for debugging: look at the destination file with an editor like NOTEPAD++, you'll see an incomplete stream at the bottom. Post the last few lines, starting from the last line that has "number 0 obj". That will indicate what kind of COSStream is in trouble. - Tilman Hausherr
Also look at the log outputs... if my theory is correct, you'll find some warnings that you have unclosed documents. This could happen if you passed new PDDocument to a font creation method, and that PDDocument is of course unreferenced, so it would be closed automatically at some later time. - Tilman Hausherr

1 Answer


1) The "COSStream has been closed and cannot be read" exception when saving is best analysed by looking at the end of the partially saved file. Open it with an editor such as Notepad++ and you will see an incomplete stream at the bottom. Post the last few lines, starting from the last line that has "number 0 obj". That will indicate what kind of COSStream is in trouble.
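If no suitable editor is at hand, the same inspection can be sketched in a few lines of plain Java. This is only an illustration: the class name LastObjectTail and the method tailFromLastObject are made up here, and the code assumes the partial file is small enough to read into memory. It avoids Java 7+ syntax so that it would also compile on Java 1.6.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LastObjectTail {

    // Matches an object header such as "12 0 obj" at the start of a line.
    private static final Pattern OBJ_HEADER = Pattern.compile("(?m)^\\d+ 0 obj");

    /** Returns the file content from the last "number 0 obj" line to the end, or "" if there is none. */
    public static String tailFromLastObject(File pdf) throws IOException {
        RandomAccessFile raf = new RandomAccessFile(pdf, "r");
        try {
            // Assumption: the file fits into memory (fine for a single invoice).
            byte[] bytes = new byte[(int) raf.length()];
            raf.readFully(bytes);
            // ISO-8859-1 maps every byte to exactly one char, so binary stream data cannot break decoding.
            String text = new String(bytes, "ISO-8859-1");
            Matcher m = OBJ_HEADER.matcher(text);
            int start = -1;
            while (m.find()) {
                start = m.start(); // keep overwriting, so the last header wins
            }
            return start < 0 ? "" : text.substring(start);
        } finally {
            raf.close();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(tailFromLastObject(new File(args[0])));
    }
}
```

Run it with the path of the partially written PDF as the only argument. The dictionary printed right after the object header (for example its /Type and /Subtype entries) is the part worth posting.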

2) Your file showed an image XObject ("/Type /XObject /Subtype /Image").

3) Further research showed that you created your image with

PDImageXObject pdImage = PDImageXObject.createFromByteArray(new PDDocument(), ...);

and you also sporadically got the warning "Warning: You did not close a PDF Document".

This is because your new PDDocument() object is passed to the createFromByteArray method but isn't kept anywhere. PDFBox needs it only to get at the memory management of that PDDocument (the "scratch file"). Later, during garbage collection, this unreferenced PDDocument is finalized and closes all of its related streams, which includes the image stream you created. Since garbage collection timing is unpredictable, the failure only shows up from time to time.

So the solution is to pass the PDDocument of your own document, not some temporary object.
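A minimal sketch of that fix, assuming PDFBox 2.0.x on the classpath. The class name ImageOnOwnDocument, the method createPdfWithImage, the image name "logo" and the drawing position are made up for illustration, since the asker's image helper is not shown in the question. It uses try/finally instead of try-with-resources so it stays within Java 1.6 syntax.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.common.PDRectangle;
import org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject;

public class ImageOnOwnDocument {

    /** Builds a one-page PDF containing the given image and returns the saved bytes. */
    public static byte[] createPdfWithImage(byte[] imageBytes) throws IOException {
        PDDocument doc = new PDDocument();
        try {
            PDPage page = new PDPage(PDRectangle.A4);
            doc.addPage(page);

            // Pass the document that will actually be saved, not "new PDDocument()".
            // The image stream then lives exactly as long as this document.
            PDImageXObject image = PDImageXObject.createFromByteArray(doc, imageBytes, "logo");

            PDPageContentStream contentStream = new PDPageContentStream(doc, page);
            try {
                // Hypothetical position, in PDF user space units.
                contentStream.drawImage(image, 50, 700);
            } finally {
                contentStream.close();
            }

            ByteArrayOutputStream out = new ByteArrayOutputStream();
            doc.save(out);
            return out.toByteArray();
        } finally {
            // Close only after save() has read every stream.
            doc.close();
        }
    }
}
```

In the asker's design this would probably mean handing the PDDocument created in CreateLandscapePDF.createPdf down to whichever content helper builds the image, for example as an extra method parameter.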

4) Note that this also applies to fonts, so don't pass new PDDocument() to a font creation method (not applicable to you, but maybe to future readers).
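The ownership problem described in 3) can also be illustrated without PDFBox. The following is a toy model only, not PDFBox's implementation: Owner and OwnedStream are hypothetical stand-ins for a PDDocument and a COSStream, and the explicit close() call stands in for the finalization that the garbage collector triggers at some unpredictable later time.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class OwnershipModel {

    /** Hypothetical stand-in for a document that owns the storage behind its streams. */
    public static class Owner {
        private final List<OwnedStream> streams = new ArrayList<OwnedStream>();

        public OwnedStream createStream(String data) {
            OwnedStream stream = new OwnedStream(data);
            streams.add(stream);
            return stream;
        }

        /** Closing the owner closes every stream that was created through it. */
        public void close() {
            for (OwnedStream stream : streams) {
                stream.closed = true;
            }
        }
    }

    /** Hypothetical stand-in for a stream whose data lives in the owner's storage. */
    public static class OwnedStream {
        private final String data;
        private boolean closed;

        OwnedStream(String data) {
            this.data = data;
        }

        public String read() throws IOException {
            if (closed) {
                throw new IOException("stream has been closed and cannot be read");
            }
            return data;
        }
    }

    public static void main(String[] args) throws IOException {
        Owner temporary = new Owner();              // plays the role of "new PDDocument()"
        OwnedStream image = temporary.createStream("image bytes");

        System.out.println(image.read());           // fine while the owner is still open

        temporary.close();                          // stands in for finalization by the GC

        try {
            image.read();                           // what saving the real document would attempt
        } catch (IOException e) {
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```

In this model the read succeeds or fails depending only on whether the owner was closed first, which matches the sporadic behaviour in the question: saving works whenever the temporary document has not yet been finalized.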