I use the iText 7 (http://itextpdf.com/) libraries itext7-io-7.0.2.jar, itext7-kernel-7.0.2.jar, itext7-layout-7.0.2.jar, slf4j-api-1.7.25.jar and slf4j-simple-1.7.25.jar in a project that sets a background image on each document page and saves several similar pages in the same PDF file. The image file

final String IMAGE = "/resources/image.jpg";

must be stored as a resource in the JAR file. The ImageData object is created using the method create(java.awt.Image image, java.awt.Color color) of com.itextpdf.io.image.ImageDataFactory:

ImageData imgData = ImageDataFactory.create(new Main().loadImage(IMAGE), null);

The code of the java.awt.image.BufferedImage loadImage(String imageFilename) method is:

private BufferedImage loadImage(String imageFilename) {
    BufferedImage img = null;
    try {
        img = javax.imageio.ImageIO.read(getClass().getResourceAsStream(imageFilename));
    } catch (IOException ex) {
        Logger.getLogger(Main.class.getName()).log(Level.SEVERE, null, ex);
    }
    return img;
}

The ImageData object is used in the loop:

Document document = ...; // get the Document object
PdfDocument pdf = document.getPdfDocument();
PageSize pageSize = pdf.getDefaultPageSize();
PdfPage page = pdf.addNewPage();
for (int i = 0; i < documents.size(); i++) {
    PdfCanvas canvas = new PdfCanvas(page);
    canvas.addImage(imgData, pageSize, false);
    // ... add document body
    if (i < documents.size() - 1) {
        page = pdf.addNewPage();
        document.add(new AreaBreak(AreaBreakType.NEXT_PAGE));
    }
}
document.close();

The problem is that when I run the program from the JAR file, the resulting PDF document is much larger than when I run the program from the IDE using a direct file reference to the image (81 MB vs. 9 MB for a 17-page document):

ImageData imgData = ImageDataFactory.create("src/resources/image.jpg");

If I instead create the ImageData object from the bytes of the image using the method create(byte[] bytes, boolean recoverImage) of com.itextpdf.io.image.ImageDataFactory:

ImageData imgData = ImageDataFactory.create(new Main().loadImageByte(IMAGE), true);

and use the following byte[] loadImageByte(String imageFilename) method:

private byte[] loadImageByte(String imageFilename) {
    byte[] dataBytes = null;
    try {
        InputStream is = getClass().getResourceAsStream(imageFilename);
        dataBytes = new byte[is.available()];
        is.read(dataBytes);
    } catch (IOException ex) {
        Logger.getLogger(Main.class.getName()).log(Level.SEVERE, null, ex);
    }
    return dataBytes;
}

the resulting PDF document is small both when the program runs from the IDE and when it runs from the JAR file. However, in the latter case the document does not open: Adobe Acrobat 9 reports the error "Insufficient data for an image" (and the bytes of the two documents differ).

What is the reason for the difference in file sizes, and is there a way to get a small PDF document when the program is started from a JAR file?

I'd need to go through everything in detail to be sure, but a possible explanation for the different file sizes is that in the smaller file the image is stored once as an XObject and referenced throughout the document, while in the larger file the image is added in its entirety every time. - Samuel Huylebroeck
Concerning your second attempt going via loadImageByte: you use is.available() to determine the size of the resource file. This is wrong, though: is.available() might well return a smaller value depending on the InputStream class actually used, and that class might differ between runs from the IDE, with resources in the file system, and runs from the JAR, with resources compressed in a JAR archive. - mkl
That having been noted, can you share the image file for reproducing the issue? - mkl
Thank you for the reply. I read Introducing images (developers.itextpdf.com/content/itext-7-building-blocks/…) and hoped that the method create(java.awt.Image image, java.awt.Color color) of com.itextpdf.io.image.ImageDataFactory would use an XObject, but obviously it does not. - Gennady Kolomoets
Concerning your second reply: yes, sometimes the call ImageDataFactory.create(new MainSOQ().loadImageByte(IMAGE), true); throws the exception "Premature EOF while reading JPEG". - Gennady Kolomoets

1 Answer

The problem is solved by using PdfImageXObject to wrap image data:

public static final String IMAGE = "/resources/image.jpg";
public static final String DEST = "result.pdf";

public static void main(String[] args) throws FileNotFoundException {

    int pageNumber = 5;
    PdfWriter writer = new PdfWriter(DEST);
    PdfDocument pdf = new PdfDocument(writer);
    Document document = new Document(pdf, PageSize.A4.rotate());
    ImageData imgData = ImageDataFactory.create(new Main().loadImageByte(IMAGE), true);
    /*Wrapping image data in a PdfImageXObject*/
    PdfImageXObject imgObject = new PdfImageXObject(imgData);
    /*Calculate the page area for the image - the image is resized to fit this area*/
    PageSize pageSize = pdf.getDefaultPageSize();
    Rectangle rectangle = new Rectangle(pageSize.getWidth(), pageSize.getHeight());
    /*Loop*/
    PdfPage page = pdf.addNewPage();
    for (int i = 0; i < pageNumber; i++) {
        PdfCanvas canvas = new PdfCanvas(page);
        /*Add background image as PdfImageXObject*/
        canvas.addXObject(imgObject, rectangle);

        /* ... add PDF building blocks here ... */

        if (i < pageNumber - 1) {
            page = pdf.addNewPage();
            document.add(new AreaBreak(AreaBreakType.NEXT_PAGE));
        }
    }
    document.close();
}

Inspecting the resulting PDF files in PDFXplorer shows that when the same foreground image is added to the Document object multiple times with the add(Image image) method, the images are stored as references to a single PDF XObject. By contrast, adding the background image multiple times to a PdfCanvas object with the addImage(ImageData image, Rectangle rect, boolean asInline) method creates a separate PDF XObject instance each time. Note the difference in the argument types of these methods: an analysis of the iText 7 library source code showed that the Image class has a PdfXObject member that is created every time an Image instance is created, whereas the ImageData class is not associated with a corresponding PdfXObject.
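
One caveat about the loadImageByte helper used above: as mkl points out in the comments, is.available() is not guaranteed to report the full size of a resource, and a single read() call may return fewer bytes than requested, which would explain the "Insufficient data for an image" and "Premature EOF" errors when running from the JAR file. A minimal sketch of a safer reader, using only plain JDK streams; the class name ResourceBytes and the method name readAll are hypothetical and not part of iText:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical helper (not part of iText): reads a stream fully into memory. */
final class ResourceBytes {

    private ResourceBytes() {
    }

    /** Reads until end of stream instead of trusting available(). */
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int n;
        // read() may deliver fewer bytes than the buffer holds,
        // so keep looping until it signals end of stream with -1.
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return out.toByteArray();
    }
}
```

Inside loadImageByte this could replace the available()/read() pair, for example dataBytes = ResourceBytes.readAll(is); with the stream opened in a try-with-resources block. On Java 9 and later, InputStream.readAllBytes() does the same job.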