
I am using PDFBox (v2.0.13) to merge PDF files.
These files are

And the merged file is

Can I remove the blank space so that the content of the 2nd page moves up onto the 1st page?
For the merge code, I use the PDFBox example from GitHub: https://github.com/apache/pdfbox/blob/trunk/examples/src/main/java/org/apache/pdfbox/examples/util/PDFMergerExample.java

The table in the HTML and its parent elements all have margin and padding set to 0. The code looks like this:

<div class="table-wrap">
<table id="arOpenItemDetail_save" border="0" cellspacing="1" cellpadding="1"  class="table-Y" name="detail">
    <THEAD style="display:table-header-group;font-weight:bold" name="detailHeader">
        <th width="20">Order Type</th>
        <th>Order No</th>
        <th>Doc Terms</th>
        <th>Doc Date</th>
        <th>Due Date</th>
        <th>Days PastDue</th>
        <th>Doc Amount</th>
        <th>Reason Code</th></tr>
    <span th:each="detail:${list}">
        <tr class="odd">
            <td align="right" width="20" th:text="${detail.custNo}">1</td>
            <td align="center" width="20" th:text="${detail.custNo}">1</td>
            <td align="right"    th:text="${detail.custNo}">1</td>
            <td align="center" th:text="${detail.custNo}">1</td>
            <td align="right"   th:text="${detail.custNo}">1</td>
            <td align="right"   th:text="${detail.custNo}">1</td>
            <td align="right"   th:text="${detail.custNo}"></td>
            <td align="right"   th:text="${detail.custNo}"></td>

            <td align="right"   th:text="${detail.custNo}"></td>
            <td align="right"   th:text="${detail.custNo}"></td>
            <td align="right"   th:text="${detail.custNo}"></td>
            <td align="left"   th:text="${detail.custNo}"></td>
            <td align="left"   th:text="${detail.custNo}"></td>
            <td align="left" th:text="${detail.custNo}"></td>
Merge methods for PDFs usually work on a page basis only, i.e. they take the pages from the documents to merge and create a new document containing all those pages. A denser merge (putting the contents of multiple pages on a single result page) is often not feasible because of headers, footers, background graphics and other artifacts which would have to be recognized and ignored in this context. For pages like yours a dense merge is feasible; it is merely not provided as a single utility method yet. (mkl)
I appreciate your answer. If I have to do a dense merge, how can I achieve it? Actually, I just want to generate a PDF from a big HTML document (its styling is simple), but the renderer.createPDF(outputStream) method is too slow and blocks. So I switched to this approach, which generates many PDF files and merges them at the end. (Marvin)

This question is essentially about densely merging multiple PDF pages from one or more PDFs.

One can implement such a utility class like this:

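import java.awt.geom.Rectangle2D;
import java.io.IOException;
import java.io.OutputStream;

import org.apache.pdfbox.multipdf.LayerUtility;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.common.PDRectangle;
import org.apache.pdfbox.pdmodel.graphics.form.PDFormXObject;
import org.apache.pdfbox.util.Matrix;

public class PdfDenseMergeTool {
    public PdfDenseMergeTool(PDRectangle size, float top, float bottom, float gap) {
        this.pageSize = size;
        this.topMargin = top;
        this.bottomMargin = bottom;
        this.gap = gap;
    }

    // Merges all pages of all inputs densely and writes the result to outputStream.
    public void merge(OutputStream outputStream, Iterable<PDDocument> inputs) throws IOException {
        try {
            openDocument();
            for (PDDocument input: inputs) {
                merge(input);
            }
            // The last content stream has to be closed before the document is saved.
            if (currentContents != null) {
                currentContents.close();
                currentContents = null;
            }
            document.save(outputStream);
        } finally {
            closeDocument();
        }
    }

    void openDocument() throws IOException {
        document = new PDDocument();
        newPage();
    }

    void closeDocument() throws IOException {
        try {
            if (currentContents != null) {
                currentContents.close();
                currentContents = null;
            }
            document.close();
        } finally {
            this.document = null;
            this.yPosition = 0;
        }
    }

    // Finishes the current page (if any) and starts a fresh one.
    void newPage() throws IOException {
        if (currentContents != null) {
            currentContents.close();
            currentContents = null;
        }
        currentPage = new PDPage(pageSize);
        document.addPage(currentPage);
        // Adding the gap here lets the first block start exactly at the top margin.
        yPosition = pageSize.getUpperRightY() - topMargin + gap;
        currentContents = new PDPageContentStream(document, currentPage);
    }

    void merge(PDDocument input) throws IOException {
        for (PDPage page : input.getPages()) {
            merge(input, page);
        }
    }

    void merge(PDDocument sourceDoc, PDPage page) throws IOException {
        PDRectangle pageSizeToImport = page.getCropBox();
        // The finder has to process the page before its bounding box is available.
        BoundingBoxFinder boundingBoxFinder = new BoundingBoxFinder(page);
        boundingBoxFinder.processPage(page);
        Rectangle2D boundingBoxToImport = boundingBoxFinder.getBoundingBox();
        double heightToImport = boundingBoxToImport.getHeight();
        float maxHeight = pageSize.getHeight() - topMargin - bottomMargin;
        if (heightToImport > maxHeight) {
            throw new IllegalArgumentException(String.format("Page %s content too large; height: %s, limit: %s.", page, heightToImport, maxHeight));
        }

        // Start a new page if the content does not fit into the remaining space.
        if (gap + heightToImport > yPosition - (pageSize.getLowerLeftY() + bottomMargin)) {
            newPage();
        }
        yPosition -= heightToImport + gap;

        LayerUtility layerUtility = new LayerUtility(document);
        PDFormXObject form = layerUtility.importPageAsForm(sourceDoc, page);

        // Shift the imported page so that the lower edge of its content lands at yPosition.
        currentContents.saveGraphicsState();
        Matrix matrix = Matrix.getTranslateInstance(0, (float)(yPosition - (boundingBoxToImport.getMinY() - pageSizeToImport.getLowerLeftY())));
        currentContents.transform(matrix);
        currentContents.drawForm(form);
        currentContents.restoreGraphicsState();
    }

    PDDocument document = null;
    PDPage currentPage = null;
    PDPageContentStream currentContents = null;
    float yPosition = 0;

    final PDRectangle pageSize;
    final float topMargin;
    final float bottomMargin;
    final float gap;
}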

(PdfDenseMergeTool utility class)

It uses the BoundingBoxFinder class from this answer to an older question.
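
The core idea of such a bounding box finder can be sketched without PDFBox: every drawn glyph, path or image contributes a rectangle, and the finder keeps the union of all of them. The class name BoundingBoxSketch and the sample coordinates below are assumptions made purely for illustration; the actual BoundingBoxFinder additionally has to hook into PDFBox's content stream processing to obtain those rectangles in the first place.

```java
import java.awt.geom.Rectangle2D;

// Hypothetical stand-in that only shows the accumulation step,
// not how the rectangles are extracted from a PDF page.
public class BoundingBoxSketch {
    private Rectangle2D box = null;

    /** Extends the running bounding box so that it also covers the given rectangle. */
    public void add(Rectangle2D rect) {
        if (box == null) {
            // First rectangle: copy it so later unions do not modify the caller's object.
            box = (Rectangle2D) rect.clone();
        } else {
            // Rectangle2D.add(Rectangle2D) replaces box with the union of both rectangles.
            box.add(rect);
        }
    }

    /** Returns the union of everything added so far, or null if nothing was added. */
    public Rectangle2D getBoundingBox() {
        return box;
    }

    public static void main(String[] args) {
        BoundingBoxSketch finder = new BoundingBoxSketch();
        // Two made-up text lines on a page, given as x, y, width, height in PDF points.
        finder.add(new Rectangle2D.Double(72, 700, 200, 12));
        finder.add(new Rectangle2D.Double(72, 650, 350, 12));
        Rectangle2D result = finder.getBoundingBox();
        System.out.println("minY = " + result.getMinY() + ", height = " + result.getHeight());
    }
}
```

For the two sample lines the union starts at y = 650 and is 62 points high, which is the kind of value the merge tool reads via getMinY() and getHeight().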

You can use the PdfDenseMergeTool like this:

PDDocument document1 = ...;
PDDocument document2 = ...;
PDDocument document3 = ...;
PDDocument document4 = ...;
PDDocument document5 = ...;

PdfDenseMergeTool tool = new PdfDenseMergeTool(PDRectangle.A4, 30, 30, 10);
tool.merge(new FileOutputStream("Merge with Text.pdf"),
        Arrays.asList(document1, document2, document3, document4, document5,
                document1, document2, document3, document4, document5,
                document1, document2, document3, document4, document5));

This merges the five source documents three times in a row.
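
To see why several source pages end up on one result page, the placement arithmetic of the tool can be replayed without PDFBox. The following is a minimal sketch under some assumptions: the class DenseLayoutSketch and its methods are hypothetical, the page's lower edge is taken to be y = 0, the page height is rounded to 842 points (roughly A4), and the block heights are invented.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that mirrors only the y-position bookkeeping of the merge tool.
public class DenseLayoutSketch {
    /** Where one block ends up: zero-based page index and y coordinate of its lower edge. */
    public static final class Placement {
        public final int page;
        public final float y;

        Placement(int page, float y) {
            this.page = page;
            this.y = y;
        }
    }

    /** Places blocks of the given heights top-down, starting a new page when one does not fit. */
    public static List<Placement> layout(float pageHeight, float top, float bottom, float gap, float[] heights) {
        List<Placement> result = new ArrayList<>();
        float maxHeight = pageHeight - top - bottom;
        int page = 0;
        float y = pageHeight - top + gap; // same start value as on a fresh page in the tool
        for (float h : heights) {
            if (h > maxHeight) {
                throw new IllegalArgumentException("Block too tall: " + h);
            }
            if (gap + h > y - bottom) { // not enough room left above the bottom margin
                page++;
                y = pageHeight - top + gap;
            }
            y -= h + gap;
            result.add(new Placement(page, y));
        }
        return result;
    }

    /** Vertical shift that moves content whose lower edge is at minY down (or up) to yPosition. */
    public static float translateY(float yPosition, double minY, float lowerLeftY) {
        return (float) (yPosition - (minY - lowerLeftY));
    }

    public static void main(String[] args) {
        float[] heights = {300f, 300f, 300f}; // invented content heights in points
        for (Placement p : layout(842f, 30f, 30f, 10f, heights)) {
            System.out.println("page " + p.page + ", y = " + p.y);
        }
        System.out.println("shift = " + translateY(512f, 650.0, 0f));
    }
}
```

With these numbers the first two blocks land on the first page with their lower edges at y = 512 and y = 202, and the third block moves to a second page at y = 512. Content whose bounding box starts at y = 650 on its source page is shifted by -138 points to reach y = 512.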

In case of my test documents (each source document containing three lines of text) I get this result:

Page 1:

result page 1

Page 2:

result page 2

This utility class essentially is a port of the PdfDenseMergeTool for iText in this answer.

It has been tested with the current PDFBox 3.0.0 development branch SNAPSHOT.