Cropping the page
In a comment the OP reduced his problem to
"Ok. Given a java PDRectangle rect = new PDRectangle(40f, 680f, 510f, 100f) obtained from TextLocation how would a java code snippet, that sets the cropBox of a single page look like? Or how would you do it? TextLocation based rect --> some transformation --> setCropBox(theRightBox)."
To set the crop box of page twelve of the given document to the given PDRectangle, you can use code like this:
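// resource is an InputStream of the source PDF
PDDocument pdDocument = PDDocument.load(resource);
// page indices are zero-based, so page twelve has index 11
PDPage page = pdDocument.getPage(12-1);
page.setCropBox(new PDRectangle(40f, 680f, 510f, 100f));
pdDocument.save(new File(RESULT_FOLDER, "ENG-US_NMATSCJ-1.103-0330-page12cropped.pdf"));
pdDocument.close();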
(SetCropBox.java test method testSetCropBoxENG_US_NMATSCJ_1_103_0330)
Adobe Reader now shows merely this part of page twelve:
Beware, though: the page in question does not only specify a media box (mandatory) and a crop box, it also defines a bleed box and an art box. Thus, applications which consider those boxes more interesting than the crop box might display the page differently. In particular, the art box (being defined as "the extent of the page’s meaningful content") might be considered important by some applications.
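As for the "some transformation" step from the comment: the rectangle above worked as given, so it apparently already was in PDF coordinates, where the first two PDRectangle arguments are the lower left corner and y grows upwards. If a rectangle is instead measured from the top left corner with y growing downwards, the y value would have to be flipped. The following is only a minimal sketch of that arithmetic in plain Java. The helper name toPdfBox and the page height of 792 are hypothetical, and it assumes an unrotated page whose media box starts at (0, 0).

```java
import java.util.Arrays;

public class CropBoxFromTopLeft {
    /**
     * Converts a rectangle given relative to the top left page corner
     * (y growing downwards) into {x, y, width, height} relative to the
     * bottom left corner (y growing upwards), i.e. the argument order of
     * new PDRectangle(x, y, width, height).
     * Assumption: the page is not rotated and its media box starts at (0, 0).
     */
    public static float[] toPdfBox(float left, float top, float width, float height, float pageHeight) {
        // The lower edge of the rectangle lies top + height below the top page edge.
        float lowerLeftY = pageHeight - top - height;
        return new float[] { left, lowerLeftY, width, height };
    }

    public static void main(String[] args) {
        // Hypothetical input: a box starting 12 units below the top of a 792 unit high page.
        float[] box = toPdfBox(40f, 12f, 510f, 100f, 792f);
        System.out.println(Arrays.toString(box)); // prints [40.0, 680.0, 510.0, 100.0]
    }
}
```

The four resulting values could then be passed to the PDRectangle constructor used above. Whether such a flip is needed at all depends on where the original coordinates come from.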
Rendering the cropped page
In a comment to this answer the OP remarked
This is good and works. It correctly saves the page in the PDF file. I've tried to do the same in JPG and failed.
I reduced the OP's code to the essentials:
PDDocument pdDocument = PDDocument.load(resource);
PDPage page = pdDocument.getPage(12-1);
page.setCropBox(new PDRectangle(40f, 680f, 510f, 100f));
PDFRenderer renderer = new PDFRenderer(pdDocument);
BufferedImage img = renderer.renderImage(12 - 1, 4f);
ImageIOUtil.writeImage(img, new File(RESULT_FOLDER, "ENG-US_NMATSCJ-1.103-0330-page12cropped.jpg").getAbsolutePath(), 300);
pdDocument.close();
(SetCropBox.java test method testSetCropBoxImgENG_US_NMATSCJ_1_103_0330)
The result:
Thus, I cannot reproduce an issue here.
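One quick sanity check for the rendered file is its pixel size. In renderImage(pageIndex, scale) a scale of 1 corresponds to 72 dpi, so a scale of 4f should render the 510 x 100 crop box at roughly 2040 x 400 pixels. The sketch below only does this arithmetic in plain Java; the class and method names are hypothetical, and it assumes an unrotated page and the default user unit of 1/72 inch.

```java
public class RenderedSize {
    /** Expected pixel extent of a length in PDF units (1/72 inch) at the given render scale. */
    public static int pixels(float lengthInPoints, float scale) {
        // Round down, but never report less than one pixel.
        return (int) Math.max(Math.floor(lengthInPoints * scale), 1);
    }

    /** Effective resolution of a render scale, given that scale 1 means 72 dpi. */
    public static float dpi(float scale) {
        return scale * 72f;
    }

    public static void main(String[] args) {
        // Crop box size from the code above: 510 x 100 units, rendered at scale 4.
        System.out.println(pixels(510f, 4f) + " x " + pixels(100f, 4f)
                + " pixels at " + dpi(4f) + " dpi");
        // prints 2040 x 400 pixels at 288.0 dpi
    }
}
```

If the JPG has the dimensions of the whole page instead, the crop box was probably not applied to the page that got rendered. Note also that, as far as I can tell, the 300 passed to ImageIOUtil.writeImage only ends up as resolution metadata in the file and does not resample the image, so it does not exactly match the 288 dpi implied by a scale of 4f.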
Possible details to check for:
- ImageIOUtil is not part of the main PDFBox artifact; instead it is located in pdfbox-tools. Does the version of that artifact match the version of the core pdfbox artifact?
- I run the code in an Oracle Java 8 environment; other Java environments might give rise to different results.
- There are minor differences in our implementations. E.g. I load the PDF via an InputStream, you directly from the file system; I have hardcoded the page number, you have it in some variable, ... None of these differences should cause your problem, but who knows...
getTextMatrix().getTranslateX() and getTextMatrix().getTranslateY() – mkl