Removing PDF invisible objects with iTextSharp

Question

Is it possible to use iTextSharp to remove objects that are not visible (or at least not being displayed) from a PDF document?

More details:

1) My source is a PDF page containing images and text (and maybe some vector drawings) with embedded fonts.

2) There's an interface to design multiple 'crop boxes'.

3) I must generate a new PDF that contains only what is inside the crop boxes. Anything else must be removed from the resulting document. (I could accept content that is half inside and half outside a crop box, but this is not ideal, and it should not appear anyway.)

My solution so far:

I have successfully developed a solution that creates new temporary documents, each one containing the content of one crop box (using writer.GetImportedPage and contentByte.AddTemplate on a page that is exactly the size of the crop box). Then I create the final document and repeat the process, using the AddTemplate method to position each "cropped page" on the final page.

This solution has 2 big disadvantages:

1) The size of the document is [original size] * [number of crop boxes], since the entire page is there, stamped many times (invisible, but still there).
2) The invisible text can still be accessed by selecting all (Ctrl+A) in Reader and then copying and pasting it.
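
The import-and-stamp step described above can be sketched roughly as follows. This is a minimal sketch, assuming iTextSharp 5.x; the class name CropSketch, the method WriteCrops, and its parameters are hypothetical and not from my actual code. Unlike my temporary-document approach, it imports the source page once into a single writer and writes one output page per crop box. It also assumes the source page is unrotated with its lower-left corner at (0, 0). Note that AddTemplate only shifts the imported page so the crop box lands on the output page; everything outside the box is still stored in the file, which is why the hidden text remains selectable.

```csharp
using System;
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;

public static class CropSketch
{
    // Hypothetical helper: writes one output page per crop box,
    // each page sized exactly to its box.
    public static void WriteCrops(string sourcePath, string outputPath,
                                  int pageNumber, Rectangle[] cropBoxes)
    {
        if (cropBoxes == null || cropBoxes.Length == 0)
            throw new ArgumentException("At least one crop box is required.");

        PdfReader reader = new PdfReader(sourcePath);
        try
        {
            // The first page takes its size from the Document constructor.
            Document document = new Document(
                new Rectangle(cropBoxes[0].Width, cropBoxes[0].Height));
            PdfWriter writer = PdfWriter.GetInstance(
                document, new FileStream(outputPath, FileMode.Create));
            document.Open();

            PdfContentByte canvas = writer.DirectContent;
            // Import the source page once and reuse it for every crop box.
            PdfImportedPage page = writer.GetImportedPage(reader, pageNumber);

            for (int i = 0; i < cropBoxes.Length; i++)
            {
                Rectangle box = cropBoxes[i];
                if (i > 0)
                {
                    // Later pages: set the size first, then start the page.
                    document.SetPageSize(new Rectangle(box.Width, box.Height));
                    document.NewPage();
                }
                // Shift the imported page so the box's lower-left corner
                // sits at the origin of the output page.
                canvas.AddTemplate(page, -box.Left, -box.Bottom);
            }
            document.Close(); // also closes the writer and the output stream
        }
        finally
        {
            reader.Close();
        }
    }
}
```

A call such as CropSketch.WriteCrops("source.pdf", "cropped.pdf", 1, boxes) would then produce one page per entry in boxes (the file names here are placeholders).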

So, I think I need to iterate through the PDF objects, detect whether each one is visible, and delete those that are not. At the time of writing, I am trying to use pdfReader.GetPdfObject.

Thanks for the help.

As iText provides a low level API which allows you to manipulate nearly everything in a document, it is possible. That is not to say that it is easy, though, as you will have to write the code yourself to identify for each element in the page content whether or not it is visible, and you will have to glue together the remaining parts of the content yourself, too. You can reduce the resulting document size in your current solution, though, if you reuse an imported page template if multiple sections of it are to be made visible. Interesting work for many weeks... — mkl
Try using the PdfStamper class for cropping: itextpdf.com/examples/iia.php?id=231 — Markus Palme
I'm not 100 percent sure about this as far as iTextSharp is concerned, but iPdfSharp has the ability to render from forms. The idea is that you open the page you are cropping inside a form and then render only the parts you need into a new document. You will not be making multiple copies, and the rendered (cropped) parts will be images. Check whether this is an option in the iText API. — Alex
Due to time restrictions, I decided to use another PDF framework to accomplish what I need. I used AmyUni PDF Creator .NET, a simple yet nice library. It has its own bugs, though, and I'm in contact with them to get these resolved. — Hetote
Have you looked at ABCPdf? If I'm correct, it can do exactly what you want, and the pricing is about the same as the AmyUni licenses. — Peter R

Praveena M · Accepted Answer · 2013-09-18T05:47:30

If the PDF you are working with is a fixed, predefined template, you can remove a form field from it by calling RemoveField.

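// Requires: using System.IO; using iTextSharp.text.pdf;
PdfReader pdfReader = new PdfReader("../Template_Path.pdf");
PdfStamper pdfStamperToPopulate = new PdfStamper(pdfReader, new FileStream(outputPath, FileMode.Create));
AcroFields pdfFormFields = pdfStamperToPopulate.AcroFields;
// Remove the named form field from the document.
pdfFormFields.RemoveField("fieldNameToBeRemoved");
pdfStamperToPopulate.Close(); // writes the modified PDF to outputPath
pdfReader.Close();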
