Issues with iTextsharp and pdf manipulation

Question

I receive a PDF document (no password) generated by third-party software; it contains JavaScript and a few editable fields. If I load this document with the PdfReader class, the NumberOfPages property is always 1 although the document has 17 pages. Oddly enough, the document has 17 pages if I save the stream afterwards. When I then try to open the saved document, Acrobat Reader shows an extended feature warning and the fields are no longer fillable (I haven't flattened the document). Does anyone know about such a problem?

Background info: My job is to remove the JavaScript code, fill out some fields, and save the document afterwards. I am using iTextSharp version 5.5.3.0.

Unfortunately I can't upload a sample file because it contains confidential data.

private byte[] GetDocumentData(string documentName)
{
    var document = String.Format("{0}{1}\\{2}.pdf", _component.OutputDirectory, _component.OutputFileName.Replace(".xml", ".pdf"), documentName);

    if (File.Exists(document))
    {
        PdfReader.unethicalreading = true;

        using (var originalData = new MemoryStream(File.ReadAllBytes(document)))
        {
            using (var updatedData = new MemoryStream())
            {
                var pdfTool = new PdfInserter(originalData, updatedData) {FormFlattening = false};
                pdfTool.RemoveJavascript();
                pdfTool.Save();

                return updatedData.ToArray();
            }
        }
    }

    return null;
}

//Old version that wasn't working
public PdfInserter(Stream pdfInputStream, Stream pdfOutputStream)
{
    _pdfInputStream = pdfInputStream;
    _pdfOutputStream = pdfOutputStream;
    _pdfReader = new PdfReader(_pdfInputStream);
    _pdfStamper = new PdfStamper(_pdfReader, _pdfOutputStream);
}

//Solution
public PdfInserter(Stream pdfInputStream, Stream pdfOutputStream, char pdfVersion = '\0', bool append = true)
{
    _pdfInputStream = pdfInputStream;
    _pdfOutputStream = pdfOutputStream;
    _pdfReader = new PdfReader(_pdfInputStream);
    _pdfStamper = new PdfStamper(_pdfReader, _pdfOutputStream, pdfVersion, append);
}

public void RemoveJavascript()
{
    // Object numbers run from 0 to XrefSize - 1, so the upper bound is exclusive.
    for (int i = 0; i < _pdfReader.XrefSize; i++)
    {
        PdfDictionary dictionary = _pdfReader.GetPdfObject(i) as PdfDictionary;

        if (dictionary != null)
        {
            dictionary.Remove(PdfName.AA);
            dictionary.Remove(PdfName.JS);
            dictionary.Remove(PdfName.JAVASCRIPT);
        }
    }
}
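
A caveat that may apply once the stamper is created in append mode (the "Solution" constructor above): as far as I know, an append-mode PdfStamper writes only objects that were explicitly marked as changed, so dictionaries edited directly through the reader may need a MarkUsed call to end up in the saved file. The following is a minimal sketch under that assumption, for iTextSharp 5.5.x. The class JavascriptRemover and its method signature are hypothetical and not part of the original code.

```csharp
using iTextSharp.text.pdf;

// Hypothetical helper, not part of the OP's PdfInserter class.
public static class JavascriptRemover
{
    // Removes JavaScript-related keys from all dictionaries and returns
    // the number of dictionaries that were changed.
    // Assumption: the stamper was created in append mode before this call.
    public static int RemoveJavascript(PdfReader reader, PdfStamper stamper)
    {
        int changed = 0;

        for (int i = 0; i < reader.XrefSize; i++)
        {
            var dictionary = reader.GetPdfObject(i) as PdfDictionary;
            if (dictionary == null)
                continue;

            bool hasScript = dictionary.Contains(PdfName.AA)
                          || dictionary.Contains(PdfName.JS)
                          || dictionary.Contains(PdfName.JAVASCRIPT);
            if (!hasScript)
                continue;

            dictionary.Remove(PdfName.AA);
            dictionary.Remove(PdfName.JS);
            dictionary.Remove(PdfName.JAVASCRIPT);

            // Flag the object so the incremental update includes it.
            stamper.MarkUsed(dictionary);
            changed++;
        }

        return changed;
    }
}
```

Returning the count makes it easy to log whether anything was actually removed; whether the marking step is needed for a given file is best verified by inspecting the saved output.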

Generally: Please supply some source and sample PDF to allow others to reproduce your issue. — mkl
To make an educated guess: "NumberOfPagesProperty is always 1 although the pdf-document has 17 pages" - quite likely you have a PDF with an XFA form, i.e. the PDF is only a carrier of some XFA data from which Adobe Reader builds your 17 pages. The actual PDF in that case usually only contains one page saying something like "if you see this, your viewer does not support XFA". — mkl
"When I now try to open the document the Acrobat Reader shows an extended feature warning" - this usually indicates that the original PDF has been signed using a usage rights signature to "Reader-enable" it, i.e. to tell Adobe Reader to activate some additional features. If you stamp such a file, use append mode. Otherwise the signature is broken. — mkl
As guessed in my previous comment, you do not use the PdfStamper in append mode. Thus, you break the usage rights signature. Read more here. — mkl

mkl · Accepted Answer · 2014-12-09T10:50:45

The extended feature warning is a hint that the original PDF had been signed using a usage rights signature to "Reader-enable" it, i.e. to tell the Adobe Reader to activate some additional features when opening it, and the OP's operation on it has invalidated the signature.

Indeed, he operated using

_pdfStamper = new PdfStamper(_pdfReader, _pdfOutputStream);

which creates a PdfStamper that completely regenerates the document. To avoid invalidating the signature, one has to use append mode, as in the OP's fixed code (with the defaults char pdfVersion = '\0', bool append = true):

_pdfStamper = new PdfStamper(_pdfReader, _pdfOutputStream, pdfVersion, append);
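
Whether append mode is needed could also be decided per file. The sketch below, assuming iTextSharp 5.5.x, looks for a usage rights entry (/UR or /UR3) in the /Perms dictionary of the document catalog and creates the stamper in append mode only when one is found. The class StamperFactory and its method names are hypothetical, and treating those two keys as the only markers of a Reader-enabled file is an assumption.

```csharp
using System.IO;
using iTextSharp.text.pdf;

// Hypothetical helper; the names are illustrative only.
public static class StamperFactory
{
    // True if the catalog's /Perms dictionary carries a usage rights entry.
    public static bool HasUsageRightsEntry(PdfReader reader)
    {
        PdfDictionary perms = reader.Catalog.GetAsDict(PdfName.PERMS);
        if (perms == null)
            return false;

        return perms.Contains(PdfName.UR) || perms.Contains(PdfName.UR3);
    }

    // Uses append mode for Reader-enabled files so the original bytes,
    // and with them the usage rights signature, stay untouched.
    public static PdfStamper Create(PdfReader reader, Stream output)
    {
        bool append = HasUsageRightsEntry(reader);

        // '\0' keeps the PDF version of the source document.
        return new PdfStamper(reader, output, '\0', append);
    }
}
```

Always passing append = true, as the OP's fixed constructor does by default, is the simpler choice; the check mainly helps when files without usage rights should still be fully rewritten.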

"If I load this pdf-document with the pdfReader class the NumberOfPagesProperty is always 1 although the pdf-document has 17 pages. Oddly enough the document has 17 pages"

Quite likely it is a PDF with an XFA form, i.e. the PDF is only a carrier of some XFA data from which Adobe Reader builds those 17 pages. The actual PDF in that case usually only contains one page saying something like "if you see this, your viewer does not support XFA."

For a final verdict, though, one has to inspect the PDF.
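
One quick way to test the XFA guess from code is to look for an /XFA entry in the AcroForm dictionary. This is a minimal sketch assuming iTextSharp 5.5.x; the class XfaCheck and the method name are hypothetical, and the check only shows that XFA data is present, not how a viewer will render it.

```csharp
using iTextSharp.text.pdf;

// Hypothetical helper; the names are illustrative only.
public static class XfaCheck
{
    // True if the document catalog has an /AcroForm dictionary with an /XFA entry.
    public static bool ContainsXfa(PdfReader reader)
    {
        PdfDictionary acroForm = reader.Catalog.GetAsDict(PdfName.ACROFORM);
        if (acroForm == null)
            return false;

        return acroForm.Contains(PdfName.XFA);
    }
}
```

If this returns true while reader.NumberOfPages is 1, that is consistent with the single placeholder page described above.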
