Digital PDF Document Signing

Question

Update 2:

I have uploaded sample at https://1drv.ms/u/s!Al69FgQ8jwmZbgiBMXLLM4j5sbU?e=vyGF4m

Can you please check. I am stuck at last step. However, please confirm if other appraoch is correct.

Update 1:

I have confirmed the flow. So I am clear on it.

As part of that digital signing PDF document flow , We want to use third party to provide Signed Hash of PDF. Here are steps:

There is 3rd party inhouse system which will generate PDF document from word.
That PDF will be send to another service which will generate Hash value of that PDF
That hash value will be send to external service to sing hash with private key.
external system will send signed hash and public key certiciate using which in house service will add signature in PDF document.

I have following questions.

In point 1 above inhouse service is creating PDF along with signature block . Is it necessary to create signature block? as this is deferred signing?
If so, how can service in point 2 can get original content of PDF document for generating hash.

we have used existing PDF which has signature and using iText 7 to get original content. Is this method correct? FormB.PDF has signature and by removing signaure1 field we are getting original content. Will this process work and advistable?

We also tried to use pdfsigner.getRangeStream() method, but its not that clear in documentation and not yet clear. Please help

package com.abc.sd;

import java.io.IOException;
import java.security.NoSuchAlgorithmException;
import java.util.List;

import com.itextpdf.forms.PdfAcroForm;
import com.itextpdf.kernel.pdf.PdfDocument;
import com.itextpdf.kernel.pdf.PdfReader;
import com.itextpdf.kernel.pdf.PdfWriter;
import com.itextpdf.signatures.SignatureUtil;

public class ItextPdf7 {

    public static void main(String [] args) throws IOException, NoSuchAlgorithmException {
        String filePath ="C:\\\\abc\\\\test\\\\FormB.pdf";
        PdfReader reader = new PdfReader(filePath);
        PdfDocument pdfDoc = new PdfDocument(reader);
        PdfAcroForm form = PdfAcroForm.getAcroForm(pdfDoc, false);
        SignatureUtil signUtil = new SignatureUtil(pdfDoc);
        List<String> names = signUtil.getSignatureNames();
        System.out.println("Signature Name>>>"+names);
      //  System.out.println("Singature Data>>"+signUtil.readSignatureData("Signature1"));


        PdfReader reader1 = new PdfReader(filePath);
        PdfDocument pdfDoc1 = new PdfDocument(reader1, new PdfWriter("C:\\\\\\\\abc\\\\\\\\test\\\\\\\\unsigned_latest_iext7.pdf"));
        PdfAcroForm form1 = PdfAcroForm.getAcroForm(pdfDoc1, true);
        form1.flattenFields();
        pdfDoc1.close();


    }

}

******************************

We are looking to sign PDF document. here are steps as per my understanding.

Consumer will send a digest of PDF document to Central System. The digest of the PDF will exclude the signature section
Central System will send the digest (signed using consumer’s private key/public key ? not sure) to consumer
consumer system will add digest within in the signature section the PDF document (may be along with public key ??)

Can you please help on following.

If my understanding is correct with above flow? Any small reference guide / link will help or any flow diagram.
With .Net and Java what are libraries which can do this work ?Both open source and paid. Will iTextSharp is relevant here?
How validation will happen if customer opens the PDF? If there any specific action required document signing?

Plz help.

"If my understanding is correct with above flow?" - How can we tell? You have to know your use cases. In particular, what kind of solution are you looking for: Shall it be an application running on the client side only? Shall it be a web service with only a web browser on client side? Shall it be some combination of client and server side programs? Furthermore, where are the private keys? On the client computer? On your server? On some external, third-party sign server? — mkl
"With .Net and Java what are libraries which can do this work ?Both open source and paid. Will iTextSharp is relevant here?" - Strictly speaking library recommendations meanwhile are off-topic on stack overflow. the Stack Exchange Software Recommendation site might be appropriate for this question. — mkl
"How validation will happen if customer opens the PDF? If there any specific action required document signing?" - First of all, it depends on the nature of the signing certificates used. Are they self-signed? Are they signed by a certificate authority not generally trusted? Or are they signed by a certificate authority on the AATL or EUTL? Furthermore, which is the legal context that validation shall execute in? And in which PDF processors do you expect your signed PDFs to be opened and validated? — mkl
Hi mkl, We are looking for small utility application running on server side, which exposes web services to which we can supply the PDF, encrypted , public/private key, which it will embed sign in PDF and send the raw data back to service consumer. — AlwaysDeveloper

mkl mkl · Accepted Answer · 2019-11-11T18:08:26

There are very many aspects and sub-questions here, both in the question text and in the comments thereunder. This answer illuminates but some of them after first presenting some backgrounds.

Some backgrounds

An integrated PDF signature implies the presence of a number of structures in a PDF:

A signature AcroForm form field. This form field can have a widget annotation (a visualization which can contain any information you care to put into it) but it does not need to have one.
A value in this signature form field. Unlike other form fields, the value of a signature field is not a mere string but a dictionary of key-value pairs. The contents differ depending on the exact type of signature. In case of the interoperable types, though, there always is a Contents entry whose value is a binary string containing the actual PKCS1/PKCS7/CMS/RFC3161 signature or time stamp which covers the whole file except this binary string.

(The sketch is a bit misleading: the '<' and '>' hex string delimiters are not part of the signed data.)
In case of type adbe.x509.rsa_sha1 the Contents entry contains a PKCS1 signature. The signature value dictionary additionally must contain a Cert entry containing the signing certificate.
In case of type ETSI.RFC3161 the Contents entry contains a RFC 3161 time stamp token.
In case of types ETSI.CAdES.detached, adbe.pkcs7.detached, and adbe.pkcs7.sha1 the Contents entry contains a CMS signature container. As the signature container can hold the signing certificate, there is no need for a Cert entry for the signing certificate.

A CMS signature container can contain a structure of "signed attributes". If it does, one of these attributes must be the hash of the signed PDF bytes (see above, everything but the Contents value) and the actual signature bytes wrapped in the container sign these signed attributes. Whether the variant without signed attributes is allowed and which attributes additionally are required, depends on the exact type of the signature.
In case of ETSI.CAdES.detached, the CMS container must contain signed attributes. Furthermore, one of the signed attributes must be an ESS signing-certificate or signing-certificate-v2 attribute referencing the signer certificate.

LTV information in this case can be added later in an incremental update to the PDF, they need not be present in the signed PDF.
In case of adbe.pkcs7.detached and adbe.pkcs7.sha1 there generically do not need to be signed attributes. Depending on the exact signing policy (prescribed by law or contract), though, signed attributes and in particular the ESS signing certificate signed attribute may be required nonetheless.

These signature types were already defined in ISO 32000-1. If one's signature policy is based on ISO 32000-1 alone, LTV information must be stored in the adbe-revocationInfoArchival attribute which must be a signed attribute.

Is the signing certificate required before signing?

In comments you reference the iText "PDF and Digital Signatures" ebook which appears to say that it suffices to retrieve the signing certificate together with the signature.

In the light of the backgrounds explained above, though, we realize that

In case of adbe.x509.rsa_sha1 signatures, the signing certificate must be in the value of the Cert entry of the signature value dictionary. As this entry is not in the Contents entry, this certificate is part of the signed data. Thus, it must be known before signing.
In case of ETSI.CAdES.detached signatures, the signed attributes must contain an ESS signing-certificate or signing-certificate-v2 attribute. This attribute references the signer certificate. Thus, it must be known before signing.
In case of adbe.pkcs7.detached and adbe.pkcs7.sha1 it depends on the actual signature policy one has to adhere to whether or not an ESS signing-certificate or signing-certificate-v2 attribute is required. Thus, it depends whether or not the signing certificate needs to be known before signing.

In case of a signature policy based on ISO 32000-1 alone, though, LTV information must be stored in a signed attribute if at all, and to retrieve LTV information, one obviously needs to know the certificates for which one attempts to retrieve them, in particular the signer certificate.

To answer the question in this topic's header, therefore: Only in context of a lax signature policy you can get away with not knowing the signer certificate before signing as long as you don't need to add LTV information.

And in case of PAdES signatures?

In a comment you mention you need to use PAdES and LTV. Does that mean you need the signer certificate before signing?

Well, it depends.

If using PAdES means using PAdES baseline profiles or extended PAdES profiles (BES/EPES), you have to create ETSI.CAdES.detached signatures. Thus, you do need the signer certificate before signing.

But if it only requires the PAdES profile for CMS digital signatures in PDF (essentially the ISO 32000-1 compatibility profile), you don't need the signer certificate before signing.

But this profile implies in particular: If present, any revocation information shall be a signed attribute of the PDF Signature. Thus, for "PAdES and LTV" you again do need the signer certificate before signing.

How to create a PDF signature without knowing the signer certificate early

So there are setups in which you shouldn't need the signer certificate before calculating the actual signature. Usually, though, security APIs require the certificate early nonetheless.

Using Bouncy Castle low-level APIs you can do that as follows. (I assume you are using SHA256withRSA.)

First prepare the PDF and determine the hash value

byte[] Hash = null;

using (PdfReader reader = new PdfReader("original.pdf"))
using (FileStream fout = new FileStream("prepared.pdf", FileMode.Create))
{
    StampingProperties sp = new StampingProperties();
    sp.UseAppendMode();

    PdfSigner pdfSigner = new PdfSigner(reader, fout, sp);
    pdfSigner.SetFieldName("Signature");

    PdfSignatureAppearance appearance = pdfSigner.GetSignatureAppearance();
    appearance.SetPageNumber(1);

    int estimatedSize = 12000;
    ExternalHashingSignatureContainer container = new ExternalHashingSignatureContainer(PdfName.Adobe_PPKLite, PdfName.Adbe_pkcs7_detached);
    pdfSigner.SignExternalContainer(container, estimatedSize);
    Hash = container.Hash;
}

Now the hash of the PDF bytes to sign is in Hash.

The ExternalHashingSignatureContainer class used here is the following helper class:

public class ExternalHashingSignatureContainer : ExternalBlankSignatureContainer
{
    public ExternalHashingSignatureContainer(PdfName filter, PdfName subFilter) : base(filter, subFilter)
    { }

    public override byte[] Sign(Stream data)
    {
        SHA256 sha = new SHA256CryptoServiceProvider();
        Hash = sha.ComputeHash(data);
        return new byte[0];
    }

    public byte[] Hash { get; private set; }
}

For the hash calculated above in the Hash variable you can now request a PKCS#1 signature and the signer certificate. Then you can construct the CMS container as follows:

byte[] signatureBytes = THE_RETRIEVED_SIGNATURE_BYTES;
byte[] certificateBytes = THE_RETRIEVED_CERTIFICATE_BYTES;

X509Certificate x509Certificate = new X509CertificateParser().ReadCertificate(certificateBytes);

SignerIdentifier sid = new SignerIdentifier(new IssuerAndSerialNumber(x509Certificate.IssuerDN, x509Certificate.SerialNumber));
AlgorithmIdentifier digAlgorithm = new AlgorithmIdentifier(NistObjectIdentifiers.IdSha256);
Attributes authenticatedAttributes = null;
AlgorithmIdentifier digEncryptionAlgorithm = new AlgorithmIdentifier(Org.BouncyCastle.Asn1.Pkcs.PkcsObjectIdentifiers.Sha256WithRsaEncryption);
Asn1OctetString encryptedDigest = new DerOctetString(signatureBytes);
Attributes unauthenticatedAttributes = null;
SignerInfo signerInfo = new SignerInfo(sid, digAlgorithm, authenticatedAttributes, digEncryptionAlgorithm, encryptedDigest, unauthenticatedAttributes);

Asn1EncodableVector digestAlgs = new Asn1EncodableVector();
digestAlgs.Add(signerInfo.DigestAlgorithm);
Asn1Set digestAlgorithms = new DerSet(digestAlgs);
ContentInfo contentInfo = new ContentInfo(CmsObjectIdentifiers.Data, null);
Asn1EncodableVector certs = new Asn1EncodableVector();
certs.Add(x509Certificate.CertificateStructure.ToAsn1Object());
Asn1Set certificates = new DerSet(certs);
Asn1EncodableVector signerInfs = new Asn1EncodableVector();
signerInfs.Add(signerInfo);
Asn1Set signerInfos = new DerSet(signerInfs);
SignedData signedData = new SignedData(digestAlgorithms, contentInfo, certificates, null, signerInfos);

contentInfo = new ContentInfo(CmsObjectIdentifiers.SignedData, signedData);

byte[] Signature = contentInfo.GetDerEncoded();

Now the CMS signature container bytes are in Signature.

For the above please use these BouncyCastle usings

using Org.BouncyCastle.Asn1;
using Org.BouncyCastle.Asn1.Cms;
using Org.BouncyCastle.Asn1.Nist;
using Org.BouncyCastle.Asn1.X509;
using Org.BouncyCastle.Crypto;
using Org.BouncyCastle.Crypto.Signers;
using Org.BouncyCastle.Pkcs;
using Org.BouncyCastle.X509;

You can now embed the signature container bytes into the PDF like this:

using (PdfReader reader = new PdfReader("prepared.pdf"))
using (PdfDocument document = new PdfDocument(reader))
using (FileStream fout = new FileStream("signed.pdf", FileMode.Create))
{
    PdfSigner.SignDeferred(document, "Signature", fout, new ExternalPrecalculatedSignatureContainer(Signature));
}

The ExternalPrecalculatedSignatureContainer class used here is the following helper class:

public class ExternalPrecalculatedSignatureContainer : ExternalBlankSignatureContainer
{
    public ExternalPrecalculatedSignatureContainer(byte[] cms) : base(new PdfDictionary())
    {
        Cms = cms;
    }

    public override byte[] Sign(Stream data)
    {
        return Cms;
    }

    public byte[] Cms { get; private set; }
}

As mentioned above, though, this signature container is not a CAdES container. Thus, your PDF signatures won't be true PAdES signature (baseline or extended profiles) but at best ISO 32000-1 compatibility PAdES signatures.

The problem in your test code based on the above

Your Client method createSignedData looks like this:

public byte[] createSignedData(byte[] sh)
{
    string dire = Directory.GetParent(Directory.GetParent(Directory.GetCurrentDirectory()).ToString()).ToString();
    string PROPERTIES = dire + "\\resources\\signkey.properties";
    Properties properties = new Properties();
    properties.Load(new FileStream(PROPERTIES, FileMode.Open, FileAccess.Read));
    String path = properties.GetProperty("PRIVATE");
    char[] pass = properties.GetProperty("PASSWORD").ToCharArray();
    string alias = null;
    Pkcs12Store pk12;
    pk12 = new Pkcs12Store(new FileStream(path, FileMode.Open, FileAccess.Read), pass);
    foreach (var a in pk12.Aliases)
    {
        alias = ((string)a);
        if (pk12.IsKeyEntry(alias))
            break;
    }

    ICipherParameters pk = pk12.GetKey(alias).Key;
    IExternalSignature pks = new PrivateKeySignature(pk, DigestAlgorithms.SHA256);
    byte[] data = pks.Sign(sh);
    return data;

}

Unfortunately PrivateKeySignature.Sign expects the message to sign in the sh parameter and in particular first hashes it. In your use case on the other hand sh already is the hash of the message to sign. Thus, you effectively hash twice where you should hash but once.

You can fix this by replacing

IExternalSignature pks = new PrivateKeySignature(pk, DigestAlgorithms.SHA256);
byte[] data = pks.Sign(sh);

in the code above by

StaticDigest digest = new StaticDigest();
digest.AlgorithmName = "SHA-256";
digest.Digest = sh;
RsaDigestSigner signer = new RsaDigestSigner(digest);
signer.Init(true, pk);
byte[] data = signer.GenerateSignature();

Here StaticDigest is the following helper class:

public class StaticDigest : IDigest
{
    public string AlgorithmName { get; set; }
    public byte[] Digest { get; set; }

    public void BlockUpdate(byte[] input, int inOff, int length)
    { }

    public int DoFinal(byte[] output, int outOff)
    {
        Array.Copy(Digest, 0, output, outOff, Digest.Length);
        return Digest.Length;
    }

    public int GetByteLength()
    {
        return 64;
    }

    public int GetDigestSize()
    {
        return Digest.Length;
    }

    public void Reset()
    { }

    public void Update(byte input)
    { }
}

After this change your test project returns mathematically valid signatures.