I want to read the callout text boxes in a PDF. I'm using iTextSharp to iterate through all the annotations as follows:

using System;
using iTextSharp.text.pdf;

namespace PDFAnnotationReader
{
    class Program
    {
        static void Main(string[] args)
        {
            string fileName = @"C:\Users\J123\Desktop\xyz.pdf";
            PdfReader pdfReader = new PdfReader(fileName);
            PdfDictionary pageDict = pdfReader.GetPageN(1);
            // /Annots is optional, so GetAsArray returns null for a page without annotations
            PdfArray annotArray = pageDict.GetAsArray(PdfName.ANNOTS);
            for (int i = 0; annotArray != null && i < annotArray.Size; i++)
            {
                // Each entry is one annotation dictionary
                PdfDictionary curAnnot = annotArray.GetAsDict(i);
            }
            pdfReader.Close();
        }
    }
}

Examining the hashMap of curAnnot, I see that when I get to an annotation that is a callout text box, the dictionary includes the following key-value pairs:

{[/IT,/FreeTextCallout]}
{[/Contents,xyz this is a callout]}

So I think I should check whether each annotation includes the key /IT with the value /FreeTextCallout and, if so, get the value of /Contents as a string, like so:

if (curAnnot.Contains(PdfName.IT))
{
    if (curAnnot.Get(PdfName.IT)==PdfName.FREETEXTCALLOUT)
    {
        Console.WriteLine(curAnnot.Get(PdfName.CONTENTS).ToString());
    }
}

But there doesn't seem to be a PdfName.IT or PdfName.FREETEXTCALLOUT. How do I check for the existence of /IT and retrieve its value?

1 Answer

You can create your own PdfName objects using the constructor on PdfName:

new PdfName("IT");

So:

var myPdfNameIT = new PdfName("IT");
if (curAnnot.Contains(myPdfNameIT)) {
    //...
}
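
To also retrieve the value, the same approach can be extended along these lines. This is only a sketch, assuming iTextSharp 5.x: the class name CalloutReader and the method GetCalloutText are made up for illustration, and it compares names with Equals rather than ==, since == on two separately constructed PdfName objects would compare references.

```csharp
using iTextSharp.text.pdf;

// Hypothetical helper, not part of iTextSharp.
static class CalloutReader
{
    // Neither name has a predefined PdfName constant, so build them once.
    static readonly PdfName IT = new PdfName("IT");
    static readonly PdfName FreeTextCallout = new PdfName("FreeTextCallout");

    // Returns the /Contents text if the annotation is a callout, otherwise null.
    public static string GetCalloutText(PdfDictionary annot)
    {
        if (annot == null) return null;

        // GetAsName returns null when /IT is missing or is not a name object.
        PdfName intent = annot.GetAsName(IT);
        if (intent == null || !intent.Equals(FreeTextCallout)) return null;

        // /Contents is optional as well, so guard against null here too.
        PdfString contents = annot.GetAsString(PdfName.CONTENTS);
        return contents == null ? null : contents.ToUnicodeString();
    }
}
```

Inside the loop from the question, this could be called as CalloutReader.GetCalloutText(curAnnot), printing the result whenever it is not null.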