I'm working on a PDF accessibility assignment, which is to add alternative text in a tagged PDF. I got the sample code for the same at: Add alternative text for an image in tagged pdf (PDF/UA) using iText
Very much excited about that my task is going to end in a very short time, without much R&D.
Created a Java project based on the code, and when I executed it, it worked perfectly for the input PDF used in iText.
Unfortunately, the same source code is not working with PDFs tagged using Acrobat.
Sample Inputs: iText PDF: no_alt_attribute.pdf & My PDF: SARO_Sample_v1.7.pdf
Issue:
// This line works and returns RootElement
PdfDictionary structTreeRoot = catalog.getAsDict(PdfName.STRUCTTREEROOT);
// --> This line always returns NULL,
// Instead of returning the child elements of RootElement
PdfArray kids = structTreeRoot.getAsArray(PdfName.K);
// --> As per the structure Kids are present
Compared the structure of both PDFs and the following are my observations:
- Tagging Structure - exactly same in both PDFs Tagging Structure
- Content Structure - almost same, but a few additions are available in the PDF created by me. Content Structure
- Tag Tree Structure - almost same respective to Tags, but with a major difference: iText's PDF tags are marked with
/T:StructElem
whereas that's not found in MY-PDF Even re-tagging doesn't help. Tag Tree Structure
Verified with various tagged PDFs available with us and all are similar (without /T:StructElem
). These PDFs are validated and have passed accessibility compliance.
Need some thoughts on how to make this source code work with the PDFs we have. Alternatively, I need a way to ADD the missing /T:StructElem
automatically in the PDFs while tagging in Acrobat.
Any help will be much appreciated!
Please do let me know if any further information is needed.
Note: I'm still not sure adding this /T:StructElem
will work, since the PDFs were passed in PAC.
If this is really an issue, then those PDFs wont be passed the validations, right? But this is the only difference I found between those two PDFs.
PS: The Acrobat version I'm using is "Adobe Acrobat (Pro) DC."
-- Thanks,
SaRaVaNaN
/T:StructElem
- PDFDebugger by these markers indicates that the referenced dictionary has a Type member with value StructElem. For structure element dictionaries this is an optional entry. Thus, this is not the problem. - mkl