2
votes

The PDF reference (12.3.3, Table 153) says that an outline item will have either a Dest entry, an A dictionary, or an SE dictionary (optionally alongside a Dest entry). I have a document whose outline items only have the SE dictionary. The reference directs me to the Structure Hierarchy (14.7.2), and this is where I got lost. Structure elements may have Pg entries, but mine don't (and neither do their parents, on a recursive traversal). So I need to work out the page number, the offset within the page, and the zoom (if applicable) from the SE dictionary. Any ideas?

Thanks!


1 Answer

3
votes

The structure tree has a root node that is a bit special: it contains a few entries that the other nodes do not have. One of them, ParentTree, is used to locate the structure elements that correspond to content on pages, annotations, and form XObjects. See 14.7.2 Structure Hierarchy and 14.7.4.4 Finding Structure Elements from Content Items.

The references between pages and structure elements go both ways. You are supposed to have a Pg entry in the structure element or in one of its ancestors (as far as I know, if this is not the case the file can be considered corrupt). In the other direction, the page dictionary has an entry called StructParents that holds the index corresponding to that page in the ParentTree structure.
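To make the page-to-structure direction concrete, here is a minimal sketch in Python. It assumes the PDF objects have already been parsed into plain dicts and lists with indirect references resolved, which is not how any particular library exposes them; the function names `number_tree_lookup` and `struct_elems_for_page` are made up for this illustration. ParentTree is a number tree, so a node either holds a flat Nums array of key/value pairs or a Kids array of child nodes with Limits.

```python
def number_tree_lookup(node, key):
    """Return the value stored under `key` in a PDF number tree, or None."""
    nums = node.get("Nums")
    if nums is not None:
        # Leaf (or a root without kids): flat [key1, value1, key2, value2, ...].
        for i in range(0, len(nums) - 1, 2):
            if nums[i] == key:
                return nums[i + 1]
        return None
    for kid in node.get("Kids", []):
        limits = kid.get("Limits")
        # Limits = [least key, greatest key] of that subtree; skip kids that cannot match.
        if limits is None or limits[0] <= key <= limits[1]:
            found = number_tree_lookup(kid, key)
            if found is not None:
                return found
    return None


def struct_elems_for_page(page, struct_tree_root):
    """Look up the ParentTree entry for a page via its StructParents index."""
    index = page.get("StructParents")
    if index is None:
        return None
    return number_tree_lookup(struct_tree_root["ParentTree"], index)


# Hypothetical, hand-built data standing in for parsed PDF objects.
elem_a = {"S": "H1"}
elem_b = {"S": "P"}
root = {
    "Type": "StructTreeRoot",
    "ParentTree": {
        "Kids": [{"Limits": [0, 1], "Nums": [0, [elem_a], 1, [elem_b, elem_a]]}]
    },
}
page = {"Type": "Page", "StructParents": 1}
elems = struct_elems_for_page(page, root)
```

For a page, the value found this way is an array of structure elements, one per marked-content ID on that page.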

If the structure tree is present but the StructParents entry is missing from the page dictionaries, then the file is most likely corrupt. By "corrupt" I mean that the information it contains is inconsistent or incoherent.

There are two ways of using this information, then. If you have a reference to a structure element, you should be able to find the page it refers to through the Pg entry. If you have a reference to a page, you should be able to find its corresponding structure elements through the ParentTree structure. Both directions are supposed to be present in the file for the information to be consistent.
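For the case in the question, where Pg is missing, the two directions can be combined: try the Pg walk first, then fall back to scanning the pages and checking whether a page's ParentTree entry contains the element or one of its descendants. The sketch below is only an illustration under stated assumptions: objects are plain dicts with references resolved, the ParentTree has already been flattened into a hypothetical `parent_tree_flat` dict mapping each index to its array, and all function names are invented here.

```python
def page_via_pg(elem):
    """Walk up the P (parent) chain until some element carries a Pg entry."""
    seen = set()
    while elem is not None and id(elem) not in seen:
        seen.add(id(elem))  # guard against cyclic parent links in broken files
        if "Pg" in elem:
            return elem["Pg"]
        elem = elem.get("P")
    return None


def is_self_or_descendant(candidate, ancestor):
    """True if `candidate` is `ancestor` or sits below it in the structure tree."""
    seen = set()
    while candidate is not None and id(candidate) not in seen:
        if candidate is ancestor:
            return True
        seen.add(id(candidate))
        candidate = candidate.get("P")
    return False


def page_for_struct_elem(elem, pages, parent_tree_flat):
    """Resolve a structure element to a page: Pg first, then a ParentTree scan."""
    page = page_via_pg(elem)
    if page is not None:
        return page
    for page in pages:
        index = page.get("StructParents")
        # Entries in the array may be null (None), so skip those.
        for candidate in parent_tree_flat.get(index) or []:
            if candidate is not None and is_self_or_descendant(candidate, elem):
                return page
    return None


# Hypothetical data: `section` has no Pg anywhere in its parent chain.
root = {"Type": "StructTreeRoot"}
section = {"S": "Sect", "P": root}
heading = {"S": "H1", "P": section}
page1 = {"Type": "Page", "StructParents": 0}
page2 = {"Type": "Page", "StructParents": 1}
para = {"S": "P", "P": section, "Pg": page1}
flat = {0: [None], 1: [heading]}

resolved = page_for_struct_elem(section, [page1, page2], flat)
```

This only yields a page object (the first one that matches, in page order); nothing in this sketch recovers an offset within the page or a zoom factor.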