55
votes

I've spent all day trying to get hyperlinks metadata from PDFs in my iPad application. The CGPDF* APIs are a true nightmare, and the only piece of information I've found on the net about all this is that I have to look for an "Annots" dictionary, but I just can't find it in my PDFs.

I even used the old Voyeur Xcode sample to inspect my test PDF file, but no trace of this "Annots" dictionary...

You know, this is a feature I see on every PDF reader - this same question has been asked multiple times here with no real practical answers. I usually never ask for sample code directly but apparently this time I really need it... anyone got this working, possibly with sample code?

Update: I just realized that the person who made my test PDF had simply inserted a URL as plain text, not as a real annotation. He tried adding an actual annotation, and my code works now... But that's not what I need, so it seems I'll have to analyze the text and search for URLs. But that's another story...

Update 2: So I finally came up with some working code. I'm posting it here so hopefully it'll help someone. It assumes the PDF document actually contains annotations.

for(int i=0; i<pageCount; i++) {
    CGPDFPageRef page = CGPDFDocumentGetPage(doc, i+1);

    CGPDFDictionaryRef pageDictionary = CGPDFPageGetDictionary(page);

    CGPDFArrayRef outputArray;
    if(!CGPDFDictionaryGetArray(pageDictionary, "Annots", &outputArray)) {
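        return; // note: exits on the first page without an Annots array; use continue to keep scanning later pages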
    }

    int arrayCount = CGPDFArrayGetCount( outputArray );
    if(!arrayCount) {
        continue;
    }

    for( int j = 0; j < arrayCount; ++j ) {
        CGPDFObjectRef aDictObj;
        if(!CGPDFArrayGetObject(outputArray, j, &aDictObj)) {
            return;
        }

        CGPDFDictionaryRef annotDict;
        if(!CGPDFObjectGetValue(aDictObj, kCGPDFObjectTypeDictionary, &annotDict)) {
            return;
        }

        CGPDFDictionaryRef aDict;
        if(!CGPDFDictionaryGetDictionary(annotDict, "A", &aDict)) {
            return;
        }

        CGPDFStringRef uriStringRef;
        if(!CGPDFDictionaryGetString(aDict, "URI", &uriStringRef)) {
            return;
        }

        CGPDFArrayRef rectArray;
        if(!CGPDFDictionaryGetArray(annotDict, "Rect", &rectArray)) {
            return;
        }

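        // A PDF Rect holds four numbers [llx lly urx ury]; cap the count so coords[] cannot overflow.
        int rectCount = MIN((int)CGPDFArrayGetCount( rectArray ), 4);
        CGPDFReal coords[4] = {0, 0, 0, 0};
        for( int k = 0; k < rectCount; ++k ) {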
            CGPDFObjectRef rectObj;
            if(!CGPDFArrayGetObject(rectArray, k, &rectObj)) {
                return;
            }

            CGPDFReal coord;
            if(!CGPDFObjectGetValue(rectObj, kCGPDFObjectTypeReal, &coord)) {
                return;
            }

            coords[k] = coord;
        }               

        char *uriString = (char *)CGPDFStringGetBytePtr(uriStringRef);

        NSString *uri = [NSString stringWithCString:uriString encoding:NSUTF8StringEncoding];
        CGRect rect = CGRectMake(coords[0],coords[1],coords[2],coords[3]);

        CGPDFInteger pageRotate = 0;
        CGPDFDictionaryGetInteger( pageDictionary, "Rotate", &pageRotate ); 
        CGRect pageRect = CGRectIntegral( CGPDFPageGetBoxRect( page, kCGPDFMediaBox ));
        if( pageRotate == 90 || pageRotate == 270 ) {
            CGFloat temp = pageRect.size.width;
            pageRect.size.width = pageRect.size.height;
            pageRect.size.height = temp;
        }

        rect.size.width -= rect.origin.x;
        rect.size.height -= rect.origin.y;

        CGAffineTransform trans = CGAffineTransformIdentity;
        trans = CGAffineTransformTranslate(trans, 0, pageRect.size.height);
        trans = CGAffineTransformScale(trans, 1.0, -1.0);

        rect = CGRectApplyAffineTransform(rect, trans);

        // do whatever you need with the coordinates.
        // e.g. you could create a button and put it on top of your page
        // and use it to open the URL with UIApplication's openURL
    }
}
3
Line 6: should that not be continue instead of return? And why do you return after checking the object, value, dict, string, array, etc.? - Luke Mcneice
That's just example code without any error checking. - pt2ph8
PDF rects don't translate to native rects. See my thread for details; scroll down to 'Other PDF Features', 'Getting Links inside a PDF', 'Understanding the PDF Rect for link positioning': stackoverflow.com/questions/3889634/… - Luke Mcneice
I'm doing rect.size.width -= rect.origin.x; rect.size.height -= rect.origin.y; to fix that, and it's working for me. - pt2ph8
Yeah, that works for the width and height, but the PDF spec states that the array takes the form [llx lly urx ury], specifying the lower-left x, lower-left y, upper-right x, and upper-right y coordinates of the rectangle, in that order. This means that your rect.origin.y is actually rect.origin.y + rect.size.height, as the Adobe rect's origin is the bottom left and not the top left that CGRect defaults to. It may not have been that noticeable, as it would probably only have been 20-30 px out and still registered your press. - Luke Mcneice
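
A minimal sketch of the coordinate point raised in these comments, written in plain C so it compiles without Core Graphics. The `ViewRect` struct and `pdf_rect_to_view` function are hypothetical names used only for illustration, and the sketch assumes an unrotated page whose media box starts at (0, 0). As far as I can tell, the translate-by-page-height plus scale-by-(1, -1) transform in the question's code produces the same flip.

```c
/* Hypothetical stand-in for CGRect so the sketch builds without Core Graphics. */
typedef struct { double x, y, width, height; } ViewRect;

/* Convert a PDF Rect array [llx lly urx ury] (origin bottom-left, y grows up)
   into a top-left-origin rectangle (y grows down) for a page of the given height.
   Assumes an unrotated page with its media box origin at (0, 0). */
static ViewRect pdf_rect_to_view(const double r[4], double pageHeight) {
    ViewRect v;
    v.x = r[0];                 /* left edge is unchanged */
    v.width = r[2] - r[0];      /* urx - llx */
    v.height = r[3] - r[1];     /* ury - lly */
    v.y = pageHeight - r[3];    /* top edge measured from the top of the page */
    return v;
}
```

For example, a link at [100 700 300 720] on a 792-point-high page maps to x = 100, y = 72, width = 200, height = 20.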

3 Answers

15
votes

Here's the basic idea for getting at least as far as the Annots array for each page. After that, you should be able to figure it out with help from Adobe's PDF specification.

1.) get the CGPDFDocumentRef.

2.) get each page.

3.) on each page, use CGPDFDictionaryGetArray(pageDictionary, "Annots", &outputArray) where pageDictionary is the CGPDFDictionary representing the CGPDFPage, and outputArray is the variable (CGPDFArrayRef) to store the Annots array of that page in.

9
votes

Great code, but I'm having a little trouble working it into my project. It gets all the URLs correctly, but when I click on one nothing happens. Here is my code (I had to modify yours slightly to work with my project). Is there something missing?

- (void) renderPageAtIndex:(NSUInteger)index inContext:(CGContextRef)ctx {
//CGPDFPageRef page = CGPDFDocumentGetPage(pdf, index+1);

CGPDFPageRef page = CGPDFDocumentGetPage(pdf, index+1);
CGAffineTransform transform1 = aspectFit(CGPDFPageGetBoxRect(page, kCGPDFMediaBox),
                                         CGContextGetClipBoundingBox(ctx));
CGContextConcatCTM(ctx, transform1);
CGContextDrawPDFPage(ctx, page);

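int pageCount = CGPDFDocumentGetNumberOfPages(pdf);
// PDF page numbers are 1-based, so iterate 1...pageCount (the original loop skipped page 1 and ran one page past the end).
for (int i = 1; i <= pageCount; i++) {
    CGPDFPageRef page = CGPDFDocumentGetPage(pdf, i);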

    CGPDFDictionaryRef pageDictionary = CGPDFPageGetDictionary(page);

    CGPDFArrayRef outputArray;
    if(!CGPDFDictionaryGetArray(pageDictionary, "Annots", &outputArray)) {
        continue; // no Annots array on this page; move on to the next one
    }

    int arrayCount = CGPDFArrayGetCount( outputArray );
    if(!arrayCount) {
        continue;
    }

    for( int j = 0; j < arrayCount; ++j ) {
        CGPDFObjectRef aDictObj;
        if(!CGPDFArrayGetObject(outputArray, j, &aDictObj)) {
            return;
        }

        CGPDFDictionaryRef annotDict;
        if(!CGPDFObjectGetValue(aDictObj, kCGPDFObjectTypeDictionary, &annotDict)) {
            return;
        }

        CGPDFDictionaryRef aDict;
        if(!CGPDFDictionaryGetDictionary(annotDict, "A", &aDict)) {
            return;
        }

        CGPDFStringRef uriStringRef;
        if(!CGPDFDictionaryGetString(aDict, "URI", &uriStringRef)) {
            return;
        }

        CGPDFArrayRef rectArray;
        if(!CGPDFDictionaryGetArray(annotDict, "Rect", &rectArray)) {
            return;
        }

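        // A PDF Rect holds four numbers [llx lly urx ury]; cap the count so coords[] cannot overflow.
        int rectCount = MIN((int)CGPDFArrayGetCount( rectArray ), 4);
        CGPDFReal coords[4] = {0, 0, 0, 0};
        for( int k = 0; k < rectCount; ++k ) {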
            CGPDFObjectRef rectObj;
            if(!CGPDFArrayGetObject(rectArray, k, &rectObj)) {
                return;
            }

            CGPDFReal coord;
            if(!CGPDFObjectGetValue(rectObj, kCGPDFObjectTypeReal, &coord)) {
                return;
            }

            coords[k] = coord;
        }               

        char *uriString = (char *)CGPDFStringGetBytePtr(uriStringRef);

        NSString *uri = [NSString stringWithCString:uriString encoding:NSUTF8StringEncoding];
        CGRect rect = CGRectMake(coords[0],coords[1],coords[2],coords[3]);

        CGPDFInteger pageRotate = 0;
        CGPDFDictionaryGetInteger( pageDictionary, "Rotate", &pageRotate ); 
        CGRect pageRect = CGRectIntegral( CGPDFPageGetBoxRect( page, kCGPDFMediaBox ));
        if( pageRotate == 90 || pageRotate == 270 ) {
            CGFloat temp = pageRect.size.width;
            pageRect.size.width = pageRect.size.height;
            pageRect.size.height = temp;
        }

        rect.size.width -= rect.origin.x;
        rect.size.height -= rect.origin.y;

        CGAffineTransform trans = CGAffineTransformIdentity;
        trans = CGAffineTransformTranslate(trans, 0, pageRect.size.height);
        trans = CGAffineTransformScale(trans, 1.0, -1.0);

        rect = CGRectApplyAffineTransform(rect, trans);

        // do whatever you need with the coordinates.
        // e.g. you could create a button and put it on top of your page
        // and use it to open the URL with UIApplication's openURL
        NSURL *url = [NSURL URLWithString:uri];
        NSLog(@"URL: %@", url);
        CGPDFContextSetURLForRect(ctx, (CFURLRef)url, rect);
       // CFRelease(url);
        }
    }   


}

Thanks & great work BrainFeeder!

UPDATE:

For anybody using the Leaves project in their app, this is how I got the PDF links to work (it's not perfect, as the rect seems to fill the entire screen, but it's a start):

- (void) renderPageAtIndex:(NSUInteger)index inContext:(CGContextRef)ctx {

CGPDFPageRef page = CGPDFDocumentGetPage(pdf, index+1);
CGAffineTransform transform1 = aspectFit(CGPDFPageGetBoxRect(page, kCGPDFMediaBox),
                                         CGContextGetClipBoundingBox(ctx));
CGContextConcatCTM(ctx, transform1);
CGContextDrawPDFPage(ctx, page);


    CGPDFPageRef pageAd = CGPDFDocumentGetPage(pdf, index);

    CGPDFDictionaryRef pageDictionary = CGPDFPageGetDictionary(pageAd);

    CGPDFArrayRef outputArray;
    if(!CGPDFDictionaryGetArray(pageDictionary, "Annots", &outputArray)) {
        return;
    }

    int arrayCount = CGPDFArrayGetCount( outputArray );
    if(!arrayCount) {
        //continue;
    }

    for( int j = 0; j < arrayCount; ++j ) {
        CGPDFObjectRef aDictObj;
        if(!CGPDFArrayGetObject(outputArray, j, &aDictObj)) {
            return;
        }

        CGPDFDictionaryRef annotDict;
        if(!CGPDFObjectGetValue(aDictObj, kCGPDFObjectTypeDictionary, &annotDict)) {
            return;
        }

        CGPDFDictionaryRef aDict;
        if(!CGPDFDictionaryGetDictionary(annotDict, "A", &aDict)) {
            return;
        }

        CGPDFStringRef uriStringRef;
        if(!CGPDFDictionaryGetString(aDict, "URI", &uriStringRef)) {
            return;
        }

        CGPDFArrayRef rectArray;
        if(!CGPDFDictionaryGetArray(annotDict, "Rect", &rectArray)) {
            return;
        }

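        // A PDF Rect holds four numbers [llx lly urx ury]; cap the count so coords[] cannot overflow.
        int rectCount = MIN((int)CGPDFArrayGetCount( rectArray ), 4);
        CGPDFReal coords[4] = {0, 0, 0, 0};
        for( int k = 0; k < rectCount; ++k ) {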
            CGPDFObjectRef rectObj;
            if(!CGPDFArrayGetObject(rectArray, k, &rectObj)) {
                return;
            }

            CGPDFReal coord;
            if(!CGPDFObjectGetValue(rectObj, kCGPDFObjectTypeReal, &coord)) {
                return;
            }

            coords[k] = coord;
        }               

        char *uriString = (char *)CGPDFStringGetBytePtr(uriStringRef);

        NSString *uri = [NSString stringWithCString:uriString encoding:NSUTF8StringEncoding];
        CGRect rect = CGRectMake(coords[0],coords[1],coords[2],coords[3]);

        CGPDFInteger pageRotate = 0;
        CGPDFDictionaryGetInteger( pageDictionary, "Rotate", &pageRotate ); 
        CGRect pageRect = CGRectIntegral( CGPDFPageGetBoxRect( page, kCGPDFMediaBox ));
        if( pageRotate == 90 || pageRotate == 270 ) {
            CGFloat temp = pageRect.size.width;
            pageRect.size.width = pageRect.size.height;
            pageRect.size.height = temp;
        }

        rect.size.width -= rect.origin.x;
        rect.size.height -= rect.origin.y;

        CGAffineTransform trans = CGAffineTransformIdentity;
        trans = CGAffineTransformTranslate(trans, 0, pageRect.size.height);
        trans = CGAffineTransformScale(trans, 1.0, -1.0);

        rect = CGRectApplyAffineTransform(rect, trans);

            // do whatever you need with the coordinates.
            // e.g. you could create a button and put it on top of your page
            // and use it to open the URL with UIApplication's openURL
            NSURL *url = [NSURL URLWithString:uri];
            NSLog(@"URL: %@", url);
//          CGPDFContextSetURLForRect(ctx, (CFURLRef)url, rect);
            UIButton *button = [[UIButton alloc] initWithFrame:rect];
            [button setTitle:@"LINK" forState:UIControlStateNormal];
            [button addTarget:self action:@selector(openLink:) forControlEvents:UIControlEventTouchUpInside];
            [self.view addSubview:button];
           // CFRelease(url);
        }
}

FINAL UPDATE:

Below is the final code I used in my apps.

- (void) renderPageAtIndex:(NSUInteger)index inContext:(CGContextRef)ctx {
//If the view already contains a button control remove it
if ([[self.view subviews] containsObject:button]) {
    [button removeFromSuperview];
}

CGPDFPageRef page = CGPDFDocumentGetPage(pdf, index+1);
CGAffineTransform transform1 = aspectFit(CGPDFPageGetBoxRect(page, kCGPDFMediaBox),
                                         CGContextGetClipBoundingBox(ctx));
CGContextConcatCTM(ctx, transform1);
CGContextDrawPDFPage(ctx, page);


CGPDFPageRef pageAd = CGPDFDocumentGetPage(pdf, index);

CGPDFDictionaryRef pageDictionary = CGPDFPageGetDictionary(pageAd);

CGPDFArrayRef outputArray;
if(!CGPDFDictionaryGetArray(pageDictionary, "Annots", &outputArray)) {
    return;
}

int arrayCount = CGPDFArrayGetCount( outputArray );
if(!arrayCount) {
    //continue;
}

for( int j = 0; j < arrayCount; ++j ) {
    CGPDFObjectRef aDictObj;
    if(!CGPDFArrayGetObject(outputArray, j, &aDictObj)) {
        return;
    }

    CGPDFDictionaryRef annotDict;
    if(!CGPDFObjectGetValue(aDictObj, kCGPDFObjectTypeDictionary, &annotDict)) {
        return;
    }

    CGPDFDictionaryRef aDict;
    if(!CGPDFDictionaryGetDictionary(annotDict, "A", &aDict)) {
        return;
    }

    CGPDFStringRef uriStringRef;
    if(!CGPDFDictionaryGetString(aDict, "URI", &uriStringRef)) {
        return;
    }

    CGPDFArrayRef rectArray;
    if(!CGPDFDictionaryGetArray(annotDict, "Rect", &rectArray)) {
        return;
    }

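    // A PDF Rect holds four numbers [llx lly urx ury]; cap the count so coords[] cannot overflow.
    int rectCount = MIN((int)CGPDFArrayGetCount( rectArray ), 4);
    CGPDFReal coords[4] = {0, 0, 0, 0};
    for( int k = 0; k < rectCount; ++k ) {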
        CGPDFObjectRef rectObj;
        if(!CGPDFArrayGetObject(rectArray, k, &rectObj)) {
            return;
        }

        CGPDFReal coord;
        if(!CGPDFObjectGetValue(rectObj, kCGPDFObjectTypeReal, &coord)) {
            return;
        }

        coords[k] = coord;
    }               

    char *uriString = (char *)CGPDFStringGetBytePtr(uriStringRef);

    NSString *uri = [NSString stringWithCString:uriString encoding:NSUTF8StringEncoding];
    CGRect rect = CGRectMake(coords[0],coords[1],coords[2],coords[3]);
    CGPDFInteger pageRotate = 0;
    CGPDFDictionaryGetInteger( pageDictionary, "Rotate", &pageRotate ); 
    CGRect pageRect = CGRectIntegral( CGPDFPageGetBoxRect( page, kCGPDFMediaBox ));
    if( pageRotate == 90 || pageRotate == 270 ) {
        CGFloat temp = pageRect.size.width;
        pageRect.size.width = pageRect.size.height;
        pageRect.size.height = temp;
    }

    rect.size.width -= rect.origin.x;
    rect.size.height -= rect.origin.y;

    CGAffineTransform trans = CGAffineTransformIdentity;
    trans = CGAffineTransformTranslate(trans, 35, pageRect.size.height+150);
    trans = CGAffineTransformScale(trans, 1.15, -1.15);

    rect = CGRectApplyAffineTransform(rect, trans);

    urlLink = [NSURL URLWithString:uri];
    [urlLink retain];

    //Create a button to get link actions
    button = [[UIButton alloc] initWithFrame:rect];
    [button setBackgroundImage:[UIImage imageNamed:@"link_bg.png"] forState:UIControlStateHighlighted];
    [button addTarget:self action:@selector(openLink:) forControlEvents:UIControlEventTouchUpInside];
    [self.view addSubview:button];
}   
[leavesView reloadData];
}

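The hard-coded offsets (35 and +150) and the 1.15 scale in the final version presumably compensate for the aspectFit transform that is applied when the page is drawn. Rather than tuning those numbers by hand, the same scale and offsets could be derived from the page and view sizes. The following is only a sketch in plain C under that assumption: `Rect`, `FitTransform`, `aspect_fit` and `link_rect_in_view` are hypothetical names, not part of Leaves or Core Graphics, and page rotation is ignored.

```c
/* Hypothetical stand-ins so the sketch builds without Core Graphics. */
typedef struct { double x, y, width, height; } Rect;
typedef struct { double scale, tx, ty; } FitTransform;

/* Uniform scale plus centering offsets that fit `page` inside `view`. */
static FitTransform aspect_fit(Rect page, Rect view) {
    double sx = view.width / page.width;
    double sy = view.height / page.height;
    FitTransform t;
    t.scale = sx < sy ? sx : sy;   /* the smaller factor keeps the whole page visible */
    t.tx = view.x + (view.width - page.width * t.scale) / 2.0;
    t.ty = view.y + (view.height - page.height * t.scale) / 2.0;
    return t;
}

/* Map a PDF link rectangle [llx lly urx ury] (bottom-left origin) to
   top-left-origin view coordinates using the same fit as the drawn page. */
static Rect link_rect_in_view(const double r[4], Rect page, FitTransform t) {
    Rect v;
    v.x = t.tx + (r[0] - page.x) * t.scale;
    v.y = t.ty + (page.y + page.height - r[3]) * t.scale;  /* flip y, then scale */
    v.width = (r[2] - r[0]) * t.scale;
    v.height = (r[3] - r[1]) * t.scale;
    return v;
}
```

With a 600 x 800 page fitted into a 1200 x 2000 view, the scale is 2 and the page is centered with a vertical offset of 200, so a link at [100 700 300 720] lands at x = 200, y = 360, width = 400, height = 40.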
0
votes

I must be confused, because this all works if I use:

CGRect rect = CGRectMake(coords[0],coords[1],coords[2]-coords[0]+1,coords[3]-coords[1]+1);

Am I misusing something later, perhaps? PDF supplies the corners, and CGRect wants a corner and a size.
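
To make the corner-versus-size distinction concrete, here is a minimal sketch in plain C. The names `Rect` and `rect_from_corners` are hypothetical. It also normalizes the corners, because the PDF specification allows a rectangle to be given by any two diagonally opposite corners, not only lower-left and upper-right. The sketch leaves out the +1, on the assumption that PDF coordinates are continuous points rather than pixel indices, so the size is simply the difference between the corners.

```c
/* Hypothetical stand-in for CGRect so the sketch builds without Core Graphics. */
typedef struct { double x, y, width, height; } Rect;

/* Build an origin-plus-size rectangle from a PDF Rect array, accepting
   any two diagonally opposite corners in either order. */
static Rect rect_from_corners(const double c[4]) {
    double x0 = c[0] < c[2] ? c[0] : c[2];   /* smaller x */
    double x1 = c[0] < c[2] ? c[2] : c[0];   /* larger x */
    double y0 = c[1] < c[3] ? c[1] : c[3];   /* smaller y */
    double y1 = c[1] < c[3] ? c[3] : c[1];   /* larger y */
    Rect r = { x0, y0, x1 - x0, y1 - y0 };
    return r;
}
```

Both [100 700 300 720] and the swapped form [300 720 100 700] give an origin of (100, 700) with a size of 200 x 20.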