We're building a PDF search engine with Solr and Lucene where users can search for text in PDFs. The database only contains PDFs.
On the search results page ("/browse") we want to append #page=X to each PDF link, where X is the page the text was found on. (Adobe Acrobat automatically scrolls to the given page when it is specified as an anchor in the URL.)
For example, if I search for foobar and there's a PDF document where foobar is on page 5, the link should be http://pdfserver/pdfs/pdf.pdf#page=5 (note the anchor at the end).
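
To make the desired output concrete, here is a minimal Python sketch of the link-building step only. The helper name pdf_link_with_page is hypothetical, and the sketch assumes the page number is already known for each hit, which is exactly the part we don't know how to get from Solr:

```python
from urllib.parse import urldefrag


def pdf_link_with_page(pdf_url: str, page: int) -> str:
    """Return pdf_url with a #page=N anchor, replacing any existing anchor.

    Hypothetical helper: `page` is assumed to come from the search result.
    """
    if page < 1:
        raise ValueError("page numbers start at 1")
    # Drop any fragment already present so we never end up with two anchors.
    base, _old_fragment = urldefrag(pdf_url)
    return f"{base}#page={page}"


print(pdf_link_with_page("http://pdfserver/pdfs/pdf.pdf", 5))
```

For the example above this prints http://pdfserver/pdfs/pdf.pdf#page=5.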
- Is this possible?
- How would we get this page number?