Currently I am trying to use PDFBox in Eclipse to run multiple PDF files in a folder through a text reader that will extract certain terms and output them into a text file that I will then convert to an excel sheet. Currently I have the program and it works correctly for a single PDF file:
public static void main(String args[]) throws IOException {
//Loading an existing document
File file = new File("ADE_acetylfuranoside_120319_pfister.pdf");
PDDocument document = PDDocument.load(file);
//Instantiate PDFTextStripper class
PDFTextStripper pdfStripper = new PDFTextStripper();
//Retrieving text from PDF document
String text = pdfStripper.getText(document);
//..."Actual code that extracts text"...
PrintStream o = new PrintStream(new File("output.txt"));
PrintStream console = System.out;
System.setOut(o);
System.out.println(finalSheet);
my problem is that I want to run 500 PDFs in one folder through this program on eclipse rather than putting in the name of each one individually. I also want it to output like:
Name1, Number1, ID1 Name2, Number2, ID2
but I think the way it is written now it will just overwrite line number one if I run multiple PDFs though it.
Thanks for the help!