Hi, I am looking to extract metadata about Word files, such as the number of pages, using Apache Tika on the command line. How can I do this?

1 Answer

Hi guys, I figured it out.

I had to download tika-app-1.5.jar and run the following commands (the -m option prints only the metadata), which returned all the details I wanted:

java -jar tika-app-1.5.jar -m test.docx
java -jar tika-app-1.5.jar -m test.doc
java -jar tika-app-1.5.jar -m test.pptx
java -jar tika-app-1.5.jar -m test.ppt
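
If you only want the page count rather than the full metadata dump, you can filter the output. This is a minimal sketch: it assumes the -m output is one "key: value" pair per line and that the count appears under a key such as Page-Count or xmpTPg:NPages (Slide-Count for presentations). Those key names are an assumption and may differ with the Tika version and file type, so check the full -m output first. The helper name page_count is made up for this example.

```shell
# Hypothetical helper: keep only the page/slide count lines from
# the "key: value" metadata that tika-app prints with -m.
# The key names below are assumptions; adjust them to match your output.
page_count() {
  grep -iE '^(Page-Count|xmpTPg:NPages|Slide-Count):'
}

# Usage (requires tika-app-1.5.jar in the current directory):
#   java -jar tika-app-1.5.jar -m test.docx | page_count
```

The filter reads from standard input, so the same helper works for each of the commands above by piping their output into it.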