I am trying to use Tika in Python to parse PDF files. I am using Python 2.7 on a Mac, and I cannot get it to work. After installing it, I ran:

from tika import parser
raw = parser.from_file('...file')

I get this error (edited for brevity):

Retrieving http://search.maven.org/remotecontent ... to /var/folders/...
[MainThread  ] [INFO ]  Retrieving http:// ...
[MainThread  ] [WARNI]  Failed to see startup log message; retrying...
...
2019-04-08 14:53:05,910 [MainThread  ] [ERROR]  Tika startup log message not received after 3 tries.
2019-04-08 14:53:05,916 [MainThread  ] [ERROR]  Failed to receive startup confirmation from startServer.

My question is very similar to "Use tika with python, runtimeerror: unable to start tika server". The top answer there doesn't work for me, though. I have installed Java 8, but it still doesn't work. What should I do?

If you grab the Tika App runnable jar manually, and try to run that directly (eg java -jar apache-tika-1.20.jar), does that work fine? - Gagravarr
I might be doing things wrong. I went to tika.apache.org/download.html and downloaded tika-server-1.20.jar. I then ran java -jar 'filepath to tika-server-1.20.jar'. I got this error: Exception in thread "main" java.lang.UnsupportedClassVersionError: org/apache/tika/server/TikaServerCli : Unsupported major.minor version 52.0. I did the same thing with tika-app-1.20.jar and got a similar error (Exception in thread "main" java.lang.UnsupportedClassVersionError: org/apache/tika/cli/TikaCLI : Unsupported major.minor version 52.0). - bill999
That means your version of Java is too old. Upgrade! Apache Tika needs Java 8+ - Gagravarr
I thought I had upgraded (I did so yesterday). When I go to Java Control Panel, it is Version 8 Update 201 (build 1.8.0_201-b09). But when I go to Terminal and do java -version, it says java version "1.6.0_65". What to do? - bill999
Uninstall the old version of Java 6? Helping you with Java on Windows isn't really a Tika problem though, so you really need a new question! - Gagravarr
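
For anyone puzzled by the "Unsupported major.minor version 52.0" message in the comments above: 52 is the class file version that Java 8 compilers produce, and a Java 6 runtime only understands class files up to version 50. A minimal sketch of that mapping follows. The function name is made up for illustration, and the simple "major minus 44" rule is only applied here from Java 5 (class version 49) onwards.

```python
def class_major_to_java(major):
    """Map a class file major version to the Java release that introduced it.

    Illustrative helper, not part of Tika or the JDK.
    Covers Java 5 (major 49) and later, where release = major - 44.
    """
    if major < 49:
        raise ValueError("only class file versions 49 and above are handled")
    return major - 44


if __name__ == "__main__":
    # 52 is the version named in the error message; 50 is what Java 6 supports.
    print("class version 52 needs Java %d" % class_major_to_java(52))
    print("class version 50 needs Java %d" % class_major_to_java(50))
```

So the error is consistent with java -version reporting 1.6.0_65: the jar was built for Java 8 and the runtime being picked up is Java 6.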

1 Answer

Not sure if you still have this problem, but this may help anyone else coming here. Even though you installed Java 8 (from Oracle or elsewhere), the terminal still sees the old Java that comes with OS X.

You need to tell the terminal to use the new Java you have just installed. Put this into your .bash_profile:

export JAVA_HOME="/Library/Internet Plug-Ins/JavaAppletPlugin.plugin/Contents/Home/"

If that path does not exist on your machine, check System Preferences > Java > Java > View > Path.

There you can see the path for Java. Copy everything up to Home/ and paste it between the quotes in export JAVA_HOME=""

Restart your terminal and Tika should work now.
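
To confirm that the terminal is now picking up the new Java before trying Tika again, you can check from Python which version `java -version` reports. This is only a rough sketch: the helper names are made up for illustration, it assumes the usual `version "..."` line that the JVM prints, and it is written to run on both Python 2.7 and Python 3.

```python
import re
import subprocess


def parse_java_major(version_output):
    """Return the major Java version found in `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first = int(match.group(1))
    second = match.group(2)
    # Releases before Java 9 report themselves as 1.x (1.6, 1.8, ...).
    if first == 1 and second is not None:
        return int(second)
    return first


def java_major_on_path():
    """Run `java -version` (which prints to stderr) and parse the result."""
    try:
        proc = subprocess.Popen(
            ["java", "-version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
        )
    except OSError:
        # No `java` executable could be found on PATH.
        return None
    output, _ = proc.communicate()
    return parse_java_major(output.decode("utf-8", "replace"))


if __name__ == "__main__":
    major = java_major_on_path()
    if major is None or major < 8:
        print("Java on PATH is missing or too old for Tika: %r" % (major,))
    else:
        print("Java %d found on PATH" % major)
```

If this still reports 6 after restarting the terminal, the JAVA_HOME line is probably not being read, or the path in it does not match your installation.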