I am trying to use JanusGraph with Cassandra, following the guide https://www.bluepiit.com/blog/janusgraph-with-cassandra/ . But I get an error when starting gremlin:

C:\Homes\janusgraph-0.2.3-hadoop2\bin>gremlin
HADOOP_HOME is not set.
Download http://public-repo-1.hortonworks.com/hdp-win-alpha/winutils.exe
Place it under C:\Homes\janusgraph-0.2.3-hadoop2\bin\winutils.exe
Press any key to continue . . .

Neither the tutorial nor the JanusGraph documentation (https://docs.janusgraph.org/latest/cassandra.html) mentions that I should set HADOOP_HOME. And even if I should, what is the point of configuring a Hadoop home if I want to use Cassandra? Maybe I should fake the gremlin and set HADOOP_HOME to Cassandra installation? Besides, JanusGraph has configuration files for each of the backends, but I cannot find a single global JanusGraph configuration file in which I could indicate which backend to use.

1 Answer

Maybe I should fake the gremlin and set HADOOP_HOME to Cassandra installation?

Inside gremlin.bat, I can see the check that is failing for you:

:: Hadoop winutils.exe needs to be available because hadoop-gremlin is installed and active by default
IF NOT DEFINED HADOOP_HOME (
    SET JANUSGRAPH_WINUTILS=%JANUSGRAPH_HOME%\bin\winutils.exe
    IF EXIST !JANUSGRAPH_WINUTILS! (
        SET HADOOP_HOME=%JANUSGRAPH_HOME%
    ) ELSE (
        ECHO HADOOP_HOME is not set.
        ECHO Download http://public-repo-1.hortonworks.com/hdp-win-alpha/winutils.exe
        ECHO Place it under !JANUSGRAPH_WINUTILS!
        PAUSE
        GOTO :eof
    )
)

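As the comment in the script says, the check is there because the hadoop-gremlin plugin is installed and active by default. It has nothing to do with your storage backend, so pointing HADOOP_HOME at the Cassandra installation will not help. If you insist on running JanusGraph on Windows, you'll need to follow the instructions in that message: download winutils.exe from the hortonworks.com URL and copy it to C:\Homes\janusgraph-0.2.3-hadoop2\bin\.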

As for getting JanusGraph to use Cassandra, that's something you specify in the conf/gremlin-server/gremlin-server.yaml file.

Specifically, I have set:

channelizer: org.apache.tinkerpop.gremlin.server.channel.WsAndHttpChannelizer
graphs: {
  graph: conf/gremlin-server/janusgraph-cql-server.properties
}

The janusgraph-cql-server.properties file is where you specify the connection info for your Cassandra cluster.
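
For illustration, the relevant part of that properties file might look roughly like the sketch below. The keys are standard JanusGraph configuration options, but the hostname and keyspace values are placeholders I am assuming for a local single-node Cassandra, not values taken from your setup, so adjust them to match your cluster.

```properties
# Sketch only: hostname and keyspace are assumed placeholder values
gremlin.graph=org.janusgraph.core.JanusGraphFactory
# Use the CQL storage backend rather than Thrift
storage.backend=cql
# Address of a Cassandra node (assumed: local install)
storage.hostname=127.0.0.1
# Keyspace JanusGraph stores its data in (assumed name)
storage.cql.keyspace=janusgraph
```

The storage.backend line is what selects the backend, which is why there is no single global switch elsewhere: each graph's properties file carries its own backend choice.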

Then, I would run bin/gremlin-server.bat, instead of gremlin.bat.

Here are some other observations:

  • Use the latest version of JanusGraph, which I'm pretty sure is 0.3.1.
  • Connect with CQL instead of Thrift, if you can. The next major version of Cassandra will not even include Thrift, so don't grow attached to it.
  • Build JanusGraph and Cassandra on Linux. You are setting yourself up for travel on a long road of suffering by using Windows for this.

Hope this helps!