
Sqoop import action gives an error when run as an Oozie job.

I am using a pseudo-distributed Hadoop cluster. I followed these steps (a rough sketch of the commands I ran follows the list):

1. Started the Oozie server

2. Edited the job.properties and workflow.xml files

3. Copied workflow.xml into HDFS

4. Ran the Oozie job
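
The commands looked roughly like this (assuming a standard install with Oozie under $OOZIE_HOME and the default Oozie server URL):

    # 1. start the Oozie server
    $OOZIE_HOME/bin/oozied.sh start

    # 2. job.properties and workflow.xml edited locally, then...

    # 3. copy the workflow application directory (with workflow.xml) into HDFS
    hdfs dfs -put examples/apps/sqoop /user/hduser/examples/apps/sqoop

    # 4. submit and run the Oozie job
    oozie job -oozie http://localhost:11000/oozie -config job.properties -run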

My job.properties file:

    nameNode=hdfs://localhost:8020
    jobTracker=localhost:8021
    queueName=default
    examplesRoot=examples
    oozie.use.system.libpath=true
    oozie.wf.application.path=${nameNode}/user/hduser/${examplesRoot}/apps/sqoop

My workflow.xml file:

<action name="sqoop-node">
    <sqoop xmlns="uri:oozie:sqoop-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <prepare>
            <delete path="${nameNode}/user/hduser/${examplesRoot}/output-data/sqoop"/>
            <!--<mkdir path="${nameNode}/user/hduser/${examplesRoot}/output-data"/>-->
        </prepare>
        <configuration>
            <property>
                <name>mapred.job.queue.name</name>
                <value>${queueName}</value>
            </property>
        </configuration>
        <command>import --connect "jdbc:mysql://localhost/db" --username user --password pass --table "table" --where "Conditions" --driver com.mysql.jdbc.Driver --target-dir ${nameNode}/user/hduser/${examplesRoot}/output-data/sqoop -m 1</command>
        <!--<file>db.hsqldb.properties#db.hsqldb.properties</file>
        <file>db.hsqldb.script#db.hsqldb.script</file>-->
    </sqoop>
    <ok to="end"/>
    <error to="fail"/>
</action>

<kill name="fail">
    <message>Sqoop failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<end name="end"/>

I expected the job to run without errors, but it was killed with the following error:

    UnsupportedOperationException: Accessing local file system is not allowed.

I don't understand what I am doing wrong or why the job is not allowed to complete. Can anyone help me solve this issue?


1 Answer


The Oozie sharelib (which contains the Sqoop action's dependencies) is stored on HDFS, and the Oozie server needs to know how to reach the Hadoop cluster. Loading the sharelib from the local filesystem is not allowed; see CVE-2017-15712.
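
You can confirm that the sharelib is actually on HDFS with the Oozie admin CLI. A quick check, assuming the default server URL and the default sharelib location (the HDFS path depends on which user installed the sharelib):

    # list the sqoop sharelib as the Oozie server sees it
    oozie admin -oozie http://localhost:11000/oozie -shareliblist sqoop

    # inspect the sharelib directory on HDFS directly
    hdfs dfs -ls /user/oozie/share/lib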

Please review conf/hadoop-conf/core-site.xml and make sure it does not point to the local filesystem. For example, if your HDFS namenode listens on port 9000 on localhost, configure fs.defaultFS accordingly:

<configuration>
  ...
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
  ...
</configuration>
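
After editing the file, restart the Oozie server so it picks up the change. A minimal sketch, assuming Oozie is installed under $OOZIE_HOME:

    # confirm the value in the copy of core-site.xml that Oozie reads
    grep -A1 fs.defaultFS $OOZIE_HOME/conf/hadoop-conf/core-site.xml

    # restart the server to apply the change
    $OOZIE_HOME/bin/oozied.sh stop
    $OOZIE_HOME/bin/oozied.sh start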

Alternatively, you can remove Oozie's dummy RawLocalFileSystem implementation (the stub class that blocks local filesystem access) and restart the server, but this is not recommended, as it makes the server vulnerable to CVE-2017-15712 again.

Hope this helps. Also see this answer.