
I'm trying to import Cloud SQL tables into a GCS bucket using Sqoop. I've used the following jars:

kite-data-core-1.1.0.jar, kite-data-hive-1.1.0.jar, kite-data-mapreduce-1.1.0.jar, kite-hadoop-compatibility-1.1.0.jar.

Below is my code snippet:

```
# Import the persons table from Cloud SQL (MySQL) into GCS as Parquet.
sqoop import \
  -libjars=gs://BUCKET_NAME/kite-data-core-1.1.0.jar,gs://BUCKET_NAME/kite-data-mapreduce-1.1.0.jar,gs://BUCKET_NAME/kite-data-hive-1.1.0.jar,gs://BUCKET_NAME/kite-hadoop-compatibility-1.1.0.jar,gs://BUCKET_NAME/hadoop-mapreduce-client-core-3.2.0.jar \
  --connect=jdbc:mysql://IP/DB_NAME \
  --username=sqoop_user \
  --password=sqoop_user \
  --target-dir=gs://BUCKET_NAME/mysql_output \
  --table persons \
  --split-by personid -m 2 \
  --as-parquetfile
```

I'm getting the error below:

    20/01/03 04:42:29 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
    Exception in thread "main" java.lang.NoClassDefFoundError: org/kitesdk/data/mapreduce/DatasetKeyOutputFormat
        at org.apache.sqoop.mapreduce.DataDrivenImportJob.getOutputFormatClass(DataDrivenImportJob.java:190)
        at org.apache.sqoop.mapreduce.ImportJobBase.configureOutputFormat(ImportJobBase.java:94)
        at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:259)
        at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:673)
        at org.apache.sqoop.manager.MySQLManager.importTable(MySQLManager.java:118)
        at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:497)
        at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
        at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
        at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
        at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
    Caused by: java.lang.ClassNotFoundException: org.kitesdk.data.mapreduce.DatasetKeyOutputFormat
        at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:351)

The first line of the error says "mapred.jar is deprecated. Instead, use mapreduce.job.jar".

I've imported mapreduce.job.jar and passed it as a -libjars argument, but the issue remains the same.
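Note that the deprecation message is only an INFO-level log line; the actual failure in the trace is the `ClassNotFoundException` for `org.kitesdk.data.mapreduce.DatasetKeyOutputFormat`, raised in the `main` thread. One way to narrow this down is to check which jar, if any, really contains that class. The sketch below assumes the jars have been copied to a local directory and that `unzip` is installed; the helper names `class_to_entry` and `jar_has_class` are hypothetical, made up for this illustration.

```shell
# Convert a fully qualified Java class name into its entry path inside a jar,
# e.g. a.b.C -> a/b/C.class
class_to_entry() {
  printf '%s.class\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# Succeed (exit status 0) if the jar at $1 lists the class named $2.
# Assumes `unzip` is available; a missing or unreadable jar counts as "no".
jar_has_class() {
  unzip -l "$1" 2>/dev/null | grep -q -F "$(class_to_entry "$2")"
}

# Example usage (./jars is an assumed local directory):
#   for jar in ./jars/*.jar; do
#     jar_has_class "$jar" org.kitesdk.data.mapreduce.DatasetKeyOutputFormat \
#       && echo "found in $jar"
#   done
```

If no local jar reports the class, the jar set itself is incomplete; if one does, the more likely problem is that the jar is not on the classpath of the JVM that launches Sqoop.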

Any help with this issue is much appreciated.

Comments:

This link might help. medium.com/google-cloud/… - marjun

You can see here that mapred.jar is deprecated, so instead of it you have to import mapreduce.job.jar. - Nibrass H

Can you share how you are importing the mapreduce.job.jar? - Nibrass H

You can follow this Apache Hadoop Official Documentation and import all the necessary jars. - Nibrass H

And on Stack Overflow, there is a similar post which can help you. - Nibrass H

1 Answer


These are the specific jar versions that worked for me (mostly Cloudera):

Full script for the Sqoop job shared in this answer.
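The script referred to above is not reproduced here, and the following is not that script. It is only a rough sketch of one common approach: since the exception is thrown in the `main` thread by the application class loader, it may help to stage the jars on the local disk of the machine that launches Sqoop and put them on the client classpath as well as in `-libjars`. The helper name `join_jars` and the directory `/tmp/sqoop-jars` are assumptions for illustration, and whether `HADOOP_CLASSPATH` is picked up depends on how Sqoop is launched in your environment.

```shell
# Join every *.jar in directory $1 into one string using separator $2.
# Prints an empty line if the directory has no jars or does not exist.
join_jars() {
  dir=$1
  sep=$2
  out=
  for jar in "$dir"/*.jar; do
    [ -e "$jar" ] || continue          # skip the unexpanded glob pattern
    if [ -z "$out" ]; then
      out=$jar
    else
      out="$out$sep$jar"
    fi
  done
  printf '%s\n' "$out"
}

# Example usage (paths are assumptions):
#   mkdir -p /tmp/sqoop-jars
#   gsutil cp "gs://BUCKET_NAME/kite-*.jar" /tmp/sqoop-jars/
#   export HADOOP_CLASSPATH="$(join_jars /tmp/sqoop-jars :)"
#   sqoop import -libjars "$(join_jars /tmp/sqoop-jars ,)" ...
```

The same helper is used twice because the two consumers expect different separators: a colon for the classpath and a comma for `-libjars`.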