I am doing some R&D in which I want to store my RDD in a Hive table. I have written the code in Java and created the RDD. I then convert the RDD to a DataFrame and store it in a Hive table. But I am running into two different errors.

    public static void main(String[] args) {
        SparkConf sparkConf = new SparkConf().setAppName("SparkMain");
        JavaSparkContext ctx = new JavaSparkContext(sparkConf);
        HiveContext hiveContext = new HiveContext(ctx.sc());
        hiveContext.setConf("hive.metastore.uris", "thrift://address:port");
        DataFrame df = hiveContext.read().text("/filepath");
        df.write().saveAsTable("catAcctData");
        df.registerTempTable("catAcctData");
        DataFrame sql = hiveContext.sql("select * from catAcctData");
        sql.show();
        ctx.close();
    }

If I execute this program, it works perfectly fine. I can see the table data in the console.

But if I try the code below, it fails with org.apache.spark.sql.AnalysisException: Table not found: java

    public static void main(String[] args) {
        SparkConf sparkConf = new SparkConf().setAppName("SparkMain");
        JavaSparkContext ctx = new JavaSparkContext(sparkConf);
        HiveContext hiveContext = new HiveContext(ctx.sc());
        hiveContext.setConf("hive.metastore.uris", "thrift://address:port");
        DataFrame sql = hiveContext.sql("select * from catAcctData");
        sql.show();
        ctx.close();
    }

And if I try to save the table data using SQLContext, it fails with java.lang.RuntimeException: Tables created with SQLContext must be TEMPORARY. Use a HiveContext instead.

    public static void main(String[] args) {
        SparkConf sparkConf = new SparkConf().setAppName("SparkMain");
        JavaSparkContext ctx = new JavaSparkContext(sparkConf);
        SQLContext hiveContext = new SQLContext(ctx.sc());
        hiveContext.setConf("hive.metastore.uris", "thrift://address:port");
        DataFrame df = hiveContext.read().text("/filepath");
        df.write().saveAsTable("catAcctData");
        df.registerTempTable("catAcctData");
        DataFrame sql = hiveContext.sql("select * from catAcctData");
        sql.show();
        ctx.close();
    }

I am a bit confused here. Please help me resolve this.

Regards, Pratik

1 Answer

Your problem is that you create and read the table through different HiveContext instances. In other words, the HiveContext in the second program doesn't see the "catAcctData" table because you created this table with another HiveContext. Use a single HiveContext for both creating and reading tables.

Also, I don't understand why you call df.write().saveAsTable("catAcctData"); before creating the temporary table. If you only want a temporary table, df.registerTempTable("catAcctData"); is enough, without df.write().saveAsTable("catAcctData");.
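
To make the scoping concrete, here is a toy model in plain Java. It is only a sketch and not the Spark API: ToyMetastore, ToyContext, hasTable and the other names are made up for illustration. It assumes that a temporary table is visible only to the context that registered it, while a saved table is visible to any context that points at the same metastore.

```java
import java.util.HashMap;
import java.util.Map;

// Toy stand-in for the Hive metastore (hypothetical, not a Spark class):
// tables saved here outlive any single context.
class ToyMetastore {
    private final Map<String, String> tables = new HashMap<>();

    void save(String name, String data) { tables.put(name, data); }

    boolean contains(String name) { return tables.containsKey(name); }
}

// Toy stand-in for HiveContext (hypothetical, not a Spark class).
class ToyContext {
    private final ToyMetastore metastore;
    // Temporary tables are private to this context instance.
    private final Map<String, String> tempTables = new HashMap<>();

    ToyContext(ToyMetastore metastore) { this.metastore = metastore; }

    // Mimics df.registerTempTable(name): visible to this context only.
    void registerTempTable(String name, String data) { tempTables.put(name, data); }

    // Mimics df.write().saveAsTable(name): stored in the metastore.
    void saveAsTable(String name, String data) { metastore.save(name, data); }

    // A table resolves if it is a temp table here or saved in the metastore.
    boolean hasTable(String name) {
        return tempTables.containsKey(name) || metastore.contains(name);
    }
}

class ScopeDemo {
    public static void main(String[] args) {
        ToyMetastore shared = new ToyMetastore();
        ToyContext first = new ToyContext(shared);
        ToyContext second = new ToyContext(shared);

        // Temp table registered in the first context only.
        first.registerTempTable("catAcctData", "rows");
        System.out.println(second.hasTable("catAcctData")); // false

        // Saved table goes to the shared metastore.
        first.saveAsTable("catAcctData", "rows");
        System.out.println(second.hasTable("catAcctData")); // true

        // A context backed by a different metastore still cannot see it.
        ToyContext other = new ToyContext(new ToyMetastore());
        System.out.println(other.hasTable("catAcctData")); // false
    }
}
```

In this model a second context finds the table only after saveAsTable, and only when it shares a metastore with the first one, so in the real setup it may be worth checking that both programs actually reach the same metastore.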