0 votes

I created a data source in Spark using Scala. I have a case class, and I created an RDD and registered it as a table, just like the example given in the Spark documentation:

val sqlContext = new org.apache.spark.sql.SQLContext(sc)
// Implicitly converts an RDD of case classes into a SchemaRDD.
import sqlContext.createSchemaRDD

case class Person(name: String, age: Int)

// Build an RDD[Person] from the sample file and register it as a table.
val people = sc.textFile("examples/src/main/resources/people.txt").map(_.split(",")).map(p => Person(p(0), p(1).trim.toInt))
people.registerAsTable("people")

// Query the table with SQL and print the results.
val teenagers = sqlContext.sql("SELECT name FROM people WHERE age >= 13 AND age <= 19")
teenagers.map(t => "Name: " + t(0)).collect().foreach(println)

However, I cannot access the table from Hive, Impala, or spark-sql: the "SHOW TABLES" command does not list it. Any ideas on how to achieve this?

Thank you!


1 Answer

2 votes

There is no connection between your locally created tables and the Hive metastore.

To access your tables from Hive, you need to write the data out as Parquet files (your code is otherwise fine), register those files in the Hive metastore (with CREATE TABLE ...), and then query the table either over a Hive connection or through a Hive context (org.apache.spark.sql.hive.HiveContext) in Spark.
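For example, here is a minimal sketch of that flow, building on the question's code (the warehouse path and the table name people_ext are placeholders of mine; it also assumes Spark 1.1+, where HiveContext.sql accepts HiveQL, and Hive 0.13+, where STORED AS PARQUET is supported):

import org.apache.spark.sql.hive.HiveContext

// Reuses `people` (the RDD[Person]) and the implicit createSchemaRDD
// import from the question's code.
val hiveContext = new HiveContext(sc)

// 1. Write the data out as Parquet files (the path is an assumption).
people.saveAsParquetFile("/user/hive/warehouse/people_parquet")

// 2. Register those files in the Hive metastore as an external table.
//    (STORED AS PARQUET needs Hive 0.13+.)
hiveContext.sql("""
  CREATE EXTERNAL TABLE IF NOT EXISTS people_ext (name STRING, age INT)
  STORED AS PARQUET
  LOCATION '/user/hive/warehouse/people_parquet'
""")

// 3. The table is now visible to Hive, spark-sql, and (after running
//    INVALIDATE METADATA there) Impala.
hiveContext.sql("SELECT name FROM people_ext WHERE age BETWEEN 13 AND 19")
  .collect().foreach(println)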

In short, you have to distinguish between metadata used locally (created with registerTempTable) and persistent Hive metadata (stored in the metastore).
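
A sketch of that distinction, again assuming Spark 1.1+ and a working metastore (the table names are placeholders):

import org.apache.spark.sql.hive.HiveContext

// `people` is the RDD[Person] from the question.
val hc = new HiveContext(sc)
val peopleRdd = hc.createSchemaRDD(people)  // SchemaRDD tied to the HiveContext

// Local only: lives in this context's in-memory catalog and is visible
// to hc.sql(...) in this session, but never to Hive or Impala.
peopleRdd.registerTempTable("people_tmp")

// Persistent: creates a managed table through the Hive metastore, so
// "SHOW TABLES" in Hive will now list it. (saveAsTable only works on
// SchemaRDDs created through a HiveContext.)
peopleRdd.saveAsTable("people_persistent")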