I'm having a weird problem with Spark SQL on an external table defined in Hive with
CREATE EXTERNAL TABLE ... STORED AS PARQUET... LOCATION 'hdfs://path/TABLENAME'
If I refer to the table in Spark with spark.table("tablename") or spark.sql("select column from tablename"), I get the right row count, but every value is null.
When I query the table through Beeline, I get the right values.
Additionally, if I read the Parquet files directly in Spark with spark.read.parquet("hdfs://path/TABLENAME"), I also get the right values.
To make it even stranger: if I create another external table with a similar CREATE EXTERNAL TABLE ... statement against the same Parquet files in HDFS, Spark SQL works.
Where do I look next?
select * or select column - wrschneider
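One thing worth checking for this exact symptom (right row count, all values null, but direct Parquet reads work) is a mismatch between the column names the Hive metastore reports and the names stored in the Parquet footers, e.g. a case difference. A sketch of that comparison in PySpark is below; the `diagnose` helper and its `table_name`/`parquet_path` arguments are placeholders for illustration, not part of the original question, and whether this is actually the cause here is a hypothesis to verify, not a confirmed diagnosis.

```python
def diff_fields(left, right):
    """Compare two schemas given as lists of (name, type) pairs.

    Returns (exact_diff, case_folded_diff). If the exact diff is
    non-empty but the case-folded diff is empty, the two schemas
    differ only in column-name case.
    """
    exact = set(left) ^ set(right)
    folded = ({(n.lower(), t) for n, t in left}
              ^ {(n.lower(), t) for n, t in right})
    return exact, folded


def diagnose(spark, table_name, parquet_path):
    """Compare the metastore schema with the on-disk Parquet schema.

    `table_name` and `parquet_path` stand in for the table and HDFS
    location from the question; `spark` is an active SparkSession
    with Hive support enabled.
    """
    hive = [(f.name, f.dataType.simpleString())
            for f in spark.table(table_name).schema.fields]
    parquet = [(f.name, f.dataType.simpleString())
               for f in spark.read.parquet(parquet_path).schema.fields]
    exact, folded = diff_fields(hive, parquet)
    if exact and not folded:
        print("Schemas differ only in column-name case")
    elif exact:
        print("Schemas differ:", exact)
    else:
        print("Schemas match exactly")
    return exact, folded
```

Comparing spark.sql("DESCRIBE FORMATTED tablename") against the schema from spark.read.parquet(...) by eye works too; the helper above just makes a case-only mismatch (which an exact string comparison can silently hide) stand out.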