Create a Spark DataFrame by selecting from an Impala table:
sql_df1 = hive_context.sql("SELECT * FROM database1.table1 LIMIT 10")
1.1 This DataFrame (sql_df1) returns a row count of 10 and shows the correct data:
print(sql_df1.count())
sql_df1.show()
Create a new table from the first Spark DataFrame:
sql_df1.write.mode("overwrite").format("parquet").saveAsTable("database1.table2")
Refresh the metadata in Impala. In Hue, I can see that database1.table2 has 10 rows of correct data.
Create a new Spark DataFrame from the new table:
sql_df2 = hive_context.sql("SELECT * FROM database1.table2 LIMIT 10")
ISSUE: The new sql_df2 has no rows, only headers.
print(sql_df2.count())
sql_df2.show()