The following HQL works to create a Hive table in HDInsight which I can successfully query. But, I have several questions about WHY it works:
My data rows are, in fact, terminated by carriage return line feed, so why does 'COLLECTION ITEMS TERMINATED BY \002' work? And what is \002 anyway? And no location for the blob is specified so, again, why does this work?
All attempts at creating the same table and specifying "CREATE EXTERNAL TABLE...LOCATION '/user/hive/warehouse/salesorderdetail'" have failed. The table is created but no data is returned. Leave off "external" and don't specify any location and suddenly it works. Wtf?
CREATE TABLE IF NOT EXISTS default.salesorderdetail( SalesOrderID int, ProductID int, OrderQty int, LineTotal decimal ) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' COLLECTION ITEMS TERMINATED BY '\002' MAP KEYS TERMINATED BY '\003' STORED AS TEXTFILE
Any insights are greatly appreciated.
UPDATE:Thanks for the help so far. Here's the exact syntax I'm using to attempt external table creation. (I've only changed the storage account name.) I don't see what I'm doing wrong.
drop table default.salesorderdetailx; CREATE EXTERNAL TABLE default.salesorderdetailx(SalesOrderID int, ProductID int, OrderQty int, LineTotal decimal) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' COLLECTION ITEMS TERMINATED BY '\002' MAP KEYS TERMINATED BY '\003' STORED AS TEXTFILE LOCATION 'wasb://[email protected]/mycn-1/hive/warehouse/salesorderdetailx'