Hive tables contain the structured data stored in files under the HDFS folder given in the Hive table creation command.
With Cygnus 0.1, such structured data is achieved by means of CSV-like files; thus, adding a new file to the HDFS folder, or appending new data to an already existing file within that folder, is as easy as composing new CSV-like lines of data. The separator character must be the same one you specified when creating the table, e.g.:
create external table <table_name> (recvTimeTs bigint, recvTime string, entityId string, entityType string, attrName string, attrType string, attrValue string) row format delimited fields terminated by '|' location '/user/<myusername>/<mydataset>/';
Thus, with | as the separator in the example above, the new data lines must look like:
<ts>|<ts_ms>|<entity_name>|<entity_type>|<attribute_name>|<attribute_type>|<value>
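As an illustration, such a line could be composed as in the following minimal Python sketch; the field values are hypothetical, and actually writing the line to HDFS (e.g. via WebHDFS) is out of scope here:

```python
# Minimal sketch: compose a CSV-like data line matching the Hive table above.
# The sample field values are illustrative, not real Cygnus output.

def compose_csv_line(fields, separator="|"):
    """Join the field values with the separator used at table creation time."""
    return separator.join(str(f) for f in fields)

line = compose_csv_line([
    1393512381,                # recvTimeTs
    "2014-02-27T14:46:21",     # recvTime
    "Room1",                   # entityId
    "Room",                    # entityType
    "temperature",             # attrName
    "centigrade",              # attrType
    "26.5",                    # attrValue
])
print(line)
# 1393512381|2014-02-27T14:46:21|Room1|Room|temperature|centigrade|26.5
```

Note that the separator passed to compose_csv_line must match the one declared in the fields terminated by clause of the table.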
From Cygnus 0.2 (inclusive), the structured data is achieved by means of JSON files. In this case you do not have to deal with separators, nor with table creation (see this question), since JSON does not use separators and the table creation is automatic. You have to compose a new file, or new data to be appended to an already existing file, following one of these formats (depending on whether you are storing the data in row or column mode, respectively):
{"recvTimeTs":"13453464536", "recvTime":"2014-02-27T14:46:21", "entityId":"Room1", "entityType":"Room", "attrName":"temperature", "attrType":"centigrade", "attrValue":"26.5", "attrMd":[{"name":"ID", "type":"string", "value":"ground"}]}
{"recvTime":"2014-02-27T14:46:21", "temperature":"26.5", "temperature_md":[{"name":"ID", "type":"string", "value":"ground"}]}
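As an illustration, a row-mode record like the one above can be built with a JSON library rather than by string concatenation, which guarantees correctly quoted keys and values. A minimal Python sketch (the sample values are hypothetical):

```python
import json

# Minimal sketch: build a row-mode record like the one above. Using json.dumps
# guarantees valid JSON quoting and escaping of all keys and values.
record = {
    "recvTimeTs": "13453464536",
    "recvTime": "2014-02-27T14:46:21",
    "entityId": "Room1",
    "entityType": "Room",
    "attrName": "temperature",
    "attrType": "centigrade",
    "attrValue": "26.5",
    "attrMd": [{"name": "ID", "type": "string", "value": "ground"}],
}

# One JSON document per line, ready to be appended to the file in HDFS.
print(json.dumps(record))
```

The same approach works for column-mode records; only the keys change.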
It is worth mentioning that there exist scripts for migrating data from the 0.1-like format into the 0.2-like (or higher) format.