
While using Pig for ETL, I'm putting processed data in Hive using the HCatStorer:

STORE dataprocessed INTO 'database.table' USING org.apache.hcatalog.pig.HCatStorer();

My goal is to make the data in the destination table usable from either Pig or Hive, depending on the skills of the user.

What is the recommended format to store datetime?

I care about:

  • Storing timezone info
  • Being able to compare dates
  • Being human readable (for example, I don't believe timestamps are human readable)

Thanks for the help

1 Answer


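I would probably store date/time information as ISO-8601 formatted strings (chararrays in Pig), since HCatStorer does not directly support date types (at least from Pig). ISO-8601 strings are human readable and carry the UTC offset, and as long as every value is written with the same offset (for example UTC) and the same precision, they also sort chronologically as plain strings.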

Pig has built-in functions for converting date/time values: http://pig.apache.org/docs/r0.13.0/func.html#datetime-functions
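
As a rough sketch of how this could look with those functions (not tested against any particular cluster): the input path, the field names id, epoch_millis and event_time, and the cutoff date are all made up for illustration, and it assumes Pig 0.11 or later, where the datetime type and these built-ins exist. The offset that ToString writes should follow the session's default time zone (I believe this is controlled by the pig.datetime.default.tz property), so it is worth checking the output before relying on string comparisons.

```pig
-- Hypothetical input: tab-separated (id, epoch_millis); path and field names are placeholders.
raw = LOAD 'events.tsv' AS (id:chararray, epoch_millis:long);

-- ToDate(long) builds a datetime from Unix milliseconds;
-- ToString with no format argument renders it as an ISO-8601 string.
dataprocessed = FOREACH raw GENERATE
    id,
    ToString(ToDate(epoch_millis)) AS event_time:chararray;

-- Same STORE statement as in the question; event_time lands in a Hive string column.
STORE dataprocessed INTO 'database.table' USING org.apache.hcatalog.pig.HCatStorer();

-- Reading back from Pig: parse the ISO-8601 string into a datetime before comparing,
-- so the comparison stays correct even if the stored offsets differ.
events = LOAD 'database.table' USING org.apache.hcatalog.pig.HCatLoader();
recent = FILTER events BY
    MilliSecondsBetween(ToDate(event_time), ToDate('2014-01-01T00:00:00.000Z')) >= 0L;
```

Hive users would see event_time as an ordinary string column that they can read directly, while Pig users can turn it back into a datetime with ToDate whenever they need date arithmetic.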