I use the following code to export a table from Hive to HDFS in CSV/TSV format:
INSERT OVERWRITE DIRECTORY '/user/xyz/dem_data/science_data'
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n'
STORED AS TEXTFILE
SELECT *
FROM science_data;
When I view the exported file in HDFS, I see a lot of characters like this:
??=%??0nother episod?/aAj%?is ?a???$of J horse!de9?amA?se0(
I'm not sure what's going wrong. Do I need to specify some kind of encoding to get clean text? The source files contain clean text.
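One thing that might be worth checking is whether Hive is compressing the output. This is only a guess: the post does not say whether compression is enabled on this cluster. The sketch below prints the current value of `hive.exec.compress.output`, turns it off for the session, and re-runs the same export. If the files in the target directory have an extension such as `.deflate` or `.snappy`, that would support this guess.

```sql
-- Assumption: the unreadable bytes come from compressed output files,
-- not from a character-encoding problem.

-- Print the current setting for this session.
SET hive.exec.compress.output;

-- Disable output compression for this session only.
SET hive.exec.compress.output=false;

-- Re-run the original export unchanged.
INSERT OVERWRITE DIRECTORY '/user/xyz/dem_data/science_data'
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n'
STORED AS TEXTFILE
SELECT *
FROM science_data;
```

If the output is still garbled with compression off, the cause is something else and the setting can be left as it was.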
select * from science_data limit 3 - Gaurang Shah