I am attempting (unsuccessfully) to create a Parquet Hive table on S3.
create external table sequencefile_s3
(user_id bigint,
creation_dt string
)
stored as sequencefile location 's3a://bucket/sequencefile';
The sequence file table works perfectly.
create external table parquet_s3
(user_id bigint,
creation_dt string)
stored as parquet location 's3a://bucket/parquet';
insert into parquet_s3
select * from hdfs_data;
The Parquet table does not work. The files are created in the S3 bucket/folder and select count(*) works, but select * from parquet_s3 limit 10 does not.
Other notes: I am running Cloudera distribution 5.8 outside AWS/EC2. S3A is properly configured (I can copy files through distcp, and the S3 sequencefile and textfile external tables work perfectly).