I have the three Hive tables below, all with the same structure.
```
drop table default.test1;
CREATE EXTERNAL TABLE default.test1(c1 string,c2 string,c3 string)
ROW FORMAT SERDE
'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
LOCATION
's3://s3_bucket/dev/dev/testspark/test1/';
drop table default.test2;
CREATE EXTERNAL TABLE default.test2(c1 string,c2 string,c3 string)
ROW FORMAT SERDE
'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
LOCATION
's3://s3_bucket/dev/dev/testspark/test2/';
drop table default.test3;
CREATE EXTERNAL TABLE default.test3(c1 string,c2 string,c3 string)
ROW FORMAT SERDE
'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
LOCATION
's3://s3_bucket/dev/dev/testspark/test3/';
hive> insert into default.test1 values("a","b","c");
hive> insert into default.test2 values("d","e","f");
hive> insert overwrite table default.test3 select * from default.test1 UNION ALL select * from default.test2;
```
After loading the data with the UNION ALL of test1 and test2, the S3 path of the test3 table has the data in subfolders, like below:
```
PRE 1/
PRE 2/
```
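(For reference, the subfolders can be seen with a plain listing of the table location; the command below uses the AWS CLI just for illustration, any S3 browser shows the same two prefixes.)
```
aws s3 ls s3://s3_bucket/dev/dev/testspark/test3/
```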
When I query the test3 table from Hive, it returns the inserted data. But when I query the same table from Spark, I get a zero count. In the pyspark shell:
```
>>> sqlContext.sql("select * from default.test3").count()
0
```
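For completeness, the full check in the shell looks roughly like this (assuming the sqlContext created by the pyspark shell has Hive support enabled; the variable name is just for illustration):
```
# pyspark shell: sqlContext is created automatically by the shell
df = sqlContext.sql("select * from default.test3")
df.show()          # comes back empty
print(df.count())  # prints 0, although Hive returns the two inserted rows
```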
How can I fix this issue?