0
votes

As an example, suppose I have data on all the major sports events that have taken place. The schema is given below:

EventName,Date,Month,Year,City

This data is physically laid out in HDFS by year, date, and month.

Now I want to create virtual partitions on it based on some other column value, e.g. City. The data would still be stored physically in HDFS in the year, date, month structure only, but my metadata would keep track of the virtual partitions.

Can the Hive metastore do this for me?


1 Answer

0
votes

I don't think that is possible. Partitioning in Hive means creating a separate directory for each partition. The metastore only holds the table's metadata; it does not control the actual data. When you query a Hive table with a filter on a partition column, the query runs only against the directories of the matching partitions. With virtual partitioning that leaves the HDFS structure unchanged, the rows for a given City stay spread across all of the existing year, date, month directories, so a query filtering on City still has to run over the entire data set. In other words, no optimisation actually happens.
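To make the point concrete, here is a minimal toy model in Python of directory-based partition pruning. It is not Hive code and does not call any Hive API; the directory names, event names, and helper functions are all hypothetical and only illustrate why a filter on a column that is not part of the path cannot skip any directories.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: each directory (physical partition) holds some rows.
# Row layout is (EventName, City); all values are made up for illustration.
LAYOUT: Dict[str, List[Tuple[str, str]]] = {
    "year=2015/date=06/month=06": [("Event A", "Berlin")],
    "year=2016/date=05/month=08": [("Event B", "Rio"), ("Event C", "Paris")],
    "year=2016/date=10/month=07": [("Event D", "Paris")],
}


def dirs_for_year(year: int) -> List[str]:
    """Filter on a physical partition column: only matching directories are read."""
    prefix = f"year={year}/"
    return [d for d in LAYOUT if d.startswith(prefix)]


def dirs_for_city(city: str) -> List[str]:
    """City is not encoded in any path, so no directory can be skipped."""
    return list(LAYOUT)


def events_in_city(city: str) -> List[str]:
    """Scan every directory and filter rows by City afterwards."""
    return [
        name
        for d in dirs_for_city(city)
        for name, row_city in LAYOUT[d]
        if row_city == city
    ]


if __name__ == "__main__":
    print(len(dirs_for_year(2016)), "directories read for year = 2016")
    print(len(dirs_for_city("Paris")), "directories read for City = 'Paris'")
    print(events_in_city("Paris"))
```

The year filter touches only the directories whose path matches, while the City filter has to read all of them, which is the full scan described above.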