What are the storage and performance implications of having multiple columns that hold the same value in every row of a very large Hive table whose underlying file format is ORC or Parquet?
Let's say I have a Parquet Hive table in which column 5 and column 8 always have "HELLO" as the value.
- How is the data laid out in the file for ORC and for Parquet in this scenario?
- Does having this duplicated column data have any performance impact on queries run against the table later?
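
For context, both ORC and Parquet can dictionary-encode string columns and run-length encode the resulting ids. The toy Python sketch below shows what that combination does to a constant column. It is only an illustration under that assumption: the function names are hypothetical, and this is not the actual writer logic of either format, which also adds page or stripe metadata, statistics, and general-purpose compression.

```python
from itertools import groupby


def dictionary_encode(values):
    """Replace each value with a small integer id.

    Returns (dictionary, ids), where dictionary lists the distinct
    values in first-seen order and ids indexes into it.
    """
    lookup = {}
    ids = []
    for value in values:
        # Assign the next free id the first time a value is seen.
        ids.append(lookup.setdefault(value, len(lookup)))
    return list(lookup), ids


def run_length_encode(ids):
    """Collapse consecutive repeats into (id, run_length) pairs."""
    return [(key, sum(1 for _ in group)) for key, group in groupby(ids)]


# Hypothetical stand-in for "column 5": the same string in every row.
constant_column = ["HELLO"] * 100_000

dictionary, ids = dictionary_encode(constant_column)
runs = run_length_encode(ids)

print(len(dictionary))  # number of distinct values kept in the dictionary
print(len(runs))        # number of runs needed to describe every row
```

In this toy model the whole column reduces to one dictionary entry plus one run, which is why I would expect the on-disk cost of such columns to be small, but I would like to confirm how the real formats behave.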