If we load data from HDFS into Hive tables, what is the advantage over loading it from a local file? And if we load data from HDFS into Hive, isn't the data then duplicated in HDFS?
1 Answer
Loading from the local file system into HDFS will be slower, because a single huge chunk of data has to be transferred from the local machine to n remote nodes.
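For reference, here is a minimal sketch of the two load forms in HiveQL. The table name `sales` and both paths are made up for illustration. As far as I know, the LOCAL form copies the file from the client machine into the table's directory, while the form without LOCAL moves a file that already sits in HDFS, so it does not ship the data over the network again.

```sql
-- Hypothetical table and paths, for illustration only.

-- Load from the local file system: the file is copied from the
-- client machine into the table's directory in HDFS.
LOAD DATA LOCAL INPATH '/tmp/sales.csv' INTO TABLE sales;

-- Load from HDFS: the file is already in the cluster and is
-- moved (not copied) into the table's directory.
LOAD DATA INPATH '/user/etl/staging/sales.csv' INTO TABLE sales;
```

After the second statement the file should no longer be at its original HDFS path, which is worth keeping in mind if other jobs read from that location.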
If you copy an HDFS file into a Hive table, the data will be duplicated. That is the default behavior, because Hive manages its own directory. If you don't want the data duplicated, please check this answer: Is it possible to import data into Hive table without copying the data
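One common way to avoid the copy is an external table, which points Hive at a directory that already exists in HDFS instead of moving data into Hive's own directory. This is only a sketch under assumptions: the table name `sales_ext`, the columns, the comma delimiter and the path are all hypothetical, and the linked answer may suggest a different approach.

```sql
-- Hypothetical table, columns and path, for illustration only.
-- LOCATION must be a directory in HDFS, not a single file.
CREATE EXTERNAL TABLE sales_ext (
  id     INT,
  amount DOUBLE
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/user/etl/staging/sales';

-- The files stay where they are; Hive only reads them in place.
SELECT COUNT(*) FROM sales_ext;
```

Dropping an external table removes only the table definition from the metastore; the files under the LOCATION directory are left untouched.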