We have six hive tables with sample (example) structure like (where each table has millions of merchant records)
Table1
MerchntId ,field1, field2
Table2
MerchantId, field3,field4
Table3
MerchantId, field5,field6,field7
Table4
MerchantId, field8,field9,field10
Table5
MerchantId, field11, field12, field13
and so on
Requirement is to create a horizantal layout to take all unique merchants where at least one field has value for a merchantId A merchantId may or may not present in other tables.(for a merchant there may be records in other tables or may not be there)
Final Table
MerchntId, field1, field2, field3, field4,
field5,field6,field7,field8,field9,field10,field11, field12, field13
output should be like after joining
i) 101 abc def ghi
ii) 102 ghj fert hyu ioj khhh jjh ddd aas fff kkk fff vvv ff
for case (i) only three fields have values for case (ii) all fields have values
For this we are doing FULL OUTER JOIN on merchantId for two tables and so on and then creating the final table
Is there any better approach doing this ?
for eg. my current approach
SELECT distinct
(case when a.MerchntId IS NOT NULL then a.MerchntId else (case when
b.MerchntId IS NOT NULL
then b.MerchntId else '' end ) end ) as MerchntId,
(case when a.field1 IS NOT NULL then a.field1 else '' end ) as field1,
(case when a.field2 IS NOT NULL then a.field2 else '' end ) as field2,
(case when b.field3 IS NOT NULL then b.field3 else '' end ) as field3,
(case when b.field4 IS NOT NULL then b.field4 else '' end ) as field4
from Table1 a
full outer join Table2 b
ON a.MerchntId = c.MerchntId;
full outer join of table 3 and table 4 and then full outer join of these two tables to create a final table