
I have 8 different tables with 24 million to 40 million records each. One of these tables is the master table that is used to join to the other 7.

My question: when working with data sets this large, is a hash merge viable? I tried a hashing technique I learned online, but my system ran out of memory while loading the master table itself.
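
For context, the pattern I tried looks roughly like this (data set, key, and variable names are placeholders, not my real code):

    /* Load the entire master table into a hash object, then look up */
    /* matches while reading one of the detail tables. With 24-40    */
    /* million master rows, the hash declaration is where memory     */
    /* runs out.                                                     */
    data want;
        if _n_ = 1 then do;
            if 0 then set master;               /* host variables for the hash */
            declare hash h(dataset: "master");
            h.defineKey("id");
            h.defineData("master_attr");
            h.defineDone();
        end;
        set detail1;
        if h.find() = 0 then output;            /* keep rows that match master */
    run;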

Are there any other efficient methods for merging large data sets in SAS?

Also, could anyone please help me with a snippet for merging these tables together? Each one joins to the master table on different attributes.

Note: there are many-to-one merges in each scenario.

This question is ripe for being closed. Please add information about your system resources, the data set (table) structures, and code demonstrating the problem. The number of columns can be a resource hog if unused excess information is stored in the hashes. Some use cases iterate the master table with SET and load the other tables into hashes. Were attempts made at using a MERGE statement or SQL join, and were they deemed insufficient? – Richard
Based on what you are saying, you will be best off with a multi-part join that can be a mix of hash tables, MERGE statements, or SQL joins, depending on the requirements. – Stu Sztukowski
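
A minimal sketch of the hash pattern Richard mentions, with the master table read sequentially by SET and only one lookup table (restricted to the needed columns) held in memory; all names here are hypothetical:

    /* Read the master table row by row; only the lookup table lives  */
    /* in the hash. KEEP= limits the hash to the key plus the columns */
    /* actually needed, which reduces memory use substantially.       */
    data master_plus_t1;
        if _n_ = 1 then do;
            if 0 then set t1(keep=cust_id t1_attr);   /* host variables */
            declare hash h(dataset: "t1(keep=cust_id t1_attr)");
            h.defineKey("cust_id");
            h.defineData("t1_attr");
            h.defineDone();
        end;
        set master;                 /* master is never loaded into memory */
        if h.find() ne 0 then call missing(t1_attr);  /* left-join behavior */
    run;

Because the merges are many-to-one, every master row sharing a key simply retrieves the same lookup row, which a hash FIND handles naturally.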

1 Answer


Create indexes on the join keys of these data sets. Or divide the master table into smaller pieces, run the PROC SQL join for each piece, and then union the results.
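
A hedged sketch of both suggestions; the table, key, and range values below are assumptions:

    /* 1) Create an index on the join key so SQL joins can use it.  */
    /*    A simple index in SAS must share the column's name.        */
    proc sql;
        create index cust_id on master(cust_id);
    quit;

    /* 2) Or join the master table in slices and stack the results. */
    /*    The WHERE= ranges here are illustrative only.              */
    proc sql;
        create table piece1 as
        select m.*, t.t1_attr
        from master(where=(cust_id le 10000000)) as m
             left join t1 as t
             on m.cust_id = t.cust_id;
    quit;

    /* Repeat for the remaining key ranges, then union the pieces. */
    data merged;
        set piece1 piece2 piece3;
    run;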