Reading (even joining) a very large (1.1bn row) table in Enterprise Guide from Teradata

Question

Hopefully you guys can help with what I'm hoping is quite a simple question for those in the know!

I live (well, work) in SAS Enterprise Guide and am trying to perform a simple left join against a table in Teradata.

The table is extremely large (700+ columns, 1.1bn rows) and so far I have been connecting via a LIBNAME statement at the top of my program, followed by the usual PROC SQL to read the data.

The issue I am having is its is extremely slow. I performed the join successfully using 90 rows on the left table and it took 3 hours to complete. The real table I want to use has something like 15,000 rows.

I have tried to connect via the SQL Pass-Through method, but this throws a hosts file error, which I can't fix due to corporate security limitations.

Has anyone had any experience performing this kind of task?

I should mention that I can run a simple select * query in Teradata SQL Assistant is just over 1 minute (16,666,666 obs/s!) so the limitation must be somewhere between SAS/Teradata, or even SAS itself.

I'm sorry I haven't posted actual code snippets as they're on my work machine but this has been bugging me for ages so thought I'd see if I'm missing any tricks.

Thanks in advance for your help.

I should mention, I'm only calling the required 4 columns from teradata so its not because I'm calling ~700 unnecessary columns! — iain
Are you allowed to create/ load tables on teradata? Most likley your problem is related to a full load of the table as the left join is done on the SAS side. — Jetzler
Are both tables in Teradata? Or are you trying to join a SAS table with a Teradata table? — Tom
No Tom, the left table is in SAS and the large right one in Teradata — iain
Jetzler, you're probably right. Was just posing the question incase I was missing something obvious. The joys of 'big data'! I'll ask the sysadmins if I'm allowed to load anything on! Thanks for your help — iain

John XXX John XXX · Accepted Answer · 2016-08-23T22:51:51

So you're joining a SAS data set to a Teradata table and want to return the matching records. You'll want to use SAS's DBMASTER= data set option. It designates which of the tables is larger. By telling SAS this, it knows which table to move.

Here I assume librefs have already been assigned and that the Teradata table is larger--more obs--than the SAS data set.

proc sql threads; select tdTable.* from sastables.sasTable1, td.tdTable(dbmaster=yes) where tdTable.idNum=sasTable1.idNum; quit;

If by chance your SAS data set is larger, you'll want to use the MULTI_DATASRC_OPT= option. Either google these terms or look in the SAS/Access to Relational Databases manual. It's pretty good.

Good luck.

Reading (even joining) a very large (1.1bn row) table in Enterprise Guide from Teradata

2 Answers