1
votes

I need to create reports/summary tables on Redshift using SAS. My client data is on Amazon Redshift and he provided me all credentials to access the database. I have SAS 9.2 (32bit) and downloaded PostgresSQL 32bit driver to my system (as Redshift is based on PostgresSQL). I setup ODBC data source successfully and now I am connecting SAS using below command:

LIBNAME RdSft ODBC DSN='Redshift server' user='xxxxxxx' pw='xxxxxx';

data Rdsft.new_table;
 set Rdsft.old_table(obs=10);
run;

I am able to connect and can see contents of tables on Redshift but not able to make any table there. Sometimes I could but its taking hours to create a table just with 10 observations. Someone suggested me to use DbVisulizer to do this task but I am comfortable with SAS only.

Please suggest.

2
SAS 9.0 is almost 12 years old ... most likely an issue with invalid license? - Reeza
Its my mistake, we have SAS 9.2 with valid license. Is it just because of SAS version? - XLers
Is all the data you're manipulating on the server or is some local? I assume you don't want to use explicit pass through to use PostGres SQL explicitly? - Reeza
all data is available on server. its more than 5 TB data. - XLers
I'd be worried about some speed issues being inherent here - can you connect outside of SAS just to test if the speed issues are inherent to the connection? - Joe

2 Answers

0
votes

If you have SAS/ACCESS try using the postgres engine for the library instead of going via ODBC eg:

libname RdSft postgres server="<server-address>" database=<db-name> port=5432 user='xxxxxxx' pw='xxxxxx';

Also, try adding conopts="UseServerSidePrepare=1" to the libname as suggested by this article: http://support.sas.com/kb/52/585.html

0
votes

The simple fact of the matter, is that when you're connecting to Redshift via ODBC, even your simple data step query:

"data Rdsft.new_table; set Rdsft.old_table(obs=10); run;"

Is essentially translating to "select * from rdsft.old_table" before the obs subset is getting applied.

The SAS/ACCESS postgres solution is solid, you may also want to use proc sql, select only the columns you want, and subset as much as possible. Proc Sql will translate a bit easier into Redshift query language through an ODBC than the data step will.

SAS will hopefully be issuing a SAS/ACCESS for REDSHIFT option sometime soon! :)