0
votes

I have this proc sql step in SAS that takes about 1.7sec to complete, however, if i run this SQL via SQL server or any sql client, it takes only 0.04 sec to complete.

From the sastrace log i found that most of the time are spend on 'Total row fetch seconds', which doesnt really explain what cause the 3600% different between SAS Proc SQL and SQL client.

Summary Statistics for ODBC are:
Total row fetch seconds were:                       1.736099
Total SQL execution seconds were:                   0.000698
Total SQL prepare seconds were:                     0.000168
Total SQL describe seconds were:                    0.003049
Total seconds used by the ODBC ACCESS engine were     1.743318

What exactly is the 'Total row fetch seconds' and how can i improve the run time?

proc sql noprint;
    create table work.out1 as
    select a.search_key,
            a.field1,
            c.field2
    from db.tabl_1 as a
            join
                    db.tabl_2 as b
                    on a.search_key = b.search_key
                    and a.search_key in (&search_key)
            join
                    db.Tabl_3 as c
                    on b.field2 = c.field2
                    and upcase(XXX_field) EQ "XXX";
quit;
&search_key = 19709, 19710, 19711, 19712, .........

Edit: to include Details log from sastrace

MPRINT(macro):   proc sql noprint;
SYMBOLGEN:  Macro variable SUBMISSION_STUDY_RUNS resolves to 
            19709, 19710, 19711, 19712, .........
MPRINT(macro):   create table work.out1 as select a.search_key, a.field1, c.field2 from db.tabl_1 as a join db.tabl_2 as b on 
a.search_key = b.search_key and a.search_key in 
(19709, 19710, 19711, 19712, .........) join db.Tabl_3 as c on b.field2 = c.field2 
and upcase(XXX_field) EQ "XXX";
SQLSRV: AUTOCOMMIT is NO for connection 11
SQLSRV: AUTOCOMMIT turned ON for connection id 11


SQLSRV_10181: Prepared: on connection 11
SELECT * FROM "db"."tabl_1"

SQLSRV: AUTOCOMMIT is NO for connection 12
SQLSRV: AUTOCOMMIT turned ON for connection id 12

SQLSRV_10182: Prepared: on connection 12
SELECT * FROM "db"."tabl_2"

SQLSRV: AUTOCOMMIT is NO for connection 13
SQLSRV: AUTOCOMMIT turned ON for connection id 13

SQLSRV_10183: Prepared: on connection 13
SELECT * FROM "db"."Tabl_3"

SQLSRV: AUTOCOMMIT is NO for connection 14
SQLSRV: AUTOCOMMIT turned ON for connection id 14

SQLSRV_10184: Prepared: on connection 14
 select a."search_key", a."field1", c."field2" from "db"."tabl_1" a inner join "db"."tabl_2" b on a."search_key" = b."search_key" inner join 
"db"."Tabl_3" c on b."field2" = c."field2" where ( a."search_key" in (19709, 19710, 19711, 19712, .........) ) and ({fn 
UCASE(c."XXX_field")} = 'XXX')

Query i ran in the SQL client:

select a.search_key,
a.field1,
c.field2
from db.tabl_1 as a
join
       db.tabl_2 as b
       on a.search_key = b.search_key
       and a.search_key in (19709, 19710, 19711, 19712, .........)
join
       db.Tabl_3 as c
       on b.field2 = c.field2
       and XXX_field = 'XXX';
3
What is the actual SQL produced, and how are you running that -- just to be sure you're comparing apples to apples? The code in the above is not T-SQL; it'll be preprocessed by SAS to produce an actual query. Notably, T-SQL has no upcase, EQ or an in supporting arbitrary parameterized lists. - Jeroen Mostert
Is db.tabl_1 an ODBC libname? Consider using ODBC passthrough rather than ODBC libnames if you need more control over the actual statement being run. - Robert Penridge
@JeroenMostert Not sure if it helps, but i have included the detail from the sas trace log. I had to make some changes on the table name and field name because of company policy - user3646699
@RobertPenridge Yes, that is a ODBC table, it is a physcial table on the SQL server - user3646699
There's at least one obvious difference between your query and what SAS sends: you have XXX_field = 'XXX' while SAS uses {fn UCASE(c."XXX_field")} = 'XXX'. The latter expression is not amenable to being satisfied by an index on XXX_field. The other difference is the shifting of the clauses of the JOIN to the WHERE -- this should not result in a different execution plan, but nevertheless might. You should verify the running time/execution plan of the actual query submitted by SAS. - Jeroen Mostert

3 Answers

2
votes

This is a bit long for a comment.

You are comparing two different things -- two completely different databases. Presumably, they are running on different hardware and in different environments. The first assumption is that the data is also different.

Second, if you want to understand performance, then you need to learn about explain. Unfortunately, this is not a supported feature of proc sql (although it is available.

Given the limited information, I would guess that the tables in SQL Server have appropriate indexes built on them, but that the tables in SAS do not. You can probably add the appropriate indexes and get closer performance numbers for the two systems. However, that is just speculation.

2
votes

Because you used UPCASE() in one query and not in the other you are not actually running the same query. Remove the UPCASE() function from your SAS code.

0
votes

It seems like when SAS run proc sql, especially join, it will copy the whole table into SAS memory before doing the join/processing. The same query on SQL server would just join and return the rows that fit the join.

I have replaced all the query with SAS Data step, and the performance is now closer to what i get on the SQL server.

Please let me know if anyone have another work around/explanation