and sorry for my bad english
I have to optimize a script, which is written in this way
1) A proc sql to retrieve data
proc sql;
Create table table2 as
Select A.Id
,A.UsefulData1
,B.UsefulData2
,...
,B.UsefulDataN
FROM Table1 as A
LEFT JOIN TableParam1 AS B
ON A.KeyA = B.KeyA
AND A.KeyB = B.KeyB
/* here : several left join */
LEFT JOIN TableParam2 AS N
ON A.KeyA = N.KeyA
AND A.KeyB = N.KeyB
;
quit;
2) Then a data step, to compute lot of data
data table3;
set table2;
/* here : lot of complex computes depending on useful data of the previous step */
keep id, computedData1, ..., computedDataN
run;
/* then add index on the id for later join */
/* if not debug, delete the proc sql table */
And then, it repeats the same logic of "proc sql / data step", where the new proc sql use the computed data of the previous data step. In the end I have severale couple of "proc SQL table / data step table".
It's very clear and easy to understand, but i find it not very efficient as the table resulting of the proc sql is only a "temporary table", only useful for the data step. So, i write a temporary table, which is quit costly in terme of time as they are big tables (> 10 millions lines)
In my mind, the most efficient thing would be to keep the "proc sql table" in memory, and then only ouput the "data step table". I would save a huge time of writing.
Is there a way to do such a thing? Or do you see another most efficient method ?
Thanks for your help!