1
votes

I have a single file of about 15G with 2 years of data. I tried to separate the two years into two different files with the following partial SAS codes. Say, the intermediate processed data is called A_20112012_xxx, and the separated ones are B_2011_xxx, B_2012_xxx.

%let column = years;

proc sql noprint; select distinct cat ("data B", &column., "xxx;" , "set A_20112012_xxx;" , "where &column=", &column., ";" , "run;" ) into :SplitSteps separated by ";" from A_20112012_xxx; quit; &SplitSteps;

Although there was no issue to separate them one by one, it always resulted in an error message of not enough space if doing the above way. The reason to do this is that there are files with more than 2 years, which need to be separated. I just want to automate the process.

What's the possible reason for the not enough space error?

1
Is there any particular reason why you aren't using views or by-group processing for this rather than splitting into separate datasets?user667489
Making a single file into separate datasets is a requirement. I tried to automate the process with a Macro for different files with different years. So in the end, the marco can do like %macro(libname = , data = , years = 20112012), then it splits the data for each year. And it also can work for 3 years, 4 years and etc. So no need to adjust for years manually. It will be great if there is a more efficient way to do it. Otherwise, so far, this is the way I know about.Fred
Why do you use SQL? Use data step! And you will have enough memory.crow16384

1 Answers

2
votes

Possibly the distinct operation is not occurring until all the rows have been processed with the cat. This would mean you have internal temp resource with large number of rows (per the 15G table) with wide cat result (with default length of 200 characters) prior to the removal of duplicates for the distinct operation.

Try moving the distinct selection into the from

%let column = years;
proc sql noprint;
    select
      cat ("data B", &column., "xxx;"
           ,  "set A_20112012_xxx;"
           ,  "where &column=", &column., ";"
           ,  "run;"
           )
    into :SplitSteps separated by ";"
    from (select distinct &column from A_20112012_xxx)
    ;
 quit;

 &SplitSteps;

Sometimes it is better to modularize for clarity

%macro data_fracker (data=, column=);
  %macro split_for (value);
    data B_&value._xxx;
      set &data;
      where &column = &value;
    run;
  %mend;
  %local dispatch;
  proc sql noprint;
    select cats('%split_for(', &column, ')')
    into :dispatch separated by ' '
    from (select distinct &column from &data)
    ;
  quit;
  &dispatch
%mend;

%data_fracker (data=sashelp.class, column=age)

Also,

  • What is the operating system and hardware ?
  • If Windows what is the drive space ?
    • What does wmic logicaldisk list brief report ?
  • What is the folder of the WORK libref ?
  • What is the actual log ?
  • Do any of the split steps get run ?