
I'm running a SAS DATA step with a BY variable. I understand the output when the data is sorted by the key (X in my case). However, when the data is unsorted, I get the following output:

[Image: SAS Output]

I'm using the AFRICA data set from the MAPS library in SAS ODA, which has 52,824 rows. Here's the link to the CSV file.

data AFRICA_NEW12;
set Maps.AFRICA;
by X;
firstX = FIRST.X;
lastX = LAST.X;
run;

I don't understand how rows are selected when data is not sorted. Why does the output have 14 rows?

Your code does nothing that would filter your data set. Whatever is happening here is due to a previous step or something not shown. - Reeza
Check your log. If you use BY with an unsorted data set, SAS generates an error and the step stops at the first record that breaks the BY order, so the output is truncated. - Reeza

1 Answer


You have an error in your log because the data set is not sorted by the BY variable. Make sure to read your log.

This likely generates the same issue for you:

data cars;
set sashelp.cars;
by model;
run;

proc print data=cars;
var make model origin;
run;

Output is:

Obs  Make   Model           Origin
1    Acura  MDX             Asia
2    Acura  RSX Type S 2dr  Asia

And the log shows:

 ERROR: BY variables are not properly sorted on data set SASHELP.CARS.
 Make=Acura Model=TSX 4dr Type=Sedan Origin=Asia DriveTrain=Front MSRP=$26,990 Invoice=$24,647 EngineSize=2.4 Cylinders=4
 Horsepower=200 MPG_City=22 MPG_Highway=29 Weight=3230 Wheelbase=105 Length=183 FIRST.Model=1 LAST.Model=1 _ERROR_=1 _N_=3
 NOTE: The SAS System stopped processing this step because of errors.
 NOTE: There were 4 observations read from the data set SASHELP.CARS.
 WARNING: The data set WORK.CARS may be incomplete.  When this step was stopped there were 2 observations and 15 variables.
 WARNING: Data set WORK.CARS was not replaced because this step was stopped.

Note this portion specifically:

WARNING: The data set WORK.CARS may be incomplete. When this step was stopped there were 2 observations and 15 variables.
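If what you actually want is ordinary BY-group processing, one way to get rid of the error is to sort the data first. The following is only a sketch using the same SASHELP.CARS example; the names cars_sorted, firstModel and lastModel are made up for illustration and are not part of the original code.

```sas
/* Sort a copy of the data by the BY variable.             */
/* OUT= writes to WORK.CARS_SORTED (a hypothetical name),  */
/* leaving SASHELP.CARS itself untouched.                  */
proc sort data=sashelp.cars out=cars_sorted;
    by model;
run;

/* The BY statement now sees properly ordered data, so the */
/* step reads every observation instead of stopping early. */
data cars;
    set cars_sorted;
    by model;
    firstModel = first.model;  /* 1 on the first row of each Model group */
    lastModel  = last.model;   /* 1 on the last row of each Model group  */
run;
```

The same pattern should apply to the question's data: sort Maps.AFRICA by X into a WORK data set, then run the DATA step against that sorted copy.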

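If you know the data is already grouped in the order you want, which may not be the sorted order SAS expects, you can add the NOTSORTED option to the BY statement. This is different functionality, though: a BY group becomes any run of consecutive observations with the same value, so the same value can start more than one group. Check your code thoroughly.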

data cars;
set sashelp.cars;
by model notsorted;
run;
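To see how NOTSORTED changes the FIRST. and LAST. flags, here is a small sketch on made-up data. The data set names demo and demo_flags, the variable grp, and its values are all hypothetical and chosen only for illustration.

```sas
/* Five observations whose grp values are grouped but not sorted. */
data demo;
    input grp $ @@;
    datalines;
B B A A B
;
run;

/* With NOTSORTED, a BY group is a run of consecutive equal values. */
data demo_flags;
    set demo;
    by grp notsorted;
    firstGrp = first.grp;  /* 1 when the value differs from the previous row */
    lastGrp  = last.grp;   /* 1 when the value differs from the next row     */
run;

proc print data=demo_flags;
run;
```

All five rows are kept and no sort error is raised. The trailing B forms its own one-row group, so both of its flags are 1, even though B already appeared at the top of the data.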