My client uses SAS 9.3 running on an AIX (IBM Unix) server. The client interface is SAS Enterprise Guide 5.1.
I ran into this really puzzling problem: when using PROC IMPORT
in combination with dbms=xlsx
, it seems impossible to filter rows based on the value of a character variable (at least, when we look for an exact match).
With an .xls file, the following import works perfectly well; the expected subset of rows is written to myTable
:
proc import out = myTable(where=(myString EQ "ABC"))
datafile ="myfile.xls"
dbms = xls replace;
run;
However, using the same data but this time in an .xlsx file, an empty dataset is created (having the right number of variables and adequate column types).
proc import out = myTable(where=(myString EQ "ABC"))
datafile ="myfile.xlsx"
dbms = xlsx replace;
run;
Moreover, if we exclude the where
from the PROC IMPORT
, the data is seemingly imported correctly. However, filtering is still not possible. For instance, this will create an empty dataset:
data myFilteredTable;
set myTable;
where myString EQ "ABC";
run;
The following will work, but is obviously not satisfactory:
data myFilteredTable;
set myTable;
where myString LIKE "ABC%";
run;
Also note that:
- Using
compress
or other string functions does not help - Filtering using numerical columns works fine for both xls and xlsx files.
- My preferred method to read spreadsheets is to use excel libnames, but this is technically not possible at this time.
I wonder if this is a known issue, I couldn't find anything about it so far. Any help appreciated.
myString
. Have you checked to verify that isn't the case? (put myString $HEX.;
would be a good start) – Joewhere mystring eq "ABC"
fails all rows andwhere mystring like "ABC%"
doesn't. – Joedbms=xls
, blanks (difference between number of characters and the variable length) are filled with hex20
s (spaces), as they should. Withdbms=xlsx
, it's rather00
s (NULs). Not sure yet how we'll handle this, but at least we're a step closer... Thanks for your help Joe. – Dominic Comtois