0
votes

I am importing an Excel spreadsheet into SAS using Proc Import:

Proc Import out=OUTPUT
Datafile = "(filename)"
DBMS=XLSX Replace;
Range = "Sheet1$A:Z";
run;

My numeric data columns contain a mixture of values held in Excel as numerics and '0 values held as text - i.e. with a leading apostrophe / single quote. When SAS imports these it treats them all the same (i.e. it returns Character strings of the values with the leading apostrophe stripped out).

This results in differences from the spreadsheet when calculations are applied (e.g. averaging) as Excel treats the '0 values as missing but SAS treats them as 0.

Is it possible to import the values as strings including the leading single quote / apostrophe, so that I can replace the '0 with missing values but keep the 0 records as 0? I would like to avoid having to manually manipulate the data in Excel as this data is drawn from an external source (don't ask...)

2
Can you get them to replace the string zero with something that is not a valid number? Like a string space or string period.Tom
That's pretty much what I've had to do. The Excel add-in that extracts the data from the source into Excel (as I said, don't ask...) has an option to configure how missing values are returned. This was set to 0 by default and I have now set to . {period} in my extracts, resolving the issue.Brisbane Pom

2 Answers

0
votes

I doubt it. I think Excel doesn’t really consider the leading apostrophe as part of the value. It’s just a crazy way to indicate that a value is a text string (rather than numeric). When SAS imports the data, it recognizes that the quote is not part of the value. So if you’ve got an Excel column with ‘0 in some cells and 0 in others, it’s going to come in as character, and I don’t think you can tell the difference between them.

Unfortunately, the xlsx engine doesn’t support the s DBSASTYPE option. Other engines that import Excel have the DBSASTYPE option. That should allow you to tell SAS to import a column as a numeric variable, even if it sees character values. If it’s the case that you want all text values in the cell converted to missing, that might do the trick. But it’s possible it would still treat ‘0 the same as 0. I’m away from SAS, so can’t test.

0
votes

Option:

The ~ (tilde) format modifier enables you to read and retain single quotation marks.

http://support.sas.com/documentation/cdl/en/lrcon/62955/HTML/default/viewer.htm#a003209907.htm

Is it possible to convert the .xlsx to .txt keeping the single quotes? Because it is not possible to infile xlsx in a data step.

 filename df disk 'C:\data_temp\ex.txt';

  data test;
    infile df firstobs=2;
    input ID $2. x ~$3. ;
  run;

  proc print data=test;
  run;

enter image description here