0
votes

I have a column that holds both character and numeric values. Examples of the contents of this column are:

T
M
12
3112

I would like to run proc freq (or any other procedure) on this column so that both the character and numeric values are counted. When I use proc freq, it ignores the character values and only processes the numeric values. Thanks

2
Is this in Excel, by chance? – Joe
Your description of the behavior of PROC FREQ is wrong. Can you post example data? If the variable has both numbers and characters then it is a character variable. – Tom
A variable can't be character and numeric at the same time. If even a single entry is character then the variable is character, for which proc freq should work fine. – G.Arima

2 Answers

1
votes

I agree with @Tom that posting code would conform better to the SO guidelines.

@Tboy, your question is not really about proc freq but about SAS datasets and how they are created in the data step. proc freq just reads what is given to it. SAS datasets are composed of variables (analogous to columns) and observations (analogous to rows). A variable can be either character or numeric but not both. There is no variant datatype such as in VB. A character variable can hold digits, but they are treated as text, are typically left-justified, and cannot be used in arithmetic without first being converted to numeric.
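To make the text-versus-number distinction concrete, here is a minimal sketch. The dataset name TextDigits and the variable names are made up for illustration. It uses the input() function to build a numeric copy of a character value before doing arithmetic on it.

```sas
/* Hypothetical names; illustrates character digits vs. a numeric copy */
data TextDigits;
    length c $ 4;          /* c is character, so '12' is stored as text */
    c = '12';
    n = input(c, 4.);      /* explicit character-to-numeric conversion */
    total = n + 1;         /* arithmetic on the numeric copy */
    put c= n= total=;      /* writes c=12 n=12 total=13 to the log */
run;
```

If input() is given a value such as 'T' with this informat, it returns a missing value and writes an invalid-data note to the log.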

In the data step, SAS variables do not have to be explicitly declared as in other languages -- like VB. As soon as they are referenced they are created. The context in which a variable is first referenced determines whether it is character or numeric.

/* This creates a numeric variable */
data MyDataset;
    x = 1;
run;

/* This creates a character variable */
data MyDataset;
    x = '1';
run;

If you try to assign two different datatypes to the same variable, SAS gets confused but, in its well-documented way, keeps chugging along. Since there is no code snippet I can only guess, but what probably happened to you was something like the step below, where SAS coughs at the values T and M but brushes itself off and carries on with dignity. The output dataset, however, has observations 2 and 3 set to missing.

data MyDataset;
    x = 12; output;
    x = 'T'; output;
    x = 'M'; output;
    x = 12; output;
    x = 3112; output;
    run;

The log will contain the message:

NOTE: Character values have been converted to numeric at the places given by: (Line):(Column) 9 14:9
NOTE: Invalid numeric data, 'T' , at line 13 column 9.
NOTE: Invalid numeric data, 'M' , at line 14 column 9.
x=3112 _ERROR_=1 _N_=1
NOTE: The data set WORK.MYDATASET has 5 observations and 1 variables.

SAS gives you the option to get around this situation by explicitly declaring a variable using the length statement.

data MyDataset;
    length x $ 4 y 8;   /* x is character (4 bytes), y is numeric */
    x = '1234'; y = 1; output;
    x = 'ABCD'; y = 2; output;
run;
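Applied to the four values from the question, this might look like the sketch below. The dataset name Mixed and the use of datalines are assumptions for illustration, since the question does not show how the data is actually read.

```sas
/* Hypothetical dataset name; values copied from the question */
data Mixed;
    length x $ 4;      /* declare x as character before it is first used */
    input x $;         /* list input: read each value as text */
    datalines;
T
M
12
3112
;
run;

/* Every distinct text value, letters and digits alike, gets its own row */
proc freq data=Mixed;
    tables x;
run;
```

Because x is character here, T and M are ordinary values rather than invalid numeric data, so proc freq counts them alongside 12 and 3112.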

One can see whether a variable is character or numeric by performing a proc contents on one's SAS dataset:

proc contents data=MyDataset;
run;

which produces the following output (edited for brevity):

The CONTENTS Procedure

Data Set Name        WORK.MYDATASET           Observations          2 
Member Type          DATA                     Variables             2 
Engine               V9                       Indexes               0 

Alphabetic List of Variables and Attributes
#    Variable    Type    Len
1    x           Char      4
2    y           Num       8

Your question could also be answered by reading introductory texts and documentation on the SAS system.

0
votes

Here is one little wrinkle with PROC FREQ that might explain why you are having trouble. By default, PROC FREQ ignores missing values. The two non-digit values in your example are both single letters, so your numeric variable might be using special missing values.

If so you can add the MISSING or MISSPRINT option to the TABLES statement.

missing t m ;
data test ;
  input @1 text $4. @1 numb 4. ;
cards;
T
M
12
3112
;

proc freq ;
  tables text numb ;
  tables numb / missing ;
run;