multiple missing types from SAS to SQL server

Question

I have a SAS dataset that I am converting to sql server 2005. In the sas dataset there are numeric variables that allow for multiple missing values.

For example column AGE is specified as

0-124 = number
. = missing
.A = Invalid
.B = Unavailable

Now I want to use a number for the sql server datatype and need to maintain the integrity of the data. My first thought is to create a reference table for the actual value from sas:

-1 = .A = Invalid
-2 = .B = Unavailable

This seems messy since age would require entry for each age 1-124. Curious if anyone else has run across this and has a better solution.

Matt Parker Matt Parker · Accepted Answer · 2011-01-24T22:49:52

I imagine that most of the time you're not going to care about why values are missing, just that they are missing - but by encoding it as negative integers, you'll have to be sure to filter those out every time you do anything with that variable. That's backwards: you should do something special to incorporate the kind-of-missingness information, not to run basic math and statistics.

Rather than trying to recreate in SQL Server a data type that it doesn't have, why not work with what it's got? I would probably just send the missing values as NULL to SQL Server and then encode the type of missingness in a second variable. When you actually need to know why data are missing, it's still there; but when you don't, SQL Server will handle the missing values in your variable using its default methods for dealing with them.

multiple missing types from SAS to SQL server

2 Answers