11
votes

I want to convert a .sas7bdat file to a .csv/txt format so that I can upload it into a hive table. I'm receiving the .sas7bdat file from an outside server and do not have SAS on my machine.

5
What have you done so far? - matsjoyce
It's very difficult to retrieve the data from a sas7bdat file without having SAS installed on your machine. Can you get the data in a different format, or transfer it to a computer or server that does have SAS installed? - mjsqu
This isn't possible without a tool of some sort. SAS7BDAT is a closed format, and only a few people have reverse engineered it. - Joe

5 Answers

10
votes

Use one of the R foreign packages to read the file and then convert to CSV with that tool.

http://cran.r-project.org/doc/manuals/R-data.pdf Pg 12

Using the sas7bdat package instead: it appears to ignore custom formats and reads the underlying data.

In SAS:

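/* Define a custom format that maps ages to group labels */
proc format;
  value agegrp
    low - 12  = 'Pre Teen'
    13 - 15   = 'Teen'
    16 - high = 'Driver';
run;

libname test 'Z:\Consulting\SAS Programs';

/* Copy sashelp.class and attach the custom format to a copy of age */
data test.class;
  set sashelp.class;
  age2 = age;
  format age2 agegrp.;
run;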

In R:

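 install.packages("sas7bdat")  # the package name must be a quoted string
 library(sas7bdat)
 x <- read.sas7bdat("class.sas7bdat", debug=TRUE)
 x
 write.csv(x, "class.csv", row.names=FALSE)  # save the data frame as CSV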
7
votes

The Python package sas7bdat, available here, includes a library for reading sas7bdat files:

from sas7bdat import SAS7BDAT

# Iterate over the file row by row
with SAS7BDAT('foo.sas7bdat') as f:
    for row in f:
        print(row)

and a command-line program that requires no programming:

$ sas7bdat_to_csv in.sas7bdat out.csv
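
If you need more control over the output than the command-line tool gives you (a different delimiter, say), the row iteration shown above can be paired with Python's standard csv module. This is only a minimal sketch: `rows_to_csv` is a hypothetical helper name, not part of the sas7bdat package, and it assumes the reader yields each row as a list, with the column names as the first row.

```python
import csv


def rows_to_csv(rows, out_path, delimiter=","):
    """Write an iterable of row sequences to a delimited text file.

    Returns the number of rows written, counting any header row.
    """
    count = 0
    # newline="" lets the csv module control line endings itself
    with open(out_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out, delimiter=delimiter)
        for row in rows:
            writer.writerow(row)  # None values are written as empty fields
            count += 1
    return count


# Hypothetical usage with the reader shown above (needs the sas7bdat package):
# from sas7bdat import SAS7BDAT
# with SAS7BDAT('foo.sas7bdat') as f:
#     rows_to_csv(f, 'foo.csv')
```

Because the helper accepts any iterable of rows, it never holds the whole dataset in memory, which matters for large files.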
4
votes

I recently wrote this package that allows you to convert sas7bdat to CSV using Hadoop/Spark. It can split a giant sas7bdat file, thus achieving high parallelism. The parsing also uses parso, as suggested by @Ashpreet.

https://github.com/saurfang/spark-sas7bdat

2
votes

If this is a one-off, you can download the SAS system viewer for free from here (after registering for an account, which is also free):

http://support.sas.com/downloads/package.htm?pid=176

You can then open the SAS dataset in the viewer and save it as a CSV file. There is no CLI as far as I can tell, but if you really wanted to, you could probably write an AutoHotkey script or similar to convert SAS datasets to CSV.

It is also possible to use the SAS provider for OLE DB to read SAS datasets without actually having SAS installed, and that's available here:

http://support.sas.com/downloads/browse.htm?fil=0&cat=64

However, this is rather complicated - some documentation is available here if you want to get an idea:

http://support.sas.com/documentation/cdl/en/oledbpr/59558/PDF/default/oledbpr.pdf

2
votes

Thanks for your help. I ended up using the parso utility in Java and it worked like a charm. The utility returns the rows as object arrays, which I wrote into a text file.

I referred to the utility from: http://lifescience.opensource.epam.com/parso.html
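
For anyone doing the same, the text file has to use the delimiter the Hive table expects. The sketch below shows that step in Python rather than Java; it is not the code from this answer. It assumes a table declared with tab-delimited fields and Hive's default `\N` marker for NULL. `to_hive_line` and `write_hive_text` are hypothetical helper names, and the rows stand in for the object arrays that parso returns.

```python
def to_hive_line(values, delimiter="\t", null_marker="\\N"):
    """Join one row of values into a single delimited text line."""
    fields = []
    for value in values:
        if value is None:
            # Missing values become the NULL marker (assumed Hive default)
            fields.append(null_marker)
        else:
            # Replace characters that would break the field/row layout
            text = str(value)
            for bad in (delimiter, "\n", "\r"):
                text = text.replace(bad, " ")
            fields.append(text)
    return delimiter.join(fields)


def write_hive_text(rows, out_path):
    """Write every row as one line of delimited text."""
    with open(out_path, "w", encoding="utf-8", newline="\n") as out:
        for row in rows:
            out.write(to_hive_line(row) + "\n")
```

Replacing embedded tabs and newlines with spaces is a lossy but simple choice; if the original characters matter, an escaping scheme the table definition understands would be needed instead.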