9
votes

The flow I have in mind is this:
1. Export a sas7bdat from SAS
2. Import that file in Python with pd.read_sas and do some stuff on it
3. Export the pandas dataframe to sas7bdat (or some other SAS binary file format). I thought that a to_sas method would exist in pandas, but it doesn't.
4. Open the new file in SAS and do further stuff on it

Is there a solution to point 3 above? As I see it, my only options are csv or some SQL database.
This is not really a programming question; I hope that won't be an issue.

2
Python data cannot be exported to a SAS dataset directly; it should be saved as CSV/Excel/SQL data first, then read by SAS. - Shenglin Chen
@ShenglinChen - you're right. I spent the last hour or so reading GitHub issues / proposals, and now I'm thankful that at least read_sas exists. Interestingly enough, R has some packages that can do this. - BogdanC

2 Answers

8
votes

Python is capable of writing to SAS .xpt format (see for example the xport library), which is SAS's open file format. SAS7BDAT is a closed file format, and not intended to be read/written to by other languages; some have reverse engineered enough of it to read at least, but from what I've seen no good SAS7BDAT writer exists (R has haven, for example, which is the best one I've seen, but it still has issues and things it can't do).

A more common approach than XPT files, which can be slow to work with, is to write a CSV and then have your Python (or other) program generate a SAS input script that reads it. That lets you set variable labels, value labels, types, etc. as you wish, and such a script is very easy to write. Many other software packages use this as their preferred method of producing SAS files. It has the additional advantage of being easily cross-platform: it doesn't matter whether your SAS program runs on a mainframe, UNIX, Windows, etc.; it's all the same.
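
As a rough illustration of the CSV-plus-input-script approach, here is a minimal sketch using only the standard library. The helper names (`write_csv`, `build_sas_input_script`) and the column-spec keys (`name`, `char_len`, `label`) are made up for this example; with pandas you would more likely call `df.to_csv(path, index=False)` and derive the column spec from `df.dtypes`. The generated DATA step assumes a comma-delimited file with a header row, and it does not handle dates, formats, or value labels.

```python
import csv


def write_csv(path, names, rows):
    """Write a header row followed by the data rows."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(names)
        writer.writerows(rows)


def build_sas_input_script(csv_path, dataset, columns):
    """Return SAS DATA step source that reads csv_path into `dataset`.

    columns is a list of dicts (hypothetical spec for this sketch):
      name     - SAS variable name
      char_len - optional int; if present the variable is character
      label    - optional variable label
    """
    lengths, inputs, labels = [], [], []
    for col in columns:
        name = col["name"]
        if col.get("char_len"):
            # Declare the character length before INPUT so values are not truncated.
            lengths.append(f"    length {name} ${col['char_len']};")
            inputs.append(f"{name} $")
        else:
            inputs.append(name)
        if col.get("label"):
            # Single quotes inside a SAS string literal are doubled.
            text = col["label"].replace("'", "''")
            labels.append(f"    label {name} = '{text}';")
    lines = [
        f"data {dataset};",
        # DSD: comma-delimited with quoted fields; FIRSTOBS=2 skips the header.
        f"    infile '{csv_path}' dsd firstobs=2 truncover;",
        *lengths,
        f"    input {' '.join(inputs)};",
        *labels,
        "run;",
    ]
    return "\n".join(lines)
```

You would call `write_csv` (or `df.to_csv`) to produce the data file, save the string returned by `build_sas_input_script` as a `.sas` file next to it, and run that file in SAS.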

Edit: If you do have SAS licensed locally, either via a server or local install, another option for exporting Python data to SAS is SASPy, which is a SAS-maintained open source project that allows Python to directly connect to SAS instances and directly send data. (Under the hood, I believe the data is actually transmitted as a CSV most of the time, and then read in using SAS code.) The SAS ODBC driver is also an option, but for Python SASPy will be the easiest option most likely.
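
For the SASPy route, a sketch along these lines may be a starting point. It assumes `saspy` is installed and already configured to reach a licensed SAS instance (for example through a `sascfg_personal.py`); the function name `write_sas_dataset`, the libref `outlib`, and the `lib_path` argument are placeholders for this example. `lib_path` must be a directory the SAS session itself can write to, and that is where the resulting .sas7bdat file ends up.

```python
def write_sas_dataset(df, table, lib_path, libref="outlib", cfgname=None):
    """Send a pandas DataFrame to SAS as a permanent dataset via SASPy.

    Returns True if SAS reports that the table exists afterwards.
    """
    # Imported here so the module can be loaded without saspy installed.
    import saspy

    sas = saspy.SASsession(cfgname=cfgname) if cfgname else saspy.SASsession()
    try:
        # Assign a permanent library; datasets in WORK vanish when the session ends.
        sas.saslib(libref, path=lib_path)
        # df2sd (dataframe2sasdata) transfers the DataFrame into the SAS library.
        sas.df2sd(df, table=table, libref=libref)
        return sas.exist(table, libref=libref)
    finally:
        sas.endsas()
```

Because SAS itself writes the dataset, the file is a genuine SAS7BDAT rather than a reverse-engineered one, at the cost of needing a SAS installation at export time.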

1
votes

"SAS7BDAT is a closed file format, and not intended to be read/written to by other languages; some have reverse engineered enough of it to read at least, but from what I've seen no good SAS7BDAT writer exists."

Although SAS7BDAT is a proprietary format, it is not closed. It can be read and written by third-party products using SAS's own ODBC drivers: https://support.sas.com/en/software/sas-odbc-drivers.html. Since Python can use ODBC (pyodbc), just use the SAS ODBC Driver to write the SAS7BDAT file format.

IBM SPSS Statistics and IBM SPSS Modeler can also read and write the SAS7BDAT format, as well as the earlier pre-version 7 formats and the SAS Transport File format (the .xpt files noted above). These products do not require ODBC to do this. The capability is included in SPSS Statistics Base via the SAVE TRANSLATE command, and in SPSS Modeler Professional via the SAS Source node for reading and the SAS Export node for writing.