
I'd like to use the SAS libname JSON engine instead of PROC GROOVY to import the JSON file I get from the Twitter API. I am running SAS 9.4M4 on OpenSuse LEAP 42.3.

I followed Falko Schulz's description of how to access the Twitter API, and everything worked fine up to the point at which I wanted to import the JSON file into SAS. The last working piece of code is:

proc http method="get"
out=res headerin=hdrin
url="https://api.twitter.com/1.1/search/tweets.json?q=&TWEET_QUERY.%nrstr(&)count=1"
ct="application/x-www-form-urlencoded;charset=UTF-8";
run;

which writes the JSON response to the file referenced by the fileref "res".

Falko Schulz uses PROC GROOVY. In SAS 9.4M4, however, there is this mysterious JSON libname engine that makes life easier. And it works for simple JSON files. But not for the Twitter data. So having the JSON data from Twitter downloaded, using

libname test JSON fileref=res;

gives me the following error:

Invalid JSON in input near line 1 column 751: Some code points did not transcode.

I suspected that something is wrong with the encoding of the files so I used a filename statement of the form:

filename res TEMP encoding="utf-8";

without luck...

I also tried to increase the record length

filename res TEMP encoding="utf-8" lrecl=1000000;

and played around with the record format... to no avail...

Can somebody help? What am I missing? How can I use the JSON engine in a LIBNAME statement without running into this error?

What encoding is your SAS session running in? IE, what does this return: proc options option=encoding; run; - Joe
ENCODING=LATIN9, I should probably change that to UTF-8 - Johannes Bleher
Yes, there's a good chance that's at least part of your issue. Most SAS installations 9.4+ automatically include a UTF-8 startup option also (it's probably a separate shortcut in the start menu/etc.) - Joe
Thanks. This fixed it! Sorry for bothering! - Johannes Bleher
Unfortunately a common issue in SAS thanks to SAS's complexity in dealing with encoding; probably a useful question for others in the future! - Joe

1 Answer


If you're reading UTF-8 files into SAS datasets, run your SAS session in UTF-8 mode. While it's possible to run SAS in another encoding and still read UTF-8 encoded files to some extent, you will generally have a lot of difficulties.

You can tell what encoding your session is in with this code:

proc options option=encoding;
run;

If it returns this:

 ENCODING=WLATIN1  Specifies the default character-set encoding for the SAS session.

Then you're not in UTF-8 encoding.
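
If you'd rather have the program check this itself before touching the JSON engine, something along these lines might work. It is only a sketch: it assumes the GETOPTION function returns the bare encoding name (for example UTF-8 or LATIN9), and the variable name enc is just illustrative.

```sas
/* Sketch: warn in the log if the session encoding is not UTF-8. */
data _null_;
  length enc $32;                          /* enc is an illustrative name */
  enc = upcase(getoption('encoding'));     /* e.g. UTF-8, LATIN9, WLATIN1 */
  if enc ne 'UTF-8' then
    put 'WARNING: Session encoding is ' enc ', not UTF-8. Reading UTF-8 JSON may fail to transcode.';
  else
    put 'NOTE: Session encoding is UTF-8.';
run;
```

The automatic macro variable SYSENCODING holds the same information if you prefer to check it in macro code.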

SAS 9.4 and later on the desktop are typically installed with a UTF-8 option automatically selected in addition to the default WLATIN1 (when installed in English, anyway). You can find it in the start menu under SAS 9.4 (Unicode Support), or by using the sasv9.cfg file in the 9.4\nls\u8\ subfolder of your SAS Foundation folder. Earlier versions may also have that subfolder/language installed, but it was not always installed by default.
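
Once the session is running in UTF-8, the import from the question should go through roughly as follows. This is a sketch rather than a tested program: it reuses the question's own PROC HTTP call, assumes the hdrin fileref and the TWEET_QUERY macro variable are already set up as in Falko Schulz's description, and adds a PROC DATASETS step only to show what the JSON engine produced.

```sas
/* Assumes the SAS session was started with ENCODING=UTF-8, and that */
/* hdrin and TWEET_QUERY are already defined as in the question.     */
filename res temp;

proc http method="get"
  out=res headerin=hdrin
  url="https://api.twitter.com/1.1/search/tweets.json?q=&TWEET_QUERY.%nrstr(&)count=1"
  ct="application/x-www-form-urlencoded;charset=UTF-8";
run;

/* Point the JSON engine at the downloaded response. */
libname test json fileref=res;

/* List the datasets the engine derived from the JSON structure. */
proc datasets lib=test;
quit;
```

With a UTF-8 session there should be no need for the encoding= or lrecl= options on the filename statement that were tried in the question.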