1
votes

Assume our project has hundreds of SAS scripts calling one-another. It also includes an humongous amount of SAS formats, mainly for columns which have certain traits, e.g. a format chip_col which replaces 1 with red, 2 displays as green.

Assume I opened a given table using the SAS-explorer and I now see a table with columns with numbers rather than colors as text. In this case, I would right-click on the column-header to open the column properties and browse for the most appropriate format.

My problem is then twofold: Firstly, the SAS-popup displaying all available choices of (custom-defined) formats takes forever when envoking the pull-down menu with all possible formats. Secondly, I do not see the actual format definition, just the format name abbreviation which is not too helpful in my case.

My question: How can I access the complete format information programmatically from within SAS?

3
Didn't your project publish any data dictionaries for the users of the data? You can use PROC FORMAT to convert your format catalog(s) into datasets. Do you have all of the formats in one catalog or do you have multiple catalogs? Aren't the formats already attached to the variables in the datasets? Why aren't they being used when you view the data?Tom
What SAS client are you using ? EG ? Display manager ? Studio ?Richard
@Richard We are using rhe good ol' Display Manager with log window, output window, editor, and explorer window.B--rian
@Tom I am not too fluent with SAS terminology and in particular how data dictionary or catalogue is defined in the SAS world. Full text search of the code base is my current time consuming approach. I figured out that there are multiple scripts adding formats, even conditionally, if that's what you are asking.B--rian
I used data dictionary in a generic sense. Some document produced by your computer project to help everyone understand how the data works in the project. In addition to SAS datasets you can create SAS catalogs. There are many types of objects that SAS can store in a catalog, one of which is a format definition. So when you run PROC FORMAT code to define formats they are stored in format catalogs. PS You create SAS programs, not SAS scripts.Tom

3 Answers

1
votes

You can get a complete list of formats (and other things like datasets) using the dictionary tables shown by proc sql. Try this:

proc format library=work;
    value sex 0 = "F"
              1 = "M";
run;

proc sql;
    create table blah as
        select *
            from dictionary.formats
            where libname = "WORK"
        ;
quit;

If you drop the where clause from the SQL statement above it will list out all the formats with many details.

You can use proc format to show you the format in a catalog like this:

* add your library where I have "WORK";
proc format library = WORK FMTLIB CNTLOUT = work.x;
run;

You can save the information into a table using the CNTLOUT.

1
votes

VIEWTABLE has the feature to raise a dialog to define Column Attributes... for a column being displayed in the VIEWTABLE window. The dialog is raised from the column header context menu, or by double clicking the column header.

If you choose the Format: field ellipsis another dialog is raised for selecting the format from a selection dialog. As part of populating the selection dialog the internals of VIEWTABLE generates SAS code and includes it with option /SOURCE2. This can cause a very large amount of LOG window output when there are many custom formats, and that can be a cause of the takes forever experience. The /SOURCE2 can not be eliminated due to it being part of viewtable internal codegen.

Viewtable internal codegen that I presume is used to populate format selector:

40218  proc delete data=work._qwffmt;
40219  run;
40220  proc sql;
40221  create table work._qwffmt as select fmtname,libname,memname from dictionary.formats where
40221! source='C' and fmttype='F' order by libname,memname;
40222  quit;
40223  filename _sascode temp;
40224  data _null_;
40225  file _sascode;
40226  put 'data work._fmtdesc;';
40227  put '     length fmtname $32 label $3 type $1 start $16 end $16 min max default 3;';
40228  put '     stop;';
40229  put '     fmtname=" "; label=" "; type=" "; min=0; max=0; start=" "; end=" "; default=0;';
40230  put '     run;';
40231  run;
40232  data _null_;
40233  set work._qwffmt;
40234  by libname memname;
40235  file _sascode mod;
40236  if first.memname then do;
40237  length libmem $41;
40238  libmem=cats(libname,'.',memname);
40239  put 'proc format lib=' libmem 'cntlout=_temp(keep=fmtname start end label min max default
40239! type) noprint;';
40240  put 'select';
40241  end;
40242  put fmtname;
40243  if last.memname then do;
40244  put '; run;';
40245  put 'proc append base=work._fmtdesc data=_temp force; run;';
40246  put 'proc delete data=_temp; run;';
40247  end;
40248  run;
40249  %include _sascode/source2;
40250 +data work._fmtdesc;
40251 +     length fmtname $32 label $3 type $1 start $16 end $16 min max default 3;
40252 +     stop;
40253 +     fmtname=" "; label=" "; type=" "; min=0; max=0; start=" "; end=" "; default=0;
40254 +     run;
40255 +proc format lib=WORK.FORMATS cntlout=_temp(keep=fmtname start end label min max default
40255!+type) noprint;
40256 +select
40257 +_BABB                     /* if many thousands, log window can slow things a lot */
40258 +_BAAB
40259 +_ABAB
40260 +_BBAB
40261 +_BAAA
40262 +_ABAA
40263 +_BABA
40264 +_AABA
40265 +_BBAA
40266 +_ABBA
40267 +_BBBA
40268 +_AAAA
40269 +_AAAB
40270 +_AABB
40271 +_ABBB
40272 +_BBBB
40273 +;
40273!+  run;
40274 +proc append base=work._fmtdesc data=_temp force; run;
40275 +proc delete data=_temp; run;
40276  run;
40277  filename _sascode clear;

The only way to avoid this session crushing log output is to send log output to a temporary file before opening VIEWTABLE -- a very unreasonable proposition in general.

Here is sample code that creates 10,000 custom formats and, on my system, it takes about 4 seconds to populate and present the list in the Format dialog raised via the Column Attributes dialog window ellipsis. Time delays get more than linearly worse as number of custom formats increases.

proc printto log=log;
run;

proc datasets noprint lib=work;
  delete formats / mt=catalog;
run;

%let N_FORMATS = 10000;
%let ROOT4 = %sysevalf((&N_FORMATS-1) ** 0.25, FLOOR);

%put &=ROOT4;

data cntlin;
  do l1 = 65 to 65+&ROOT4;
    c1 = byte(l1);
    do l2 = 65 to 65+&ROOT4;
      c2 = byte(l2);
      do l3 = 65 to 65+&ROOT4;
        c3 = byte(l3);
        do l4 = 65 to 65+&ROOT4;
          c4 = byte(l4);
          fmtname = '_' || cats(of c:);
          start = 0; label = 'No '; output;
          start = 1; label = 'Yes'; output;
        end;
      end;
    end;
  end;
  keep fmtname start label;
run;

proc sql noprint;
  select count(DISTINCT fmtname) into :count trimmed from cntlin;

%put &=count different custom formats;

options nonotes;

proc format cntlin = cntlin;
run;

options notes;

data have;
  call streaminit(123);
  array v v1-v200;
  do row = 1 to 150;
    do over v; v = rand('integer', 0, 1); end;
    output;
  end;
run;

options nonotes nosource2;

filename dump temp;
proc printto log=dump;
run;

dm 'vt have' viewtable;
1
votes

Sounds like whatever GUI tool you are using to view the data (VIEWTABLE perhaps?) does not have a good method for providing a list of formats. In that case I would try to avoid having to do that. So in general you should attach the format when you create the dataset. So if you defined your dataset this way:

data have;
  input id color ;
  format color chip_col.;
cards;
1 1
2 1
3 2 
;

Then when you open it in the viewer you will see:

ViewTable

If you want a method to see both the code and the decode check out this macro for creating new format definitions that include the code in the decode.

https://github.com/sasutils/macros/blob/master/cfmtgen.sas