0
votes

I am trying to convert the following base table into another table. The table has date and a value element.

    Date Percent
201801  0.09
201802  0.04
201803  0.09
201804  0.06
201805  0.09
201806  0.07
201807  0.07
201808  0.01
201809  0.1
201810  0.03
201811  0.08
201812  0.01
201901  0.08
201902  0.06
201903  0.1
201904  0.1
201905  0.02
201906  0.1
201907  0.01
201908  0.03
201909  0.06
201910  0.08

to the following:

    2018    2019
Jan 0.09    0.08
Feb 0.04    0.06
Mar 0.09    0.1
Apr 0.06    0.1
May 0.09    0.02
Jun 0.07    0.1
Jul 0.07    0.01
Aug 0.01    0.03
Sep 0.1     0.06
Oct 0.03    0.08
Nov 0.08    
Dec 0.01    

Is there a procedure that can do it easily? say proc tabulate etc. If there is a code that can do it in a simple manner, please let me know.

thank you!

2
PROC TRANSPOSE but do you only have two years of data?Reeza
I actually have 5 years of data, but have shortened it for this example. How do we transpose date to show Jan through Dec and split year from it? I saw an example somewhere that one of the procedures does that, but can't recollect which one it is.Zenvega

2 Answers

2
votes

Transforming a dataset from having categorical data values (the date column) into a dataset with the same values as metadata (column names one per year) can be problematic for additional downstream processing.

The usual consumption is to create a report in the cross dimensional form and leave things as that.

For example:

data have (label="5 years of monthly values");
  do date = '01jan2015'd to '31dec2019'd;
    date = intnx ('month', date, 0, 'E');
    percent = (year(date) + month(date) / 100) / 1e4;
    output;
  end;

  format date yymmd8.;
  format percent 8.6;
run;

ods listing;
options pagesize=21 nocenter nodate nonumber; title; 
proc report data=have panels=3;
  columns date percent;
  define percent / display;
run;
------------------------- output -------------------------
    date   percent          date   percent          date   percent
 2015-01  0.201501       2016-09  0.201609       2018-05  0.201805
 2015-02  0.201502       2016-10  0.201610       2018-06  0.201806
 2015-03  0.201503       2016-11  0.201611       2018-07  0.201807
 2015-04  0.201504       2016-12  0.201612       2018-08  0.201808
 2015-05  0.201505       2017-01  0.201701       2018-09  0.201809
 2015-06  0.201506       2017-02  0.201702       2018-10  0.201810
 2015-07  0.201507       2017-03  0.201703       2018-11  0.201811
 2015-08  0.201508       2017-04  0.201704       2018-12  0.201812
 2015-09  0.201509       2017-05  0.201705       2019-01  0.201901
 2015-10  0.201510       2017-06  0.201706       2019-02  0.201902
 2015-11  0.201511       2017-07  0.201707       2019-03  0.201903
 2015-12  0.201512       2017-08  0.201708       2019-04  0.201904
 2016-01  0.201601       2017-09  0.201709       2019-05  0.201905
 2016-02  0.201602       2017-10  0.201710       2019-06  0.201906
 2016-03  0.201603       2017-11  0.201711       2019-07  0.201907
 2016-04  0.201604       2017-12  0.201712       2019-08  0.201908
 2016-05  0.201605       2018-01  0.201801       2019-09  0.201909
 2016-06  0.201606       2018-02  0.201802       2019-10  0.201910
 2016-07  0.201607       2018-03  0.201803       2019-11  0.201911
 2016-08  0.201608       2018-04  0.201804       2019-12  0.201912

Create a report

The TABULATE procedure needs a separate variable for each dimensional use (specified with CLASS statement). Since the same date value is contributing to the across and down dimensions, the date value needs to be repeated. This can be done with a data set view and thus would not required copying the original data into a new data set.

The formatted values of the class variables are used in the tabulation rendering.

data have2 / view=have2;
  set have;
  date2 = date;
run;

proc tabulate data=have2;
  class date date2;
  format date monname3.;
  format date2 year.;
  var percent;
  table date='', date2='' * percent='' * sum='' * f=8.6;
run;

enter image description here

Data transformation

If you still want an actual data set with the year columns, the TRANSPOSE procedure would also require incoming data the date value in two forms -- the first as month (for BY processing that becomes the row 'id' and the second as year that become the column name via ID statement.

proc sql;
  create view have_sorted_by_month_year as
  select 
    put(date,monname3.) as month_name,
    year(date) as year,
    percent 
  from have
  order by month(date), year(date);
quit;

proc transpose data=have_sorted_by_month_year out=want(drop=_name_);
  by month_name notsorted;
  var percent;
  id year;
run;

proc print data=want;
run;

----------------------------------

       month_
Obs     name        _2015       _2016       _2017       _2018       _2019

  1     Jan      0.201501    0.201601    0.201701    0.201801    0.201901
  2     Feb      0.201502    0.201602    0.201702    0.201802    0.201902
  3     Mar      0.201503    0.201603    0.201703    0.201803    0.201903
  4     Apr      0.201504    0.201604    0.201704    0.201804    0.201904
  5     May      0.201505    0.201605    0.201705    0.201805    0.201905
  6     Jun      0.201506    0.201606    0.201706    0.201806    0.201906
  7     Jul      0.201507    0.201607    0.201707    0.201807    0.201907
  8     Aug      0.201508    0.201608    0.201708    0.201808    0.201908
  9     Sep      0.201509    0.201609    0.201709    0.201809    0.201909
 10     Oct      0.201510    0.201610    0.201710    0.201810    0.201910
 11     Nov      0.201511    0.201611    0.201711    0.201811    0.201911
 12     Dec      0.201512    0.201612    0.201712    0.201812    0.201912
0
votes
DATA TEST;
   INPUT Date $ Percent;
   DATE1=COMPRESS(DATE||'01');
   DATEN=INPUT(DATE1,YYMMDD8.);
   FORMAT DATEN DATE9.;
   YEAR = YEAR(DATEN);
   MONTH = PUT(DATEN, MONNAME3.);
   DATALINES;
201801  0.09
201802  0.04
201803  0.09
201804  0.06
201805  0.09
201806  0.07
201807  0.07
201808  0.01
201809  0.1
201810  0.03
201811  0.08
201812  0.01
201901  0.08
201902  0.06
201903  0.1
201904  0.1
201905  0.02
201906  0.1
201907  0.01
201908  0.03
201909  0.06
201910  0.08
;
RUN;

PROC SORT DATA=TEST;BY MONTH;
PROC TRANSPOSE DATA=TEST OUT=TESTT(DROP=_NAME_);
   BY MONTH;
   ID YEAR;
   VAR PERCENT;
RUN;

PROC PRINT DATA=TESTT NOOBS; RUN;

enter image description here