2
votes

I'm working with a .csv export of a workout app. It records the date, time, exercise, reps, weight, and comments. Like many .csv files, its a little messy.

Example:

Date,Time,Exercise,# of Reps,Weight,Comments12/23/2014,14:52,Hip Abduction,30,180,12/23/2014,14:52,Hip Abduction,30,180,12/23/2014,14:51,Inverse Bench,15,95,12/23/2014,14:51,Abb Twist,30,100,12/23/2014,14:51,Pull-Ups,5,170,12/23/2014,14:27,Squat,15,135,12/23/2014,14:27,Squat,15,13512/23/2014,14:27,Squat,15,13512/23/2014,14:27,Deadlift,15,135,12/23/2014,14:27,Crunch,30,170,12/23/2014,14:27,Crunch,30,17012/23/2014,14:27,Crunch,30,17012/23/2014,14:26,Bench,15,135,12/23/2014,14:26,Bench,15,13512/23/2014,14:26,Bench,15,135,etc...

I've been able to import the data, however, it is limiting the number of characters to 8; as seen in the date and exercise variables.

SAS Code:

DATA strength;
    infile 'C:\Users\user\Google Drive\strength.csv' DLM = ',' DSD;
    input Date $ Time $ Exercise $ Reps Weight Comments;
    if date = 'Date' then delete;   *removes first obs - the csv header;
RUN;

PROC PRINT data = strength;
    title 'Simple Work Out Log Export';
RUN;

SAS Output:

Obs      Date      Time     Exercise    Reps    Weight    Comments

     1    12/23/20    14:52    Hip Abdu      30      180         .
     2    12/23/20    14:52    Hip Addu      30      180         .
     3    12/23/20    14:51    Inverse       15       95         .
     4    12/23/20    14:51    Abb Twis      30      100         .
     etc...

I don't have a lot of experience working with .csv files, but I did try using

input Date $ Time $ Exercise $ 12. ....

but that didn't work because different exercises have different length names.

How would I go about importing the full date and exercise name for a raw .csv data file like this?

Thanks!

3

3 Answers

1
votes

You can declare the length of the variables in an INFORMAT statement.

Try this before your input.

informat var1 $X1. var2 $X2.;

substituting var1 and var2 for the variable names and X1 and X2 with the length.

You can read in the date and time as actual date and time types with the informat. You can also specify the read to start on line 2 with FIRSTOBS=2 on the infile statement.

I you can look up all of these statements in the SAS documentation online at http://support.sas.com/documentation/onlinedoc/base/index.html.

Use of the informat statement for the date and time can be found here http://support.sas.com/documentation/cdl/en/leforinforref/64790/HTML/default/viewer.htm#n0cq8eha2o93mdn1lg8n5ursmkxm.htm

1
votes

It's also possible to write a slightly more complex input statement that does what you want without using an informat statement:

input date :mmyydd10. time :time. Exercise :$32. reps :8. weight :8.;

You can then apply date and time formats:

format date mmddyy10. time time.;
0
votes

Use colon operator here

You see here that there is a colon preceding each informat. This colon (called an informat modifier) tells SAS to use the informat supplied but to stop reading the value for this variable when a delimiter is encountered. Do not forget the colons because without them SAS may read past a delimiter to satisfy the width specified in the informat.

DATA strength;
format date mmddyy10. time time.;
    infile 'C:\Users\user\Google Drive\strength.csv' DLM = ',' DSD;
    input date : mmyydd10. 
          time : time. 
          Exercise : $50. /*Maximum length of Execrise in your dataset,have assumed 50*/
          reps : 8. 
          weight : 8.
          Comments : $50.; /*Maximum length of comments in your dataset,have assumed 50*/

    if date = 'Date' then delete;   *removes first obs - the csv header;
RUN;

PROC PRINT data = strength;
    title 'Simple Work Out Log Export';
RUN;