0
votes

I've got lots of data in SAS, one of the columns has free text descriptions and some of these have got carriage returns / new lines in them that I'd like to strip out and replace with spaces.

I've got some code that is removing the carriage returns / new lines, but it isn't really helpful, because this isn't replacing it with anything, so its just merging lines and words are ending up being mushed together.

As an example I've got the following data (well, except that those last two lines should actually be one line, not sure how to input that in here properly (any help with that would be appreciated!)):

ID Description Col3 Col4 Col5 Col6
1 bla bla bla C1 0 100 0
2 got up tear C1 0 0 0
3 free text C1 10 100 0
4 house roof tree C1 10 100 0
5 house roof tree C1 10 0 0
6 sky computer mug
mug joule
C3 0 20 1

And I want:

ID Description Col3 Col4 Col5 Col6
1 bla bla bla C1 0 100 0
2 got up tear C1 0 0 0
3 free text C1 10 100 0
4 house roof tree C1 10 100 0
5 house roof tree C1 10 0 0
6 sky computer mug mug joule C3 0 20 1

But I'm getting:

ID Description Col3 Col4 Col5 Col6
1 bla bla bla C1 0 100 0
2 got up tear C1 0 0 0
3 free text C1 10 100 0
4 house roof tree C1 10 100 0
5 house roof tree C1 10 0 0
6 sky computer mugmug joule C3 0 20 1

Code below... but I can't figure out how to actually input some sample data in with a carriage return / new line in it. Does anyone know how to do that? But entry number 6 should have a carriage return / new line in it.

data data_1;
Infile Datalines dlm=',';
format Description $char50.;
input ID Description $ Col3 $ Col4 Col5 Col6;
datalines;
1,bla bla bla,C1,0,100,0
2,got up tear,C1,0,0,0
3,free text,C1,10,100,0
4,house roof tree,C1,10,100,0
5,house roof tree,C1,10,0,0
6,sky computer mug 
mug joule,C3,0,20,1
;
run;


/* removes new line but doesn't replace it with a space */
data data_2;
set data_1;
Description = COMPRESS(Description, , "kw");
run;

Any help would be greatly appreciated.

1
This is one of the few use cases where I say save the data in Excel and import the Excel file.Reeza
Regarding question composition, use <BR> in your markup to add a newline to a data value in a table cell.Richard

1 Answers

1
votes

Instead of removing the characters using COMPRESS() replace them using TRANSLATE() instead.

If you just want to get rid of the CR and LF characters use:

description = translate(description,'  ','0D0A'x);

Or if you want to replace with spaces the characters the 'w' option to compress will select then use this:

description=translate(description,' ',compress(description, ,'w'));

PS If you want to introduce some CR characters into your test data you can also use TRANSLATE() to replace some other character, like a pipe.

data have;
  infile datalines dsd truncover;
  input ID Description :$50. Col3 $ Col4 Col5 Col6;
  description=translate(description,'0D'x,'|');
datalines;
1,bla bla bla,C1,0,100,0
2,got up tear,C1,0,0,0
3,free text,C1,10,100,0
4,house roof tree,C1,10,100,0
5,house roof tree,C1,10,0,0
6,sky computer mug |mug joule,C3,0,20,1
;