0
votes

I have a program's log in a string (the entire log arrives as a single line) and would like to convert it into multiple lines. awk can definitely do this, but how do I loop through the records within a single line?

I have the code below in bash, where str contains the entire log string generated by the program on a single line:

str="2019/04/24 23:26:42 - Main_Cons_Job_edw_cc_sf_accts_assets_feed - Start of job execution 2019/04/24 23:26:42 - Main_Cons_Job_edw_cc_sf_accts_assets_feed - exec(0, 0, START.0) 2019/04/24 23:26:42 - START - Starting job entry 2019/04/24 23:26:42 - Main_Cons_Job_edw_cc_sf_accts_assets_feed - Starting entry [Call_Param_File] 2019/04/24 23:26:42 - Main_Cons_Job_edw_cc_sf_accts_assets_feed - exec(1, 0, Call_Param_File.0) 2019/04/24 23:26:42 - Call_Param_File - Starting job entry 
 - blah blah blah..."
echo $str|awk 'BEGIN { ORS=" \n "}; { printf "%s %s %s", $1,$2,$3}'

The awk command above prints only the first three fields of the log text. It needs to run in a loop, because I expect the output below: each line has a date and timestamp, a short message, and then a long message, separated by " - ".

2019/04/24 23:26:42 - Main_Cons_Job_edw_cc_sf_accts_assets_feed - Start of job execution 
2019/04/24 23:26:42 - Main_Cons_Job_edw_cc_sf_accts_assets_feed - exec(0, 0, START.0) 
2019/04/24 23:26:42 - START - Starting job entry 
2019/04/24 23:26:42 - Main_Cons_Job_edw_cc_sf_accts_assets_feed - Starting entry [Call_Param_File] 
2019/04/24 23:26:42 - Main_Cons_Job_edw_cc_sf_accts_assets_feed - exec(1, 0, Call_Param_File.0) 
2019/04/24 23:26:42 - Call_Param_File - Starting job entry - blah blah blah...

How can we do this using awk?

"entire log is coming in single line": are you sure about that? Does the program really print its logs on a single line, or do you truncate the line feeds while assigning its output to a variable? - oguz ismail
Also are you REALLY doing echo $str instead of echo "$str"? That alone would make it appear like your text is all on one line as it'd convert every sequence of white space to a blank char as it'd be passing every non-blank string one at a time to echo (after performing globbing, etc.). See mywiki.wooledge.org/Quotes. - Ed Morton
@oguzismail: I am not entirely sure about that; it is possible that I am making a mistake. I will try Ed Morton's suggestion about the double quotes. Thanks for your input - Mateen Syed
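
To illustrate the quoting point raised in the comments above, here is a minimal sketch. The variable names demo, unquoted and quoted are made up for the demonstration, and the default IFS is assumed:

```shell
# Made-up two-line value with a run of blanks in the second line
demo='first line
second   line'

unquoted=$(echo $demo)     # word splitting: the newline and the blanks each collapse to one space
quoted=$(echo "$demo")     # the value reaches echo as a single, unchanged argument

printf 'unquoted: %s\n' "$unquoted"
printf 'quoted:   %s\n' "$quoted"
```

If the quoted form already prints the log on separate lines, the program's output was multi-line all along and no awk is needed.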

4 Answers

0
votes

Tried with GNU awk:

awk -v RS='([0-9]{2,4}/?){3}' '{printf "%s\n%s", $0, RT}' <<<"$str"

Tried with GNU sed:

sed -E 's/([0-9]{2,4}\/?){3}/\n&/g' <<<"$str"
0
votes

Could you please try the following (tested with the provided samples only):

echo "$str" | awk '{val=$1;$1="";gsub(/[0-9]+\/[0-9]+\/[0-9]+/,ORS "&");print val $0}'

EDIT: Adding @Corentin's version from the comments here too (gensub requires GNU awk):

echo "$str" | awk '{print gensub(/.([0-9\/]{10})/, "\n\\1", "g")}'
0
votes

Since it's April, and it's a bash string, then a bash substitution kludge might be sufficient:

echo "${str// 2019/$'\n'2019}"

Output:

2019/04/24 23:26:42 - Main_Cons_Job_edw_cc_sf_accts_assets_feed - Start of job execution
2019/04/24 23:26:42 - Main_Cons_Job_edw_cc_sf_accts_assets_feed - exec(0, 0, START.0)
2019/04/24 23:26:42 - START - Starting job entry
2019/04/24 23:26:42 - Main_Cons_Job_edw_cc_sf_accts_assets_feed - Starting entry [Call_Param_File]
2019/04/24 23:26:42 - Main_Cons_Job_edw_cc_sf_accts_assets_feed - exec(1, 0, Call_Param_File.0)
2019/04/24 23:26:42 - Call_Param_File - Starting job entry

Note: Since bash's string substitution is less versatile than sed's and awk's, this code would fail on New Year's Eve, because the substitution would miss lines starting with 2020/01/01. Provided the log messages don't contain the string " 20" (note the leading space), this might be good for the next 80 years:

echo "${str// 20/$'\n'20}"
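
If hard-coding the year or century feels too fragile, one possible variation is to match the whole timestamp with bash's regex operator instead. This is only a sketch: split_log is a hypothetical helper name, the sample string is made up, and it assumes every record starts with "YYYY/MM/DD HH:MM:SS - ". It peels one record off the end of the string per iteration, so it is not meant for very large logs.

```shell
#!/usr/bin/env bash
# split_log is a hypothetical helper, not part of the answer above.
# Assumption: every record starts with "YYYY/MM/DD HH:MM:SS - ".
split_log() {
    local rest=$1 out=
    local re='^(.*[^[:space:]])[[:space:]]+([0-9]{4}/[0-9]{2}/[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2} - .*)$'
    # The leading (.*) is greedy, so each match splits off the LAST record;
    # prepend it to the output and continue with what is left.
    while [[ $rest =~ $re ]]; do
        out=$'\n'"${BASH_REMATCH[2]}$out"
        rest=${BASH_REMATCH[1]}
    done
    printf '%s%s\n' "$rest" "$out"
}

# Made-up sample that crosses a year boundary
sample='2019/12/31 23:59:59 - JobA - last entry of the year 2020/01/01 00:00:00 - JobA - first entry of the year'
split_log "$sample"
```

Called as split_log "$str", it should print one record per line regardless of the year, at the cost of being slower than the one-line substitution.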
0
votes

Given this input:

$ str='2019/04/24 23:26:42 - Main_Cons_Job_edw_cc_sf_accts_assets_feed - Start of job execution 2019/04/24 23:26:42 - Main_Cons_Job_edw_cc_sf_accts_assets_feed - exec(0, 0, START.0) 2019/04/24 23:26:42 - START - Starting job entry 2019/04/24 23:26:42 - Main_Cons_Job_edw_cc_sf_accts_assets_feed - Starting entry [Call_Param_File] 2019/04/24 23:26:42 - Main_Cons_Job_edw_cc_sf_accts_assets_feed - exec(1, 0, Call_Param_File.0) 2019/04/24 23:26:42 - Call_Param_File - Starting job entry - blah blah blah...'

With GNU awk for multi-char RS and RT:

$ echo "$str" | awk -v RS='[0-9/]{10} [0-9:]{8} |\n' 'NR>1{print p $0} {p=RT}'
2019/04/24 23:26:42 - Main_Cons_Job_edw_cc_sf_accts_assets_feed - Start of job execution
2019/04/24 23:26:42 - Main_Cons_Job_edw_cc_sf_accts_assets_feed - exec(0, 0, START.0)
2019/04/24 23:26:42 - START - Starting job entry
2019/04/24 23:26:42 - Main_Cons_Job_edw_cc_sf_accts_assets_feed - Starting entry [Call_Param_File]
2019/04/24 23:26:42 - Main_Cons_Job_edw_cc_sf_accts_assets_feed - exec(1, 0, Call_Param_File.0)
2019/04/24 23:26:42 - Call_Param_File - Starting job entry - blah blah blah...
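
If GNU awk is not available, a similar result can probably be had with a plain match()/substr() loop, which does not depend on a multi-char RS or on RT. The following is a sketch under assumptions rather than a tested drop-in: split_log_awk is a hypothetical wrapper name, the sample string is made up, and a new record is assumed to begin wherever a blank is followed by "YYYY/MM/DD HH:MM:SS - ". The digits are spelled out one by one because not every awk supports {n} intervals.

```shell
# split_log_awk is a hypothetical wrapper name. Assumption: a new record
# begins wherever a blank is followed by "YYYY/MM/DD HH:MM:SS - ".
split_log_awk() {
    awk '
    BEGIN {
        d  = "[0-9]"
        re = " " d d d d "/" d d "/" d d " " d d ":" d d ":" d d " - "
    }
    {
        line = $0
        while (match(line, re)) {
            print substr(line, 1, RSTART - 1)   # text before the separating blank
            line = substr(line, RSTART + 1)     # continue right after that blank
        }
        print line                              # whatever is left is the last record
    }'
}

# Made-up sample input
sample='2019/04/24 23:26:42 - JobA - first message 2019/04/24 23:26:43 - JobB - second message'
printf '%s\n' "$sample" | split_log_awk
```

Usage would be echo "$str" | split_log_awk. Each input line is handled independently, so a log that already contains some line breaks keeps them.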