Complex matching across multiple lines

Question

I've been searching here and got close, but still haven't found quite what I'm trying to do. Please consider the following sample test input. The objective is to find matches that span multiple lines: each match starts with a line that contains "abc" (print this line), ends with a line that contains "efg" (also print this line), and includes the lines in between (print those too).

yyabc}
000
iiabc<
    {efg+1}
111
yyabc}
222
 p  {efg+13}
zzz
   z   {efg+243} {}
iii
oooabc>
ooo

The closest I have come, with zzz as the test input file containing the lines above, is

sed -e '/abc/,/efg/!d' zzz

but that includes extra lines that I would rather not have,

yyabc}   <<***** extra
000      <<***** extra
iiabc<
    {efg+1}
yyabc}
222
 p  {efg+13}
oooabc>  <<***** extra
ooo      <<***** extra

so the expected output is,

iiabc<
    {efg+1}
yyabc}
222
 p  {efg+13}

Besides relying on pcregrep (I have everything else on the Linux box), is there a solution that can produce such multi-line matches?

Thanks much.

John1024 · Accepted Answer · 2014-09-11T06:14:17

awk is well suited to this task. If your test input file is called zzz, then run:

$ awk '/abc/{a=""} /abc/,/efg/{a=a"\n"$0} /efg/{print substr(a,2);a=""}' zzz
iiabc<
    {efg+1}
yyabc}
222
 p  {efg+13}

Explanation:

/abc/{a=""}

Every time that a line containing "abc" is reached, set the variable a to an empty string. (The lines that we want to print will be added to this variable in the next step.)

/abc/,/efg/{a=a"\n"$0}

Over every range of lines that starts with a line containing abc and ends with a line containing efg, each line is appended to the variable a.

/efg/{print substr(a,2);a=""}

When the last line in the range is reached, print out a. Because a begins with an extra newline character, we use substr to remove it.

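Without the first step above, the program runs fine, but the leading "extra" lines (yyabc} and 000) would be printed, because a would keep every line since the range opened. With the first step included, a is reset at each abc line, so only the lines from the most recent abc onward survive. The trailing oooabc> and ooo lines are never printed either way, since no efg line follows them.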
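One edge case may be worth guarding against. As far as I can tell, the third rule above also fires on an efg line that falls outside any range (the {efg+243} line in the sample), and since a is empty at that point, print emits an empty line. The following is a minimal sketch of a variant that tracks the range with an explicit flag instead of awk's range pattern. The names buf and inblock are hypothetical, chosen only for illustration, and the temporary file stands in for zzz.

```shell
# Recreate the sample input from the question in a temporary file (stand-in for zzz).
input=$(mktemp)
cat > "$input" <<'EOF'
yyabc}
000
iiabc<
    {efg+1}
111
yyabc}
222
 p  {efg+13}
zzz
   z   {efg+243} {}
iii
oooabc>
ooo
EOF

# buf collects the candidate block; inblock is 1 only after an "abc" line has been seen
# and before the matching "efg" line. Both names are illustrative assumptions.
result=$(awk '
  /abc/           { buf = ""; inblock = 1 }                # restart at the most recent "abc" line
  inblock         { buf = (buf == "" ? $0 : buf "\n" $0) } # collect lines while a block is open
  inblock && /efg/ { print buf; inblock = 0 }              # block complete: print it and close
' "$input")

printf '%s\n' "$result"
rm -f "$input"
```

Because the print rule is guarded by inblock, an efg line with no preceding abc line produces no output at all, and the substr call is no longer needed since buf never starts with a newline.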
