0
votes

I have a text file that looks like this:

    1
    bbbbb
    aaa
    END
    2
    ttttt
    mmmm
    uu
    END
    3
    ....
    END

The number of lines between each single-number pattern (1, 2, 3) and END is variable, so the upper delimiting pattern changes but the final one does not. Using bash commands, I would like to grep the lines between a specified upper pattern and the corresponding END. For example, a command that takes 2 as input should return:

    2
    ttttt
    mmmm
    uu
    END

I've tried various solutions with sed and awk, but still can't figure it out. The main problem is that I may need to grep an entry in the middle of the file, so I can't use sed with /pattern/q. Any help will be greatly appreciated!

5 Answers

1
votes

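With awk, set a flag f when a line matches the start pattern, which is passed in as the variable p. From that line on the flag is set and every line is printed. On reaching END while the flag is set, awk exits. The match is anchored so that p=2 selects only a line consisting of 2 (optionally indented); an unanchored $0~p would also match lines such as 12 or 20.

awk -v p=2 '$0 ~ ("^[ \t]*" p "$") {f=1} f {print} f && /^[ \t]*END$/ {exit}' file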
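
To take the start number and the file name as parameters, the flag technique can be wrapped in a shell function. This is a minimal sketch: the function name extract_block and the field-based comparison (a line with exactly one field) are assumptions made for illustration, not part of the answer above.

```shell
# extract_block NUMBER FILE   (hypothetical helper name)
# Print the block that starts at a line whose only field is NUMBER
# and ends at the next line whose only field is END.
extract_block() {
    awk -v p="$1" '
        # Turn the flag on at the requested header; appending "" forces
        # a string comparison, so 2 does not match 02 or 2.0.
        !f && NF == 1 && ($1 "") == (p "") { f = 1 }
        # While the flag is on, print every line.
        f { print }
        # Stop reading as soon as the closing END has been printed.
        f && NF == 1 && $1 == "END" { exit }
    ' "$2"
}
```

Calling `extract_block 2 file` would then print the block for entry 2, and a header such as 12 is not mistaken for 2 because the whole field is compared.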
0
votes

Use sed with an address range to print only the part of the file between the two patterns:

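#!/bin/bash
start=x
# Keep prompting until the input is a non-empty string of digits.
while [[ -z $start || $start = *[^0-9]* ]] ; do
    read -r -p 'Enter the start pattern: ' start || exit 1
done
# Print from the line that is exactly $start through the next line that is exactly END.
sed -n "/^$start$/,/^END$/p" file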
0
votes

You can use sed with an address range. Modify the first regular expression (RE1) in /RE1/,/RE2/ to suit your needs:

sed -n '/^[[:space:]]*2$/,/^[[:space:]]*END$/p' file

Or,

sed '
    /^[[:space:]]*2$/,/^[[:space:]]*END$/!d
    /^[[:space:]]*END$/q
' file

This version quits as soon as it reads the END that closes the range, so it does not scan the rest of the file and may be more efficient on large inputs.
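
The range-plus-quit idea can also be parameterised. The following is a sketch only: print_block is a hypothetical name, and the digit check is an added assumption so that the argument can be placed inside the regular expression safely.

```shell
# print_block NUMBER FILE   (hypothetical helper name)
print_block() {
    # Accept only a non-empty string of digits as the start pattern.
    case $1 in
        ''|*[!0-9]*) echo "usage: print_block NUMBER FILE" >&2; return 2 ;;
    esac
    # Inside the range, print each line and quit at the closing END.
    sed -n "/^[[:space:]]*$1\$/,/^[[:space:]]*END\$/{
        p
        /^[[:space:]]*END\$/q
    }" "$2"
}
```

For example, `print_block 2 file` prints entry 2 and stops reading at its END, while a non-numeric argument is rejected with a non-zero exit status.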

0
votes

Another option, using just bash:

#!/usr/bin/env bash

start=$1

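while IFS= read -r line; do
  # Strip everything up to the last space so space-indented lines compare cleanly.
  key=${line##* }
  # Start printing at the requested number; the quotes force a literal comparison.
  [[ $key == "$start" ]] && print=on
  if [[ $print == on ]]; then
    printf '%s\n' "$line"
    # Stop at the END that closes the block.
    [[ $key == END ]] && break
  fi
done < file.txt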

Run the script with the number as the argument, which can be 1, 2, 3, and so on:

./myscript 1
0
votes

This might work for you (GNU sed):

sed -n '/^\s*2$/{:a;N;/^\s*END$/M!ba;p;q}' file

Switch off implicit printing by setting the -n option.

Starting at a line that consists of 2 (optionally indented), gather up the following lines until one consists of END, then print the collection and quit.

N.B. The second regexp uses the M flag, which allows the ^ and $ to match start and end of lines when multiple lines are being matched. Another thing to bear in mind is that using a range i.e. sed -n '/start/,/end/p' file, will start printing lines the moment the first condition is met and if the second match does not materialise, it will continue printing to the end of the file.
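
The difference described in the note can be checked with a small experiment. This sketch assumes GNU sed (for \s and the M flag) and uses a made-up sample in which entry 2 has no closing END.

```shell
# Sample input where block 2 is never closed by END (illustrative data only).
sample=$(mktemp)
printf '%s\n' 1 aaa END 2 ttttt mmmm 3 zzz > "$sample"

# Address range: opens at 2 and, with no END to close it, runs to the end of the file.
range_out=$(sed -n '/^[[:space:]]*2$/,/^[[:space:]]*END$/p' "$sample")

# Gathering loop (GNU sed): N runs out of input before END is seen,
# and with -n nothing is printed on exit.
gather_out=$(sed -n '/^\s*2$/{:a;N;/^\s*END$/M!ba;p;q}' "$sample")

printf 'range output:\n%s\n' "$range_out"
printf 'gather output:\n%s\n' "$gather_out"
rm -f "$sample"
```

With this input the range form prints everything from 2 to the last line, including the header of entry 3, whereas the gathering form prints nothing.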