0
votes

I am trying to find a way to extract the text that lies between one string and another within a larger text.

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Phasellus massa nulla, lobortis sit amet placerat hendrerit, mollis quis nulla. Morbi consectetur, odio vel rhoncus euismod, nunc nisi euismod ante, vitae molestie ante nulla non est.

Vivamus eget fermentum lorem, sed suscipit nulla. Aliquam consequat ultrices maximus. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas.

Etiam vitae tortor quis lectus convallis ullamcorper. Nullam nec dignissim tellus, vel dictum nisi. Etiam sit amet libero vulputate, eleifend libero nec, semper ex. Cras eu magna fringilla, iaculis sapien id, feugiat lorem. Ut id velit mauris.

I'd like to know if there is a way, on the command line, to extract what's in between the bolded words (amet and tellus) in the paragraphs above. I've tried different variations of grep without any success.

4
Is the sample data all on one line, or is it broken up by line feeds? Simple if it is all on one line, not so simple otherwise. Also, please include some of your grep attempts; otherwise we are expected to guess. Good luck. - shellter
I tried grep -e 'amet .* tellus' myfile but it didn't work. I forgot to mention that there are newlines in between. - TRod

4 Answers

3
votes

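You can use bash parameter expansion:

#!/bin/bash

string=$( <dat/lorem.txt )   # read the whole file into a variable
tmp=${string#*amet}          # strip everything up to and including the first "amet"
tmp=${tmp%tellus*}           # strip everything from the last "tellus" onward

echo $tmp                    # unquoted, so line breaks collapse to spaces; use echo "$tmp" to keep them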

Output:

$ string=$( <dat/lorem.txt ); tmp=${string#*amet}; tmp=${tmp%tellus*}; echo $tmp
, consectetur adipiscing elit. Phasellus massa nulla, lobortis sit amet placerat hendrerit,
mollis quis nulla. Morbi consectetur, odio vel rhoncus euismod, nunc nisi euismod ante,
vitae molestie ante nulla non est. Vivamus eget fermentum lorem, sed suscipit nulla.
Aliquam consequat ultrices maximus. Pellentesque habitant morbi tristique senectus et
netus et malesuada fames ac turpis egestas. Etiam vitae tortor quis lectus convallis
ullamcorper. Nullam nec dignissim
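
As a variation on the same idea, here is a minimal sketch that wraps the two expansions in a function and quotes the result so the line breaks are kept. The function name between and the sample text are made up for illustration, and it assumes that the first occurrence of the start word and the last occurrence of the end word are the ones wanted.

```shell
# between START END: hypothetical helper, reads text on standard input and
# prints what lies between the first START and the last END.
between() (
    text=$(cat)              # slurp stdin (trailing newlines are dropped)
    text=${text#*"$1"}       # drop the shortest prefix that ends with START
    text=${text%"$2"*}       # drop the shortest suffix that begins with END
    printf '%s\n' "$text"    # quoted, so embedded newlines are preserved
)

printf 'one amet two\nthree\nfour tellus five\n' | between amet tellus
```

The sample call prints three lines; the first starts with the space that followed amet, and the last ends with the space that preceded tellus. If either word is missing, the corresponding expansion leaves the text unchanged.
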
2
votes

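With a single sed command (the leading .* is greedy, so the match starts at the last amet that is still followed by tellus):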

sed -n ':a;$!{N;ba};s/.*\(amet.*tellus\).*/\1/p' infile

amet placerat hendrerit, mollis quis nulla. Morbi consectetur, odio vel rhoncus euismod, nunc nisi euismod ante, vitae molestie ante nulla non est.

Vivamus eget fermentum lorem, sed suscipit nulla. Aliquam consequat ultrices maximus. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas.

Etiam vitae tortor quis lectus convallis ullamcorper. Nullam nec dignissim tellus

If you don't need the keywords in the output:

sed -n ':a;$!{N;ba};s/.*amet\(.*\)tellus.*/\1/p' infile

 placerat hendrerit, mollis quis nulla. Morbi consectetur, odio vel rhoncus euismod, nunc nisi euismod ante, vitae molestie ante nulla non est.

Vivamus eget fermentum lorem, sed suscipit nulla. Aliquam consequat ultrices maximus. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas.

Etiam vitae tortor quis lectus convallis ullamcorper. Nullam nec dignissim
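
To see which amet the greedy leading .* ends up choosing, here is a small sketch on made-up sample text. It uses the same script as the first command in this answer, split into separate -e fragments on the assumption that this form is more portable across sed implementations. The function name span_from_last and the sample are hypothetical.

```shell
# span_from_last: hypothetical wrapper around the sed script from the answer.
# Joins all input lines into the pattern space, then keeps the span from the
# last "amet" that is followed by "tellus" through the last "tellus".
span_from_last() {
    sed -n -e ':a' -e '$!{' -e 'N' -e 'ba' -e '}' \
        -e 's/.*\(amet.*tellus\).*/\1/p'
}

# "amet" occurs twice before "tellus"; the greedy .* skips past the first one.
printf 'x amet one amet two\nthree tellus y\n' | span_from_last
```

On this sample the result starts at the second amet, which is why this answer's output begins at "amet placerat" rather than at the first amet in the file.
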
1
votes

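Use sed for this one:

sed -n '/amet/,/tellus/p' atext.txt | sed '1s/.*amet/amet/;$s/tellus.*/tellus/'

Output should look like:

amet ...(everything in between)...tellus

The first sed deletes all lines except the range that runs from the first line containing amet through the next line containing tellus.

The second sed deletes everything before amet on the first of those lines, and everything after tellus on the last. Restricting the substitutions with the 1 and $ addresses matters for this sample: the line containing tellus also contains a later amet, so an unrestricted s/.*amet/amet/ would delete tellus along with everything before that amet.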
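
A small sketch of the two stages on made-up lines may help. The sample is hypothetical and deliberately repeats amet after tellus on the closing line. The second stage here limits each substitution to the first or last line of the selected range, which is one way (an assumption on my part, not the only option) to keep the trimming from touching the wrong occurrence.

```shell
# trim_range: hypothetical helper. Stage 1 keeps the lines from the first one
# containing "amet" through the next one containing "tellus"; stage 2 trims
# the first line up to "amet" and the last line after "tellus".
trim_range() {
    sed -n '/amet/,/tellus/p' |
        sed -e '1s/.*amet/amet/' -e '$s/tellus.*/tellus/'
}

printf '%s\n' 'before' 'start amet a' 'middle' 'b tellus, sit amet end' 'after' |
    trim_range
```

Note that a range address can start again if a later line contains amet, so this sketch only suits input with a single amet-to-tellus span.
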

1
votes

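You can use a regular expression:

grep -e 'amet .* tellus' yourfile

Here . matches any character except a line break, and * means zero or more times. Because grep matches one line at a time, this only finds a match when amet and tellus are on the same line.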
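
Since grep works line by line, one workaround is to flatten the line breaks before matching. This is only a sketch: it assumes a grep that supports -o (GNU and BSD grep do), it discards the original line breaks, and the function name flat_between and the sample text are invented for the example.

```shell
# flat_between: hypothetical helper. Turns every newline into a space so the
# whole input becomes one line, then prints only the part that matched.
flat_between() {
    tr '\n' ' ' | grep -o 'amet .* tellus'
}

printf 'x amet one\ntwo\nthree tellus y\n' | flat_between
```

The pattern still requires a space after amet and before tellus, so an occurrence such as "amet," is not treated as the starting point.
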