0
votes

I am trying to delete the lines in a text file that contain any word from a list. For example:

File 1:

xxx yyy, zzz,
aaa bbb, sss,
ccc fff, zzz,
rrr www, qasd,

File 2:

xxx
zzz
rrr

The goal is to delete the lines in file1 that contain any word in file2. So the output should be:

aaa bbb, sss,

I know how to use sed with a single word, e.g. sed '/zzz/d' to delete lines containing zzz. But how can I do this with multiple words, or with words read from a file?

2 Answers

2
votes

You can do this easily with grep:

$ grep -Fwvf file2 file1
aaa bbb, sss,

Options:

-f FILE, --file=FILE

Obtain patterns from FILE, one per line. The empty file contains zero patterns, and therefore matches nothing. (-f is specified by POSIX.)

-v, --invert-match

Invert the sense of matching, to select non-matching lines. (-v is specified by POSIX.)

-w, --word-regexp

Select only those lines containing matches that form whole words. The test is that the matching substring must either be at the beginning of the line, or preceded by a non-word constituent character. Similarly, it must be either at the end of the line or followed by a non-word constituent character. Word-constituent characters are letters, digits, and the underscore.

-F, --fixed-strings

Interpret PATTERN as a list of fixed strings, separated by newlines, any of which is to be matched. (-F is specified by POSIX.)
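
To see how the options combine, here is a minimal sketch that recreates the two files from the question in a throwaway directory and compares the result with and without -w. The extra line `xxxx kept,` and the variable names are my own additions for illustration; `mktemp -d` is assumed to be available (it is on GNU and BSD systems, but it is not specified by POSIX).

```shell
#!/bin/sh
# Work in a throwaway directory so no real files are touched.
dir=$(mktemp -d) || exit 1

# file1 from the question, plus one extra line (added for illustration)
# whose first word only contains "xxx" as a substring.
printf '%s\n' 'xxx yyy, zzz,' 'aaa bbb, sss,' 'ccc fff, zzz,' \
    'rrr www, qasd,' 'xxxx kept,' > "$dir/file1"
printf '%s\n' 'xxx' 'zzz' 'rrr' > "$dir/file2"

# With -w, a line is dropped only if a listed string appears as a whole word.
with_w=$(grep -Fwvf "$dir/file2" "$dir/file1")

# Without -w, a substring match is enough to drop the line.
without_w=$(grep -Fvf "$dir/file2" "$dir/file1")

printf 'with -w:\n%s\n\n' "$with_w"
printf 'without -w:\n%s\n' "$without_w"

rm -rf "$dir"
```

With -w the line `xxxx kept,` survives because `xxx` is not a whole word there; without -w it is removed along with the others.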

To store the changes back to file1:

$ grep -Fwvf file2 file1 > tmp && mv tmp file1
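
One caveat with the redirect-and-move approach: grep exits with status 1 when it selects no lines, so if every line of file1 matches, the `&&` branch is skipped and file1 is left unchanged. A hedged sketch that accepts status 1 as a valid result and only aborts on a real error (status 2 or higher) could look like the following. The function name `filter_in_place` is hypothetical, and `mktemp` is assumed to be available.

```shell
# filter_in_place WORDS TARGET
# Remove from TARGET every line that contains a whole word listed in WORDS.
# Note: plain sh has no "local", so these variables are global.
filter_in_place() {
    words=$1
    target=$2
    tmp=$(mktemp) || return 1

    status=0
    grep -Fwvf "$words" "$target" > "$tmp" || status=$?

    # 0 = some lines kept, 1 = no lines kept (still a valid result),
    # 2 or more = grep reported an error, so leave TARGET untouched.
    if [ "$status" -gt 1 ]; then
        rm -f "$tmp"
        return "$status"
    fi

    # Copy the contents back instead of mv, so TARGET keeps its permissions.
    rc=0
    cat "$tmp" > "$target" || rc=$?
    rm -f "$tmp"
    return "$rc"
}
```

Usage would then be `filter_in_place file2 file1`.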

1
votes

Try this:

grep -vFwf file2 file1