144
votes

I want to search for files containing DOS line endings with grep on Linux. Something like this:

grep -IUr --color '\r\n' .

The above seems to match for literal rn which is not what is desired.

The output of this will be piped through xargs into fromdos to convert CRLF to LF, like this:

grep -IUrl --color '^M' . | xargs -ifile fromdos 'file'
9
Have you tried dos2unix? It fixes line endings automatically. - sblundy
I'm not quite sure, but IIRC there's a difference between quoting the pattern in ' and in ". AFAIK, in patterns enclosed in ' the escape sequences are taken as a literal string, so '\r' would be equivalent to "\\r", and "\r" has no equivalent (at least in that notation) with '. - Anticom
Anticom: You're correct that in this case the difference between ' and " is irrelevant; in general, however, they are distinct: strings in ' are strong quoted and strings in " are weak quoted. The biggest thing I take advantage of is that $ expansions and `` don't expand in strong-quoted (single-quoted) strings. See bash-hackers on quoting for more. - bschlueter
The easiest way is to use a modern dos2unix with the -ic switch. For LF files you can search with unix2dos -ic. It doesn't modify files; it only reports. - gavenkoa
Since this is a top answer for any question regarding Windows line endings/carriage returns on Linux, I think it's worth noting that you can see them in the terminal with the command cat -v somefile.txt; they show up as ^M. - user5359531

9 Answers

186
votes

grep probably isn't the tool you want for this. Without -l, it will print a line for every matching line in every file. Unless you want to, say, run fromdos 10 times on a 10-line file, grep isn't the best way to go about it. Using find to run file on every file in the tree and then grepping the result for "CRLF" will get you one line of output for each file that has DOS-style line endings:

find . -not -type d -exec file "{}" ";" | grep CRLF

will get you something like:

./1/dos1.txt: ASCII text, with CRLF line terminators
./2/dos2.txt: ASCII text, with CRLF line terminators
./dos.txt: ASCII text, with CRLF line terminators
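
If you only want the file names, without the description that file appends, something like the following may work. It is only a sketch: list_crlf_files is a made-up helper name, and it assumes your file command supports -b (brief output, no file name prefix) and describes such files with the word CRLF, as in the output above.

```shell
# list_crlf_files [DIR]: print the path of every regular file under DIR
# (default: current directory) that file(1) reports as having CRLF terminators.
list_crlf_files() {
    find "${1:-.}" -type f -exec sh -c '
        for f in "$@"; do
            # -b: print only the description, so a path containing "CRLF" cannot match
            if file -b "$f" | grep -q "CRLF"; then
                printf "%s\n" "$f"
            fi
        done
    ' sh {} +
}
```

Calling list_crlf_files . should then print one path per line, which is easier to feed to a converter than the full file output.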
130
votes

Use Ctrl+V, Ctrl+M to enter a literal Carriage Return character into your grep string. So:

grep -IUr --color "^M"

will work - if the ^M there is a literal CR that you input as I suggested.

If you want the list of files, you want to add the -l option as well.

Explanation

  • -I ignore binary files
  • -U prevents grep from stripping CR characters. By default it strips them if it decides the file is a text file.
  • -r read all files under each directory recursively.
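
If typing the literal character is awkward, the same idea can be scripted. The following is a rough sketch, not a drop-in replacement for the command above: list_crlf and fix_crlf are hypothetical helper names, the CR is produced with printf instead of Ctrl+V, Ctrl+M, and sed -i stands in for fromdos. It assumes GNU grep (-Z), GNU xargs (-0, -r) and GNU sed (-i).

```shell
cr=$(printf '\r')    # a literal carriage return, no Ctrl+V Ctrl+M needed

# list_crlf [DIR]: print NUL-terminated names of non-binary files under DIR
# that contain at least one line ending in CR.
list_crlf() {
    grep -IlrZ "${cr}\$" "${1:-.}"
}

# fix_crlf [DIR]: strip the trailing CR from every line of those files, in place.
# sed -i is used here as a stand-in for fromdos/dos2unix.
fix_crlf() {
    list_crlf "${1:-.}" | xargs -0 -r sed -i "s/${cr}\$//"
}
```

The NUL-terminated list (-Z with xargs -0) is there so that file names containing spaces survive the pipe.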
63
votes

Using ripgrep (in POSIX shells such as bash, quote the last argument; otherwise the shell strips the backslash and the pattern becomes a plain r):

rg -l '\r'

-l, --files-with-matches
    Only print the paths with at least one match.

https://github.com/BurntSushi/ripgrep

18
votes

If your version of grep supports -P (--perl-regexp) option, then

grep -lUP '\r$'

could be used.

9
votes
# list files containing dos line endings (CRLF)

cr="$(printf "\r")"    # alternative to ctrl-V ctrl-M

grep -Ilsr "${cr}$" . 

grep -Ilsr $'\r$' .   # yet another & even shorter alternative
3
votes

You can use the file command on Unix. It reports the file's character encoding along with its line terminators.

$ file myfile
myfile: ISO-8859 text, with CRLF line terminators
$ file myfile | grep -ow CRLF
CRLF  
3
votes

dos2unix has a file information option which can be used to show the files that would be converted:

dos2unix -ic /path/to/file

To do that recursively you can use bash’s globstar option, which for the current shell is enabled with shopt -s globstar:

dos2unix -ic **      # all files recursively
dos2unix -ic **/file # files called “file” recursively

Alternatively you can use find for that:

find -type f -exec dos2unix -ic {} +            # all files recursively (ignoring directories)
find -name file -exec dos2unix -ic {} + # files called “file” recursively
2
votes

The question was about searching, and I have a similar issue: somebody committed mixed line endings to version control, so now we have a bunch of files with 0x0d 0x0d 0x0a line endings. Note that

grep -P '\x0d\x0a'

finds all lines, whereas

grep -P '\x0d\x0d\x0a'

and

grep -P '\x0d\x0d'

find no lines, so there may be something "else" going on inside grep when it comes to line ending patterns... unfortunately for me!
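
One way to check what a file really contains, without relying on -P, might be to count the lines by ending using a literal CR. This is only a sketch and does not explain the behaviour described above; count_endings is a made-up name, and it assumes a grep that leaves CR characters alone, as GNU grep does on Linux.

```shell
cr=$(printf '\r')    # a literal carriage return

# count_endings FILE: print how many lines end in CR CR LF, CR LF, and plain LF.
# The body runs in a subshell so the helper variables do not leak.
count_endings() (
    total=$(grep -c '' "$1" || true)                    # every line
    any_cr=$(grep -c "${cr}\$" "$1" || true)            # lines ending in at least one CR
    double_cr=$(grep -c "${cr}${cr}\$" "$1" || true)    # lines ending in CR CR
    printf 'crcrlf=%d crlf=%d lf=%d\n' \
        "$double_cr" "$((any_cr - double_cr))" "$((total - any_cr))"
)
```

A non-zero crcrlf count would confirm the doubled carriage returns mentioned above.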

1
votes

If, like me, your minimalist unix doesn't include niceties like the file command, and backslashes in your grep expressions just don't cooperate, try this:

$ for file in `find . -type f` ; do
> dump $file | cut -c9-50 | egrep -m1 -q ' 0d| 0d'
> if [ $? -eq 0 ] ; then echo $file ; fi
> done

Modifications you may want to make to the above include:

  • tweak the find command to locate only the files you want to scan
  • change the dump command to od or whatever file dump utility you have
  • confirm that the cut command includes both a leading and trailing space as well as just the hexadecimal character output from the dump utility
  • limit the dump output to the first 1000 characters or so for efficiency

For example, something like this may work for you using od instead of dump:

 od -t x2 -N 1000 $file | cut -c8- | egrep -m1 -q ' 0d| 0d|0d$'
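
If your od supports -A n and -t x1, a variant with one byte per column avoids having to guess which half of a two-byte word holds the 0d. The following is a sketch under that assumption: has_cr and list_cr_files are hypothetical names, and like the loop above it detects any carriage return in the first 1000 bytes, not specifically CR followed by LF.

```shell
# has_cr FILE: succeed if a 0d byte occurs in the first 1000 bytes of FILE.
has_cr() {
    # -A n: no address column; -v: do not collapse repeated lines; -t x1: one byte per column
    od -A n -v -t x1 -N 1000 "$1" | grep -q -w '0d'
}

# list_cr_files [DIR]: print every regular file under DIR for which has_cr succeeds.
# File names containing newlines are not handled.
list_cr_files() {
    find "${1:-.}" -type f | while IFS= read -r f; do
        if has_cr "$f"; then
            printf '%s\n' "$f"
        fi
    done
}
```

Reading the names with while read instead of for file in `find ...` keeps file names with spaces intact.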