689
votes

In a Git code repository I want to list all commits that contain a certain word. I tried this

git log -p | grep --context=4 "word"

but it does not necessarily give me back the filename (unless it's less that five lines away from the word I searched for. I also tried

git grep "word"

but it gives me only present files and not the history.

How do I search the entire history so I can follow changes on a particular word? I intend to search my codebase for occurrences of word to track down changes (search in files history).

8

8 Answers

1013
votes

If you want to find all commits where the commit message contains a given word, use

$ git log --grep=word

If you want to find all commits where "word" was added or removed in the file contents (to be more exact: where the number of occurrences of "word" changed), i.e., search the commit contents, use a so-called 'pickaxe' search with

$ git log -Sword

In modern Git there is also

$ git log -Gword

to look for differences whose added or removed line matches "word" (also commit contents).

Note that -G by default accepts a regex, while -S accepts a string, but it can be modified to accept regexes using the --pickaxe-regex.

To illustrate the difference between -S<regex> --pickaxe-regex and -G<regex>, consider a commit with the following diff in the same file:

+    return !regexec(regexp, two->ptr, 1, &regmatch, 0);
...
-    hit = !regexec(regexp, mf2.ptr, 1, &regmatch, 0);

While git log -G"regexec\(regexp" will show this commit, git log -S"regexec\(regexp" --pickaxe-regex will not (because the number of occurrences of that string did not change).


With Git 2.25.1 (Feb. 2020), the documentation is clarified around those regexes.

See commit 9299f84 (06 Feb 2020) by Martin Ågren (``). (Merged by Junio C Hamano -- gitster -- in commit 0d11410, 12 Feb 2020)

diff-options.txt: avoid "regex" overload in the example \

Reported-by: Adam Dinwoodie
Signed-off-by: Martin Ågren
Reviewed-by: Taylor Blau

When we exemplify the difference between -G and -S (using --pickaxe-regex), we do so using an example diff and git diff invocation involving "regexec", "regexp", "regmatch", etc.

The example is correct, but we can make it easier to untangle by avoiding writing "regex.*" unless it's really needed to make our point.

Use some made-up, non-regexy words instead.

The git diff documentation now includes:

To illustrate the difference between -S<regex> --pickaxe-regex and -G<regex>, consider a commit with the following diff in the same file:

+    return frotz(nitfol, two->ptr, 1, 0);
...
-    hit = frotz(nitfol, mf2.ptr, 1, 0);

While git log -G"frotz\(nitfol" will show this commit, git log -S"frotz\(nitfol" --pickaxe-regex will not (because the number of occurrences of that string did not change).

262
votes

git log's pickaxe will find commits with changes including "word" with git log -Sword

33
votes

After a lot of experimentation, I can recommend the following, which shows commits that introduce or remove lines containing a given regexp, and displays the text changes in each, with colours showing words added and removed.

git log --pickaxe-regex -p --color-words -S "<regexp to search for>"

Takes a while to run though... ;-)

11
votes

One more way/syntax to do it is: git log -S "word"
Like this you can search for example git log -S "with whitespaces and stuff @/#ü !"

11
votes

You can try the following command:

git log --patch --color=always | less +/searching_string

or using grep in the following way:

git rev-list --all | GIT_PAGER=cat xargs git grep 'search_string'

Run this command in the parent directory where you would like to search.

1
votes

vim-fugitive is versatile for that kind of examining in Vim.

Use :Ggrep to do that. For more information you can install vim-fugitive and look up the turorial by :help Grep. And this episode: exploring-the-history-of-a-git-repository will guide you to do all that.

1
votes

To use a Boolean connector on a regular expression:

git log --grep '[0-9]*\|[a-z]*'

This regular expression searches for the regular expression [0-9]* or [a-z]* in commit messages.

-1
votes

If you want search for sensitive data in order to remove it from your Git history (which is the reason why I landed here), there are tools for that. GitHub as a dedicated help page for that issue.

Here is the gist of the article:

The BFG Repo-Cleaner is a faster, simpler alternative to git filter-branch for removing unwanted data. For example, to remove your file with sensitive data and leave your latest commit untouched), run:

bfg --delete-files YOUR-FILE-WITH-SENSITIVE-DATA

To replace all text listed in passwords.txt wherever it can be found in your repository's history, run:

bfg --replace-text passwords.txt

See the BFG Repo-Cleaner's documentation for full usage and download instructions.