181
votes

Some time ago I added some files that must stay private. Removing them from the project is not a problem, but I also need to remove them from the Git history.

I use Git and GitHub (private account).

Note: Something similar is shown in this thread, but my case involves an old file that was added to a feature branch; that branch was merged into a development branch and finally into master, and a lot of changes have been made since then. So it is not the same situation: what I need is to rewrite the history and hide those files for privacy.

7
You would have to rewrite history. For example, git rebase and then git push -f. - Cory Kramer
The filter-branch method described in the suggested duplicate will do what you want. - 1615903
Also stackoverflow.com/a/17890278, which points to the BFG, which can be faster than using git filter-branch. - Hasturkun
@Hasturkun From what I read, that just runs faster and does the same thing, and it requires Java. - Marcos R. Guevara

7 Answers

217
votes

I have found this answer and it helped:

git filter-branch --index-filter "git rm -rf --cached --ignore-unmatch path_to_file" HEAD

Found it here https://myopswork.com/how-remove-files-completely-from-git-repository-history-47ed3e0c4c35
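
Before running this on a real project, it can help to try it in a scratch repository. The following is only a sketch: the repository, file names (README.txt, secret.txt) and commit messages are invented for the demo, and FILTER_BRANCH_SQUELCH_WARNING is set only to silence the deprecation notice that recent Git versions print.

```shell
#!/bin/sh
set -e

# Build a throwaway repository (every name here is made up for the demo).
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "hello" > README.txt
git add README.txt
git commit -q -m "Add README"

echo "password=hunter2" > secret.txt
git add secret.txt
git commit -q -m "Add secret by mistake"

echo "more" >> README.txt
git commit -q -am "Update README"

# Rewrite the current branch, dropping secret.txt from the index of every commit.
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch \
  --index-filter "git rm -rf --cached --ignore-unmatch secret.txt" HEAD

# List commits on the rewritten branch that still touch the file.
remaining=$(git log --oneline HEAD -- secret.txt)
echo "commits still touching secret.txt: ${remaining:-none}"
```

Note that the trailing HEAD limits the rewrite to the current branch; other branches and tags that contain the file are left as they were.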

79
votes

If you committed that file recently, or if it was changed in only one or two commits, then I'd suggest using rebase and cherry-pick to remove those particular commits.
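
As an illustration of the rebase route, here is one possible way to drop a single commit with git rebase --onto. It is a sketch in a throwaway repository with invented file names, and it assumes the offending commit does nothing except add the file and that no later commit touches it; otherwise the rebase can stop with conflicts.

```shell
#!/bin/sh
set -e

# Throwaway repository; file names and messages are invented for the demo.
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "hello" > README.txt
git add README.txt
git commit -q -m "Add README"

echo "password=hunter2" > secret.txt
git add secret.txt
git commit -q -m "Add secret by mistake"
bad=$(git rev-parse HEAD)   # the commit we want to drop

echo "print('hi')" > app.py
git add app.py
git commit -q -m "Add app"

# Replay everything after $bad onto the parent of $bad, skipping $bad itself.
git rebase -q --onto "${bad}^" "$bad"

remaining=$(git log --oneline HEAD -- secret.txt)
echo "commits still touching secret.txt: ${remaining:-none}"
```

The same result can be reached interactively with git rebase -i by deleting the line for that commit. Either way, the dropped commit stays reachable through the local reflog until it expires.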

Otherwise, you'd have to rewrite the entire history.

git filter-branch --tree-filter 'rm -f <path_to_file>' HEAD

Once you are satisfied with the changes and have verified that everything looks fine, you need to update all remote branches:

git push origin --force --all

Note: This is a complex operation, and you must be aware of what you are doing. Try it on a demo repository first to see how it works. You also need to let the other developers know about it, so that they don't make any changes in the meantime.

29
votes

git-filter-repo

Git recommends the third-party add-on git-filter-repo (the recommendation is printed when you run git filter-branch). There is a long list of reasons why it is better than the alternatives (https://github.com/newren/git-filter-repo#why-filter-repo-instead-of-other-alternatives); in my experience it is very simple and very fast.

This command removes the file from all commits in all branches:

git filter-repo --path <path to the file or directory> --invert-paths

Multiple paths can be specified by using multiple --path parameters. You can find detailed documentation here: https://www.mankier.com/1/git-filter-repo
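
A sketch of the multiple-path form, run against a scratch repository. The paths (secrets/, config/passwords.txt) are hypothetical. The demo passes --force only because git filter-repo otherwise refuses to run on a repository that does not look like a fresh clone, and it skips itself when git-filter-repo is not found on PATH.

```shell
#!/bin/sh
set -e

remaining=""
if command -v git-filter-repo >/dev/null 2>&1; then
  # Throwaway repository with two sensitive paths (names are made up).
  demo=$(mktemp -d)
  cd "$demo"
  git init -q .
  git config user.name "Demo User"
  git config user.email "demo@example.com"

  mkdir secrets config
  echo "key" > secrets/api.key
  echo "pw" > config/passwords.txt
  echo "hello" > README.txt
  git add .
  git commit -q -m "Initial commit with secrets"
  echo "more" >> README.txt
  git commit -q -am "Update README"

  # One --path per file or directory; --invert-paths means
  # "keep everything except these paths".
  git filter-repo --force --invert-paths \
    --path secrets/ --path config/passwords.txt

  remaining=$(git log --all --oneline -- secrets config/passwords.txt)
  echo "commits still touching the removed paths: ${remaining:-none}"
else
  echo "git-filter-repo not found on PATH; skipping demo"
fi
```

On a real project the usual approach is to run the command in a fresh clone instead of using --force. git filter-repo also removes the origin remote when it finishes, so the remote has to be added again before pushing.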

22
votes

Remove the file and rewrite history starting from the commit that added it (every commit from that point on gets a new hash):

There are two ways:

  1. Using git-filter-branch:

git filter-branch --force --index-filter 'git rm --cached --ignore-unmatch <path to the file or directory>' --prune-empty --tag-name-filter cat -- --all

  2. Using git-filter-repo:

pip3 install git-filter-repo
git filter-repo --path <path to the file or directory> --invert-paths

Now force-push the repository with git push origin --force --all and tell your collaborators to rebase.
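
To see what the force push does to a remote, the whole round trip can be rehearsed against a local bare repository. This is a sketch only: remote.git stands in for the GitHub remote, and the file, tag and commit names are invented. Because the filter-branch variant above also rewrites tags, the sketch force-pushes tags as a separate step.

```shell
#!/bin/sh
set -e

base=$(mktemp -d)
git init -q --bare "$base/remote.git"   # stands in for the hosted remote
git init -q "$base/work"
cd "$base/work"
git config user.name "Demo User"
git config user.email "demo@example.com"
git remote add origin "$base/remote.git"

echo "hello" > README.txt
echo "password=hunter2" > secret.txt
git add .
git commit -q -m "Initial commit (with secret)"
git tag v1.0
git push -q origin --all
git push -q origin --tags

# Rewrite all branches and tags locally.
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch --force \
  --index-filter 'git rm --cached --ignore-unmatch secret.txt' \
  --prune-empty --tag-name-filter cat -- --all

# Overwrite the remote branches, then the remote tags.
git push -q origin --force --all
git push -q origin --force --tags

# Ask the remote itself whether any ref still leads to the file.
remaining=$(git --git-dir="$base/remote.git" log --all --oneline -- secret.txt)
echo "remote commits still touching secret.txt: ${remaining:-none}"
```

Collaborators should rebase their work onto the rewritten branches and not merge, because a merge would bring the old commits, and the file with them, back into the history.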

16
votes

I read this GitHub article, which led me to the following command (similar to the accepted answer, but a bit more robust):

git filter-branch --force --index-filter "git rm --cached --ignore-unmatch PATH-TO-YOUR-FILE-WITH-SENSITIVE-DATA" --prune-empty --tag-name-filter cat -- --all
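
After a rewrite like this, the old commits are still stored locally: git filter-branch keeps backup refs under refs/original/, and the reflog still points at the pre-rewrite history. The sketch below (scratch repository, invented file names) shows one common cleanup sequence and checks that the sensitive blob is really gone afterwards.

```shell
#!/bin/sh
set -e

demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "hello" > README.txt
echo "password=hunter2" > secret.txt
git add .
git commit -q -m "Initial commit (with secret)"
blob=$(git rev-parse HEAD:secret.txt)   # object id of the sensitive content

FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch --force \
  --index-filter "git rm --cached --ignore-unmatch secret.txt" \
  --prune-empty --tag-name-filter cat -- --all

# 1. Delete the refs/original/* backups that filter-branch leaves behind.
git for-each-ref --format='delete %(refname)' refs/original | git update-ref --stdin
# 2. Expire every reflog entry so nothing references the old commits.
git reflog expire --expire=now --all
# 3. Garbage-collect and prune unreachable objects immediately.
git gc -q --prune=now

# cat-file -e exits non-zero when the object no longer exists.
if git cat-file -e "$blob" 2>/dev/null; then
  purged=no
else
  purged=yes
fi
echo "secret blob purged locally: $purged"
```

This only cleans the local repository. Other clones, forks and the hosting service may still hold the old objects, so any credential that was committed should be treated as compromised and rotated.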

8
votes

BFG Repo-Cleaner is another viable alternative to git filter-branch. Reportedly, it is also faster.

8
votes
  • First of all, add the file to your .gitignore and don't forget to commit the .gitignore :-)

  • You can use this site: http://gitignore.io to generate the .gitignore for you, then add the required path to your binary files/folder(s)

  • Once you have added the file to .gitignore, you can remove the "old" binary file from the history with BFG.
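
The .gitignore step can be sketched as follows, in a scratch repository with a hypothetical big-file.bin. One detail worth showing: .gitignore has no effect on a file that is already tracked, so the file also has to be removed from the index.

```shell
#!/bin/sh
set -e

demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "binary" > big-file.bin
git add big-file.bin
git commit -q -m "Add big file by mistake"

# Ignore the path from now on.
echo "big-file.bin" >> .gitignore
# Untrack the file; --cached keeps the copy in the working directory.
git rm -q --cached big-file.bin
# Commit the .gitignore together with the removal.
git add .gitignore
git commit -q -m "Stop tracking big-file.bin"

tracked=$(git ls-files big-file.bin)
echo "still tracked: ${tracked:-no}"
```

This only affects new commits. The file is still present in every earlier commit, which is why a history rewrite is needed afterwards.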


# How to remove big files from the repository

You can use git filter-branch or BFG. https://rtyley.github.io/bfg-repo-cleaner/

### BFG Repo-Cleaner: an alternative to git-filter-branch

The BFG is a simpler, faster alternative to git-filter-branch for cleansing bad data out of your Git repository history:

  • Removing Crazy Big Files

  • Removing Passwords, Credentials & other Private data

Examples (from the official site)

In all these examples bfg is an alias for java -jar bfg.jar.

# Delete all files named 'id_rsa' or 'id_dsa' :
bfg --delete-files id_{dsa,rsa}  my-repo.git
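
For context, the surrounding workflow recommended on the BFG site (mirror clone, run BFG, expire and garbage-collect, push) can be sketched as a shell function. The function name bfg_delete_file and its three arguments are made up for this sketch; nothing runs until you call it with the path to your downloaded jar, your repository URL and the file name to delete.

```shell
#!/bin/sh
# Hypothetical helper; arguments are supplied by the caller:
#   $1  path to the downloaded BFG jar
#   $2  URL (or path) of the repository to clean
#   $3  file name to delete from history, e.g. id_rsa
bfg_delete_file() {
  jar=$1
  url=$2
  name=$3
  work=$(mktemp -d) || return 1

  # BFG is meant to be run on a bare mirror clone, not on a working copy.
  git clone --mirror "$url" "$work/repo.git" || return 1

  # Delete every file with that name from the history. By default BFG
  # protects the latest commit, so the file should already be removed there.
  java -jar "$jar" --delete-files "$name" "$work/repo.git" || return 1

  # BFG rewrites commits; these commands drop the now-unreferenced objects.
  ( cd "$work/repo.git" &&
    git reflog expire --expire=now --all &&
    git gc --prune=now --aggressive ) || return 1

  echo "Cleaned mirror is in $work/repo.git; inspect it, then run 'git push' there."
}
```

A call would look like bfg_delete_file ./bfg.jar <your repository URL> id_rsa. The final push is left as a manual step on purpose, so the rewritten mirror can be checked before the remote is overwritten.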
