Edit: git does not mess with character encoding. I am leaving this question up to share knowledge and help others avoid the same mistake.
The context: My company uses an svn repository. I'm using git-svn as a client to interact with this repository. All text files in the project are (and must be) encoded with the Windows default encoding (cp-....). I use git-extensions, and sometimes the command line, to drive git.
What I did: Over the last 3 days, I worked on a new feature and made a number of local commits. Finally, I squashed all these commits into a single one using an interactive rebase, then used git svn dcommit to push everything to the svn repository as a single commit.
What happened then: A colleague told me that, after my commit, all accents were messed up in the files I had modified and in the new files I had added. I had already committed text files with accents to the same repository with the same git + svn installation, and this is the first time I have faced this issue.
My investigation: I opened the files with Notepad++ and tried the most common encodings (including the Windows default and UTF-8) to view them. None of them displayed the accents properly, and different accented characters are always rendered as the same sequence of strange glyphs.
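A minimal sketch of one way this symptom can arise. This is an assumption, not something confirmed for my files: if some tool reads cp1252 bytes as UTF-8, every accented byte that is invalid UTF-8 can be replaced by the same character, U+FFFD, and once the file is saved again the original accents are indistinguishable. The function names and sample strings below are hypothetical, chosen only for illustration.

```python
# Hypothetical illustration: the names and sample text are not taken from the real project.

# UTF-8 encoding of U+FFFD, the Unicode replacement character.
REPLACEMENT_UTF8 = "\ufffd".encode("utf-8")  # b'\xef\xbf\xbd'


def lossy_roundtrip(text: str) -> bytes:
    """Simulate a tool that reads cp1252 bytes as UTF-8 and saves the result as UTF-8."""
    original = text.encode("cp1252")
    # Bytes that are not valid UTF-8 are replaced with U+FFFD instead of raising an error.
    decoded = original.decode("utf-8", errors="replace")
    return decoded.encode("utf-8")


def count_replacements(data: bytes) -> int:
    """Count how many times the encoded replacement character occurs in raw file bytes."""
    return data.count(REPLACEMENT_UTF8)


damaged_e = lossy_roundtrip("é")
damaged_a = lossy_roundtrip("à")

# Two different accented letters end up as identical bytes.
print(damaged_e == damaged_a)

# What a viewer set to cp1252 would show for those bytes (printed with escapes to stay ASCII-safe).
print(ascii(damaged_e.decode("cp1252")))
```

To check a real file under the same assumption, its raw bytes could be read with `open(path, "rb").read()` and passed to `count_replacements`. A non-zero count would suggest the accents were replaced before the commit rather than merely displayed with the wrong encoding.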
The temporary workaround: I quickly created a revert commit with git-extensions and "dcommitted" it.
The question: My company's svn repository is OK now, but I still have the following two problems to solve:
- Understand what happened to the accented characters
- Retrieve my work from the SVN history and commit it properly (if possible, without manually reviewing every accented character)
Can anybody provide some clues? (I'm rather new to git.)