3
votes

We have many XML files, and most of them are UTF-16 encoded.

When I add a file to Subversion using SmartSVN, it always gets svn:mime-type=application/octet-stream.

This prevents SmartSVN's visual diff tool from working on these files, which is very annoying.

From the SVN FAQ: http://subversion.apache.org/faq.html#binary-files

When you first add or import a file into Subversion, the file is examined to determine if it is a binary file. Currently, Subversion just looks at the first 1024 bytes of the file; if any of the bytes are zero, or if more than 15% are not ASCII printing characters, then Subversion calls the file binary. This heuristic might be improved in the future, however.

This is very stupid when UTF-16 files are used, because in most cases about 50% of their bytes are zero.

I also read that there is a way to set properties automatically from the SVN client: http://www.mediawiki.org/wiki/Subversion/auto-props

Does this also allow removing the auto-detected binary MIME type?
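To illustrate why UTF-16 trips the check, here is a minimal Python sketch of the heuristic as the FAQ describes it. It is not Subversion's actual code: the function name `looks_binary` and the exact set of bytes counted as "ASCII printing characters" are assumptions made for this example.

```python
def looks_binary(data, sample_size=1024, threshold=0.15):
    """Approximate the heuristic quoted from the FAQ (illustrative only)."""
    sample = data[:sample_size]
    if not sample:
        return False  # assumption: an empty file is treated as text
    if 0 in sample:
        return True   # any zero byte in the sample means "binary"
    # Assumption: printable ASCII (0x20-0x7E) plus tab, LF, FF and CR count as text.
    printable = set(range(0x20, 0x7F)) | {0x09, 0x0A, 0x0C, 0x0D}
    non_printing = sum(1 for b in sample if b not in printable)
    return non_printing / len(sample) > threshold


xml = '<?xml version="1.0"?>\n<root>hello</root>\n'
print(looks_binary(xml.encode("utf-8")))   # False: plain ASCII bytes
print(looks_binary(xml.encode("utf-16")))  # True: every ASCII character carries a zero byte
```

The UTF-16 case fails on the zero-byte rule alone, before the 15% rule is even consulted.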

Is it possible to set this on the repository / svn server somehow, so I don't have to set it on every workstation?


2 Answers

3
votes
  1. You can't redefine the MIME type for UTF-16 XML files only (but you can for all XML files)
  2. I don't know a good way of redefining the MIME type on the server or on a per-repository basis

If the above points do not put you off, you can redefine the MIME type globally (per client host) in the client's Subversion config file, %AppData%\Subversion\config, in the [auto-props] section. Note that auto-props only take effect when enable-auto-props = yes is set in the [miscellany] section of the same file. Something like:

*.xml = svn:mime-type=text/xml
2
votes

svn:mime-type is an SVN property, so you should be able to modify it. If you set it on one machine and commit it, the change should show up on the other machines after they update.

However, the "binary-file paradigm" is a strong part of SVN's internal workings, especially the commit algorithm, which currently cannot be changed. As far as I remember from the SVN 1.6 documentation, the diffs are actually stored in binary form for every file in the repo. So I am not sure whether you can change how the MIME type is automatically applied.

You can use a hook (maybe post-commit?) to detect committed files based on some criteria and apply a property change to those files after they have been committed. You can certainly use the hooks subsystem to do this, with some coding of course. (I do not know which SVN version you're using, so I provided the link for the latest stable version, the 1.7 book.)
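As a rough sketch of the hook idea, here is a Python pre-commit script. Note that it is a variation on what is described above: instead of changing properties after the commit, it rejects commits in which an *.xml file carries the binary MIME type, because the Subversion book advises against modifying a transaction from a hook. The helper name `changed_xml_paths` and the *.xml criterion are assumptions for this example; the `svnlook changed` and `svnlook propget` subcommands are the standard ones.

```python
import subprocess
import sys

BINARY_TYPE = "application/octet-stream"


def changed_xml_paths(svnlook_changed_output):
    """Return added or modified *.xml paths from `svnlook changed` output."""
    paths = []
    for line in svnlook_changed_output.splitlines():
        if len(line) < 5:
            continue
        # Three status columns, one space, then the path.
        status, path = line[:3], line[4:]
        if status[0] != "D" and path.lower().endswith(".xml"):
            paths.append(path)
    return paths


def main(repos, txn):
    changed = subprocess.run(
        ["svnlook", "changed", "-t", txn, repos],
        capture_output=True, text=True, check=True,
    ).stdout
    offenders = []
    for path in changed_xml_paths(changed):
        # propget exits non-zero when the property is not set on the path.
        result = subprocess.run(
            ["svnlook", "propget", "-t", txn, repos, "svn:mime-type", path],
            capture_output=True, text=True,
        )
        if result.returncode == 0 and result.stdout.strip() == BINARY_TYPE:
            offenders.append(path)
    if offenders:
        sys.stderr.write(
            "XML files marked as binary; set svn:mime-type to a text type:\n"
            + "\n".join(offenders) + "\n"
        )
        return 1  # a non-zero exit status makes Subversion reject the commit
    return 0


# Subversion calls pre-commit hooks with two arguments: REPOS-PATH and TXN-NAME.
if __name__ == "__main__" and len(sys.argv) == 3:
    sys.exit(main(sys.argv[1], sys.argv[2]))
```

Installed as the repository's pre-commit hook, this would apply to every client without per-workstation configuration, at the cost of users having to fix the property themselves before committing.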

Oh, and a little copy/paste from the SVN docs:

To determine whether a contextual merge is possible, Subversion examines the svn:mime-type property. If the file has no svn:mime-type property, or has a MIME type that is textual (e.g., text/*), Subversion assumes it is text. Otherwise, Subversion assumes the file is binary. Subversion also helps users by running a binary-detection algorithm in the svn import and svn add commands. These commands will make a good guess and then (possibly) set a binary svn:mime-type property on the file being added. (If Subversion guesses wrong, the user can always remove or hand-edit the property.)

So the short answer is: you may not be able to force SVN to auto-detect this, but you will be able to program it to do so. :)

Hope this helped.