1
votes

I was given an XML file containing French language characters and received the following error: "An invalid character was found in text content XML." After searching around, it seems this is a common error caused by the fact that XML is designed for UTF-8 encoding. I am not familiar with how to change the encoding being used, and although I have seen samples on here that contain a line stating the encoding, the only non-data line in my file is:

<tag xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">

Is there any way to actually have these characters be interpreted correctly? I am attempting to import an XML file into Access, and in the end it would be best if I could preserve these characters.

2
I'm puzzled by the XML fragment you included. What is it supposed to be showing us? - dcsohl
@dcsohl Sorry, I just meant to show that there is no line in the file that sets up the encoding (I recall seeing other files with a line that says something like "iso-8859-1"). I see now that this is covered below. - 114

2 Answers

2
votes

XML parsers assume UTF-8 when no encoding is declared, but it's very easy to declare a different encoding, and that is easier than trying to muck around and change the encoding of your document.

If you start your XML document with the XML declaration

<?xml version="1.0" encoding="ISO-8859-1" ?>

you will be telling the XML parser NOT to use UTF-8, but rather to use ISO Latin-1 (ISO-8859-1), which is the most likely encoding that your document is actually using. The declaration has to be the very first thing in the file, with nothing before it. Your characters will be preserved this way, assuming Access honors the encoding declaration, which it should.
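If you would rather not edit the file by hand, here is a minimal Python sketch of the same idea. The helper name `add_encoding_declaration` and the sample data are made up for illustration, and it assumes the file really is ISO-8859-1 and has no byte order mark.

```python
import xml.etree.ElementTree as ET


def add_encoding_declaration(raw: bytes, encoding: str = "ISO-8859-1") -> bytes:
    """Prepend an XML declaration unless the document already starts with one.

    Hypothetical helper: it does not verify that the bytes really are in
    the named encoding, it only labels them.
    """
    if raw.startswith(b"<?xml"):
        return raw  # already has a declaration; leave it untouched
    declaration = f'<?xml version="1.0" encoding="{encoding}"?>\n'.encode("ascii")
    return declaration + raw


# Made-up sample: "Caf\u00e9" is "Cafe" with an acute accent on the e,
# stored here as the single Latin-1 byte 0xE9.
sample = "<tag><name>Caf\u00e9</name></tag>".encode("iso-8859-1")

# Without a declaration the parser assumes UTF-8 and rejects the 0xE9 byte.
try:
    ET.fromstring(sample)
    parsed_without_declaration = True
except ET.ParseError:
    parsed_without_declaration = False

# With the declaration prepended, the same bytes parse and the accent survives.
fixed = add_encoding_declaration(sample)
name = ET.fromstring(fixed).find("name").text
```

For a file on disk you would read it in binary mode (`open(path, "rb")`), pass the bytes through the helper, and write the result back in binary mode so that Python does not re-encode anything along the way.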

0
votes

You can set the XML encoding by adding an XML declaration as the first line of the file, like this:

<?xml version="1.0" encoding="iso-8859-1"?>
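If the importing tool turns out to ignore the declaration, the other route mentioned in this thread is to change the encoding of the document itself. A rough Python sketch of that, assuming the file is ISO-8859-1 and carries no encoding declaration (as in the question); `transcode_to_utf8` is a hypothetical name and the sample data is made up:

```python
import xml.etree.ElementTree as ET


def transcode_to_utf8(raw: bytes, source_encoding: str = "iso-8859-1") -> bytes:
    """Decode bytes from the assumed source encoding and re-encode as UTF-8.

    Assumption: the document has no encoding declaration. If it had one
    naming the old encoding, that declaration would need updating as well.
    """
    return raw.decode(source_encoding).encode("utf-8")


# Made-up sample: "Fran\u00e7ois" contains a c with cedilla (Latin-1 byte 0xE7).
latin1_bytes = "<tag><name>Fran\u00e7ois</name></tag>".encode("iso-8859-1")
utf8_bytes = transcode_to_utf8(latin1_bytes)

# The UTF-8 version parses with no declaration, since UTF-8 is the default.
name = ET.fromstring(utf8_bytes).find("name").text
```

Note that decoding as ISO-8859-1 never fails, because every byte value maps to some character, so a wrong guess about the source encoding shows up as wrong characters rather than as an error.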