4
votes

I am trying to send XML from a C# .NET application to a Java-based web service given to me by a third party, and I get the error org.xml.sax.SAXParseException: Content is not allowed in prolog.

I have verified the XML against the schema. I also wrote the MemoryStream I am using to hold the XML to an .xml file, then opened the file with a hex editor to make sure that there were no undesired characters in the prolog, and there are none. When opened, the first characters in the file are

<?xml version="1.0" encoding="utf-8"?>

The class I was given to send the XML data to the web service accepts a byte array. I figure that writing the XML with an XmlTextWriter to a UTF-8 encoded MemoryStream, then copying the contents of the stream to a byte array, is the most direct method.

I have done a lot of research and tried all the possibilities around this issue that I could find, but nothing works. Could someone please help? Thanks in advance.

By the way, here is a portion of what the web service returns to me. In the payload of the SOAP message, should the data after the <submissionData> element look like that, or be readable XML like the content before it?

Messages:
Message:

Payload: <?xml version="1.0" encoding="utf-8"?><soap:Envelope xmlns:soap="http:/
/schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema
-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"><soap:Body><postSubmissi
on xmlns="http://service.arm.hud.gov/"><submissionHeader><agcHcsId>1</agcHcsId><
agcName>test</agcName><systemName>123</systemName><cmsSubId>123456</cmsSubId><su
bFlag>0</subFlag></submissionHeader><agcType>test</agcType><submissionData>PD94b
WwgdmVyc2lvbj0iMS4wIiBlbmNvZGluZz0idXRmLTgiPz48dG5zOlN1Ym1pc3Npb25EYXRhIHhzaTpzY
2hlbWFMb2NhdGlvbj0iaHR0cDovL2dvdi5odWQuYXJtL2FybV9kYXRhYmFnXzNfMS54c2QiIHhtbG5zO
nhzaT0iaHR0cDovL3d3dy53My5vcmcvMjAwMS9YTUxT......etc............................
</submissionData></postSubmission></soap:Body></soap:Envelope>

Here is the XML data, reformatted for better readability:

<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope
    xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <soap:Body>
        <postSubmission xmlns="http://service.arm.hud.gov/">
            <submissionHeader>
                <agcHcsId>1</agcHcsId>
                <agcName>test</agcName>
                <systemName>123</systemName>
                <cmsSubId>123456</cmsSubId>
                <subFlag>0</subFlag>
            </submissionHeader>
            <agcType>test</agcType>
            <submissionData>
                PD94bWwgdmVyc2lvbj0iMS4wIiBlbmNvZGluZz0idXRmLTgiPz48dG5zOlN1Ym1pc3Npb25EYXRhIHhzaTpzY
                2hlbWFMb2NhdGlvbj0iaHR0cDovL2dvdi5odWQuYXJtL2FybV9kYXRhYmFnXzNfMS54c2QiIHhtbG5zO
                nhzaT0iaHR0cDovL3d3dy53My5vcmcvMjAwMS9YTUxT......etc............................
            </submissionData>
        </postSubmission>
    </soap:Body>
</soap:Envelope>
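
On the payload question: the text inside submissionData looks like Base64, which is how SOAP stacks commonly serialize a byte array, so it would not be expected to be readable XML on the wire. A minimal Java sketch for peeking at it follows; the class name SubmissionDataPeek and the constant are made up for illustration, and only the part of the data visible before the elision is used.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class SubmissionDataPeek {
    // First characters of the <submissionData> text from the dump above
    // (only a prefix; the rest is elided in the post).
    static final String VISIBLE_PREFIX =
            "PD94bWwgdmVyc2lvbj0iMS4wIiBlbmNvZGluZz0idXRmLTgiPz48";

    /** Decodes Base64 text and interprets the resulting bytes as UTF-8. */
    static String decode(String base64) {
        // The MIME decoder skips line breaks and other non-alphabet characters,
        // which helps when the text was wrapped by a console.
        byte[] raw = Base64.getMimeDecoder().decode(base64);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // prints: <?xml version="1.0" encoding="utf-8"?><
        System.out.println(decode(VISIBLE_PREFIX));
    }
}
```

The decoded prefix begins directly with <?xml, so at least in this dump the embedded document does not start with a UTF-8 byte order mark (EF BB BF would show up as 77u/ at the start of the Base64 text).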
6
I think it's the Unicode byte order mark. Can you paste exactly how you feed the stream to the SAX parser? Paste some code if possible. (ordnungswidrig)

6 Answers

4
votes

I was able to get rid of the problem by removing the UTF encoding. Sounds like in both of our cases, the text wasn't actually UTF-8 encoded.
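
One way to test the "wasn't actually UTF-8" suspicion is to decode the bytes strictly and see whether the decoder complains. This is only a sketch; the class and method names are invented for illustration and nothing here comes from the web service's own code.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class Utf8Check {
    /** Returns true if the bytes decode as UTF-8 without any malformed sequence. */
    static boolean isWellFormedUtf8(byte[] data) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    // REPORT makes the decoder throw instead of silently
                    // substituting U+FFFD for bad input.
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] utf8 = "<a>caf\u00e9</a>".getBytes(StandardCharsets.UTF_8);
        byte[] latin1 = "<a>caf\u00e9</a>".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(isWellFormedUtf8(utf8));   // true
        System.out.println(isWellFormedUtf8(latin1)); // false
    }
}
```

Note that a UTF-8 byte order mark is itself valid UTF-8, so this check says nothing about whether a BOM is present.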

3
votes

Do you have a Byte Order Mark at the top of your file that is causing this confusion? Dump or edit the file with a hex dump or hex editor and check the first two or three bytes and ensure that the file starts with <?xml
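
If a hex editor is not at hand, the same check can be done on the byte array itself. The sketch below is illustrative only (BomSniffer and its methods are hypothetical names); it looks at the first two or three bytes and reports which common BOM, if any, is present.

```java
import java.nio.charset.StandardCharsets;

class BomSniffer {
    /** Names the BOM found at the start of the data, or "none". */
    static String describeBom(byte[] data) {
        if (data.length >= 3 && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB && (data[2] & 0xFF) == 0xBF) {
            return "UTF-8";
        }
        if (data.length >= 2 && (data[0] & 0xFF) == 0xFE && (data[1] & 0xFF) == 0xFF) {
            return "UTF-16BE";
        }
        // Simplification: a UTF-32LE BOM (FF FE 00 00) is also reported as UTF-16LE.
        if (data.length >= 2 && (data[0] & 0xFF) == 0xFF && (data[1] & 0xFF) == 0xFE) {
            return "UTF-16LE";
        }
        return "none";
    }

    /** Formats up to count leading bytes as space-separated hex pairs. */
    static String hexPrefix(byte[] data, int count) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < Math.min(count, data.length); i++) {
            if (i > 0) sb.append(' ');
            sb.append(String.format("%02X", data[i] & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] clean = "<?xml".getBytes(StandardCharsets.US_ASCII);
        System.out.println(hexPrefix(clean, 3) + " -> " + describeBom(clean)); // 3C 3F 78 -> none
    }
}
```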

0
votes

I just took your message, indented it and it validates:

<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope 
    xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" 
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
    xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <soap:Body>
        <postSubmission xmlns="http://service.arm.hud.gov/">
            <submissionHeader>
                <agcHcsId>1</agcHcsId>
                <agcName>test</agcName>
                <systemName>123</systemName>
                <cmsSubId>123456</cmsSubId>
                <subFlag>0</subFlag>
            </submissionHeader>
            <agcType>test</agcType>
            <submissionData>
            PD94bWwgdmVyc2lvbj0iMS4wIiBlbmNvZGluZz0idXRmL *abbreviated data to fit*
            </submissionData>
        </postSubmission>
    </soap:Body>
</soap:Envelope>

I would guess that something just ahead of this message is out of sync somehow -- the error message appears to indicate that the XML parser thinks it's seeing content before it sees the <?xml line.
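
To illustrate one way this can happen, here is a small Java sketch using the JDK's built-in parser. It assumes the bytes were turned into text before parsing, so that a byte order mark survives as a leading U+FEFF character; whether the third-party service actually does this is unknown, and PrologDemo is a made-up name.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class PrologDemo {
    /** Returns null if the text parses, otherwise the parser's error message. */
    static String parseError(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return null;
        } catch (SAXParseException e) {
            return e.getMessage();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String doc = "<?xml version=\"1.0\"?><a/>";
        System.out.println(parseError(doc));            // null: parses fine
        System.out.println(parseError("\uFEFF" + doc)); // a prolog error message
    }
}
```

Any non-whitespace character ahead of the declaration triggers the same failure, which is why an invisible character is a common suspect.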

0
votes

Just open your XML file with a hex editor such as hexplorer; you will then be able to see and delete the strange characters. Save the file, open it with your preferred editor (personally I use Notepad++), and make sure that the file is using UTF-8 encoding :-)

Hope this will help you

0
votes

I've had no end of problems with C#/Java XML interoperability and Java's handling of the byte order mark (2 or 3 bytes preceding the XML declaration and identifying the encoding byte order). Java doesn't play nicely with a valid BOM, so you'll have to remove it. Check for it by deriving a byte array and using:

arr[0] == (byte) 0xEF && arr[1] == (byte) 0xBB && arr[2] == (byte) 0xBF

This checks for the 3 byte variant, which is what causes grief :-(
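
Building on that check, a minimal sketch of removing the marker on the Java side could look like this. BomStripper is a hypothetical helper, not part of any library, and it handles only the 3-byte UTF-8 variant.

```java
import java.util.Arrays;

class BomStripper {
    /** Returns the data without a leading UTF-8 BOM; other input is returned unchanged. */
    static byte[] stripUtf8Bom(byte[] data) {
        // The length guard avoids an ArrayIndexOutOfBoundsException on short input.
        boolean hasBom = data.length >= 3
                && data[0] == (byte) 0xEF
                && data[1] == (byte) 0xBB
                && data[2] == (byte) 0xBF;
        return hasBom ? Arrays.copyOfRange(data, 3, data.length) : data;
    }

    public static void main(String[] args) {
        byte[] withBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, '<', 'a', '/', '>'};
        System.out.println(stripUtf8Bom(withBom).length); // 4
    }
}
```

When the sender is under your control, it is usually simpler not to emit the marker at all: in .NET, passing new UTF8Encoding(false) to the writer instead of Encoding.UTF8 produces UTF-8 without a BOM.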

0
votes

Does the web service operate correctly with other (Java) clients? I've received this error quite a few times, and the source of the error was a problem with library dependencies; if I remember correctly, it was something to do with JAXB2 in combination with Java 5.