0
votes

I have large, complex XML files as input for a Mule flow.

File endpoint -> Byte Array to String -> Splitter -> ....

I get org.xml.sax.SAXParseException: Content is not allowed in prolog when the Splitter component processes the input files. If I create a new XML file and copy the content of the original file into it, the file is processed without error. Creating the new file removes the BOM (byte order mark): the original file starts with the bytes EF BB BF, the local copy does not.
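To confirm that a given input file really starts with the UTF-8 BOM, the first three bytes can be checked directly. This is only a minimal sketch: the class name BomCheck and the method hasUtf8Bom are made up for illustration and are not part of Mule.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not a Mule class): detects the UTF-8 BOM bytes EF BB BF.
class BomCheck {

    // Returns true when the data starts with the three-byte UTF-8 BOM.
    static boolean hasUtf8Bom(byte[] data) {
        return data.length >= 3
                && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB
                && (data[2] & 0xFF) == 0xBF;
    }

    public static void main(String[] args) {
        byte[] xml = "<Invoices/>".getBytes(StandardCharsets.UTF_8);

        // Build a copy of the same content with a BOM prepended, like the original file.
        byte[] withBom = new byte[xml.length + 3];
        withBom[0] = (byte) 0xEF;
        withBom[1] = (byte) 0xBB;
        withBom[2] = (byte) 0xBF;
        System.arraycopy(xml, 0, withBom, 3, xml.length);

        System.out.println(hasUtf8Bom(withBom)); // true
        System.out.println(hasUtf8Bom(xml));     // false
    }
}
```

In practice the byte array would come from the file itself, for example via java.nio.file.Files.readAllBytes.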

Mule config:

<?xml version="1.0" encoding="UTF-8"?>
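<mule xmlns="http://www.mulesoft.org/schema/mule/core"
xmlns:file="http://www.mulesoft.org/schema/mule/file"
xmlns:tracking="http://www.mulesoft.org/schema/mule/ee/tracking"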
xmlns:mulexml="http://www.mulesoft.org/schema/mule/xml"
xmlns:doc="http://www.mulesoft.org/schema/mule/documentation"
xmlns:spring="http://www.springframework.org/schema/beans" version="EE-3.4.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.mulesoft.org/schema/mule/file    
http://www.mulesoft.org/schema/mule/file/current/mule-file.xsd
http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-current.xsd
http://www.mulesoft.org/schema/mule/core http://www.mulesoft.org/schema/mule/core/current/mule.xsd
http://www.mulesoft.org/schema/mule/xml http://www.mulesoft.org/schema/mule/xml/current/mule-xml.xsd
http://www.mulesoft.org/schema/mule/ee/tracking    
http://www.mulesoft.org/schema/mule/ee/tracking/current/mule-tracking-ee.xsd">

<mulexml:dom-to-xml-transformer name="domToXml"/>

<flow name="SplitterFlow1" doc:name="SplitterFlow1">
<file:inbound-endpoint path="D:\WORK\Input"
moveToDirectory="D:\WORK\Output"
responseTimeout="10000" doc:name="File" fileAge="200" encoding="UTF-8"/>
<byte-array-to-string-transformer doc:name="Byte Array to String" />
<splitter evaluator="xpath" expression="/Invoices/invoice"
doc:name="Splitter"/>
<transformer ref="domToXml" doc:name="Transformer Reference"/>
    <tracking:custom-event event-name="Invoice ID" doc:name="Custom Business event">
    </tracking:custom-event>
<logger level="INFO" doc:name="Logger"/>
<file:outbound-endpoint path="D:\WORK\Output"
outputPattern="#[function:dateStamp:dd-MM-yyyy-HH.mm.ss]-#[header:OUTBOUND:MULE_CORRELATION_SEQUENCE]"
responseTimeout="10000" doc:name="File"></file:outbound-endpoint>
</flow>
</mule>

Please advise how I can remove the BOM within the Mule flow itself. Thank you in advance.

3
Add your config for better understanding. - user1760178
Mule config was added. - user3042795

3 Answers

0
votes

You can add a Java transformer before the splitter, using this class:

package importxmltoapis;
import org.mule.api.MuleMessage;
import org.mule.api.transformer.TransformerException;
import org.mule.transformer.AbstractMessageTransformer;

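public class DeleteBOM extends AbstractMessageTransformer {

    // The UTF-8 bytes EF BB BF decode to this single character.
    public static final String BOM = "\uFEFF";

    @Override
    public Object transformMessage(MuleMessage message, String outputEncoding)
            throws TransformerException {
        try {
            return removeBOM(message.getPayloadAsString());
        } catch (Exception e) {
            // Fail the transformation instead of silently returning an empty payload.
            throw new TransformerException(this, e);
        }
    }

    // Drops the leading BOM character if present; otherwise returns the input unchanged.
    private static String removeBOM(String s) {
        if (s.startsWith(BOM)) {
            s = s.substring(1);
        }
        return s;
    }
}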
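The check against "\uFEFF" works because decoding the bytes EF BB BF as UTF-8 in Java yields that single character at the start of the string. A small standalone sketch of this, outside Mule; the class name BomStringDemo is hypothetical and removeBOM simply mirrors the method in the transformer:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical standalone demo (no Mule dependency).
class BomStringDemo {

    static final String BOM = "\uFEFF";

    // Same logic as removeBOM in the transformer.
    static String removeBOM(String s) {
        return s.startsWith(BOM) ? s.substring(1) : s;
    }

    public static void main(String[] args) {
        // EF BB BF followed by "<a/>", as it would appear in a file saved with a BOM.
        byte[] raw = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, '<', 'a', '/', '>'};

        // Java's UTF-8 decoder keeps the BOM as the character U+FEFF.
        String decoded = new String(raw, StandardCharsets.UTF_8);

        System.out.println(decoded.startsWith(BOM)); // true
        System.out.println(removeBOM(decoded));      // <a/>
    }
}
```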
0
votes

It's a pretty old post but here is my contribution.

In addition to the Java transformer approach suggested by @alexander-shapkin, I strongly recommend Apache Commons IO's org.apache.commons.io.input.BOMInputStream, which handles the BOM out of the box. The code would look something like this:

import java.io.InputStream;

import org.apache.commons.io.ByteOrderMark;
import org.apache.commons.io.IOUtils;
import org.apache.commons.io.input.BOMInputStream;
import org.mule.api.MuleMessage;
import org.mule.api.transformer.TransformerException;
import org.mule.transformer.AbstractMessageTransformer;

public class DeleteBOM extends AbstractMessageTransformer {

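@Override
public Object transformMessage(MuleMessage message, String outputEncoding)
        throws TransformerException {

    // Use an explicit charset in both directions so BOM detection does not depend on the platform default.
    try (InputStream in = new BOMInputStream(IOUtils.toInputStream(message.getPayloadAsString(), "UTF-8"), ByteOrderMark.UTF_8)) {
        return IOUtils.toString(in, "UTF-8");
    } catch (Exception e) {
        // Keep the original cause so the failure can be diagnosed.
        throw new RuntimeException("Could not remove BOM marker", e);
    }
}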

}

I partially reproduced your Mule app with the following configuration:

    <file:connector name="File" autoDelete="false" streaming="true" validateConnections="true" doc:name="File" />
    <mulexml:dom-to-xml-transformer name="DOM_to_XML" doc:name="DOM to XML"/>
    <flow name="lalaFlow">
        <file:inbound-endpoint path="D:\WORK\Input" moveToDirectory="D:\WORK\Output" responseTimeout="10000" doc:name="File" fileAge="200" encoding="UTF-8"/>
        <component class="org.mule.bom.DeleteBOM" doc:name="Java"/>
        <transformer ref="DOM_to_XML" doc:name="Transformer Reference"/>
        ...
    </flow>

For further reference, go to https://commons.apache.org/proper/commons-io/javadocs/api-2.2/org/apache/commons/io/input/BOMInputStream.html

-1
votes

Try the following:

1. Use the File to String transformer instead of the Byte Array to String transformer.

2. Check whether your large XML file is read completely. If it is not, set the fileAge property on the file endpoint so that the file is only picked up once it has been fully written.