I am writing an application in Java for Android (SDK v8) that parses XML and puts the entries into a ListView. This part works fine. I am parsing the XML with a DocumentBuilder, which truncates each string at the first entity: the text before the entity is kept, but the entity itself and everything after it are dropped. The entities I am using are the standard ones (&quot;, &amp;, &apos;, &lt;, &gt;). I have also tried using numeric entities in my source XML (e.g. &# 38; without the space, written that way just so you can see what I'm outputting), and this crashes my app, with logcat reporting "unterminated entity ref".

To check that my XML is valid, I have tried viewing it in Google Chrome, which displays it perfectly. The entry "blah &amp; blah.txt" is truncated to "blah ". The XML I am parsing is below:
EDIT: Much shorter XML sample
<?xml version="1.1"?>
<root>
    <object>
        <id>ROOT</id>
        <type>directory</type>
        <name>../</name>
    </object>
    <object>
        <id>09F010C143B84573A36C50F3EF7E0708</id>
        <type>file</type>
        <name>blah &amp; blah.txt</name>
    </object>
    <object>
        <id>85CF028B838D4E0096C081B987C97045</id>
        <type>file</type>
        <name>Epilist.m3u</name>
    </object>
</root>
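The "unterminated entity ref" message suggests the parser met an ampersand that did not start a complete reference. The following is a minimal sketch for checking that outside the app, on a desktop JVM with the standard JAXP parser (which is not the same implementation Android ships, so this is only an approximation). The class name WellFormedCheck and the method isWellFormed are made up for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of the original post.
final class WellFormedCheck {

    /** Returns true if the parser accepts the XML, false if it reports a parse error. */
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Malformed markup, e.g. a bare '&' or a reference without its ';'.
            return false;
        } catch (Exception e) {
            // Parser configuration or I/O problems are not well-formedness errors.
            throw new IllegalStateException("parser setup or I/O failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<name>blah &amp; blah.txt</name>"));
        System.out.println(isWellFormed("<name>blah &#38; blah.txt</name>"));
        System.out.println(isWellFormed("<name>blah & blah.txt</name>"));
        System.out.println(isWellFormed("<name>blah &#38 blah.txt</name>"));
    }
}
```

If the string the server actually sends fails a check like this, the problem is in the XML that reaches the parser rather than in the parsing code.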
EDIT: XML parsing class
EDIT2: Below is a complete class that (with the help of others) should now be bug-free. Anyone is welcome to use it: I am releasing it as public domain code, and you do not need to credit me. It is designed for Android, but as far as I know it can be used on any Java platform by replacing the references to 'Log.e'.
package tk.dtechsoftware.mpclient;

import java.io.IOException;
import java.io.StringReader;
import java.io.UnsupportedEncodingException;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.client.ClientProtocolException;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.util.EntityUtils;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

import android.util.Log;

public class XMLParser {

    /** Downloads the document at the given URL; returns null if the request fails. */
    public String getXmlFromUrl(String url) {
        String xml = null;
        try {
            DefaultHttpClient httpClient = new DefaultHttpClient();
            HttpGet httpGet = new HttpGet(url);
            HttpResponse httpResponse = httpClient.execute(httpGet);
            HttpEntity httpEntity = httpResponse.getEntity();
            xml = EntityUtils.toString(httpEntity);
        } catch (UnsupportedEncodingException e) {
            e.printStackTrace();
        } catch (ClientProtocolException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
        return xml;
    }

    /** Parses the XML string into a DOM document; returns null if parsing fails. */
    public Document getDomElement(String xml) {
        Document doc = null;
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setExpandEntityReferences(false);
        try {
            DocumentBuilder db = dbf.newDocumentBuilder();
            InputSource is = new InputSource();
            is.setCharacterStream(new StringReader(xml));
            doc = db.parse(is);
        } catch (ParserConfigurationException e) {
            // e.getMessage() can be null, so log the throwable itself.
            Log.e("Error: ", "Could not create the XML parser", e);
            return null;
        } catch (SAXException e) {
            Log.e("Error: ", "Could not parse the XML", e);
            return null;
        } catch (IOException e) {
            Log.e("Error: ", "Could not read the XML", e);
            return null;
        }
        return doc;
    }

    /** Returns the full text of the first child element named str, or "" if there is none. */
    public String getValue(Element item, String str) {
        NodeList n = item.getElementsByTagName(str);
        if (n.getLength() == 0) {
            return "";
        }
        // getTextContent() joins all text below the element into one string.
        return n.item(0).getTextContent();
    }
}
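For anyone who cannot use getTextContent(), here is a minimal sketch of the same idea done by hand. It assumes the truncation happens because the parser splits the element's text into several sibling text nodes around the entity, so that reading only the first child returns just "blah ". The class EntityTextDemo and its methods collectText and readName are hypothetical names for this illustration, and the code uses only the standard JAXP and DOM APIs so it can be tried on a desktop JVM.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of the original post.
final class EntityTextDemo {

    /** Concatenates the values of all direct text and CDATA children of an element. */
    static String collectText(Element element) {
        StringBuilder sb = new StringBuilder();
        for (Node child = element.getFirstChild(); child != null; child = child.getNextSibling()) {
            short type = child.getNodeType();
            if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
                sb.append(child.getNodeValue());
            }
        }
        return sb.toString();
    }

    /** Parses the XML and returns the collected text of its first <name> element. */
    static String readName(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Element name = (Element) doc.getElementsByTagName("name").item(0);
            return collectText(name);
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<root><object><name>blah &amp; blah.txt</name></object></root>";
        System.out.println(readName(xml));
    }
}
```

On a desktop JVM this should print the whole name, "blah & blah.txt". Whether the Android parser really produces several text nodes here is the assumption to verify on the device.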
– StereoRocker