Can I make SAXParser.parse() return before it has parsed an entire InputStream?

Question

I'm trying to parse a never-ending XML stream that looks like this:

<outer>
  <foo>footext</foo>
  <bar>bartext</bar>
</outer>
<outer>
  <foo>footext</foo>
  <bar>bartext</bar>
</outer>
<outer>
  <foo>footext</foo>
  <bar>bartext</bar>
</outer>
...

I've got my SAXParser set up:

SAXParserFactory factory = SAXParserFactory.newInstance();
factory.setValidating(true);
SAXParser parser = factory.newSAXParser();

And I can call it with an InputStream and my implementation of DefaultHandler easily enough:

parser.parse(theInputStream, myHandler);

The problem is that I need parser.parse() to actually return after it hits a </outer> end tag, so that I can return the object I parsed out of the XML. The reason is that my parsing code (in a class called XMLParser) is called in a loop like this:

while(condition) {
    Object o = xmlParser.getNextObject();
    ... do something with the object ...
}

Is it possible to make a SAXParser return from a parse(InputStream, DefaultHandler) call before it has read the entirety of the available stream?

Paul Grime · Accepted Answer · 2011-09-10T20:00:16

This IBM article uses a technique of throwing an exception from the handler to stop the parse. Normally you shouldn't use exceptions to control program flow, but this might be the simplest way for you.

I wouldn't recommend throwing a plain SAXException though. Create a subclass of SAXException, something like DataOfInterestFoundException, and throw that. The code that calls parse() can then catch this specific type and treat it as a normal stop, while treating every other exception as a genuine error.
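
A minimal sketch of that idea, assuming the outer/foo/bar element names from the question. OuterHandler, parseOne and the class name StopParsingExample are hypothetical names made up for illustration, and the result is returned as a simple "foo,bar" string rather than a real object:

```java
import java.io.IOException;
import java.io.InputStream;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

final class StopParsingExample {

    /** Marker exception: signals a deliberate stop, not a parse error. */
    static class DataOfInterestFoundException extends SAXException {
        DataOfInterestFoundException(String message) {
            super(message);
        }
    }

    /** Hypothetical handler that collects foo/bar text and stops at </outer>. */
    static class OuterHandler extends DefaultHandler {
        private final StringBuilder text = new StringBuilder();
        String foo;
        String bar;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            text.setLength(0); // start collecting text for the new element
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) throws SAXException {
            if ("foo".equals(qName)) {
                foo = text.toString();
            } else if ("bar".equals(qName)) {
                bar = text.toString();
            } else if ("outer".equals(qName)) {
                // One complete object has been read: abort the parse on purpose.
                throw new DataOfInterestFoundException("finished one <outer> element");
            }
        }
    }

    /** Parses up to the first </outer> and returns "foo,bar". */
    static String parseOne(InputStream in) {
        OuterHandler handler = new OuterHandler();
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(in, handler);
        } catch (DataOfInterestFoundException expected) {
            // Normal termination: the handler found what it was looking for.
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // Anything else is a genuine failure.
            throw new IllegalStateException("XML parsing failed", e);
        }
        return handler.foo + "," + handler.bar;
    }
}
```

One thing to watch out for: parsers typically read the input in blocks, so by the time the exception is thrown the parser may already have consumed bytes past </outer> from the underlying stream. If that happens, calling parse() again on the same stream could miss the start of the next element.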

Another solution might be to wrap the input stream with one of your own that somehow has the ability to persuade the SAX parser to stop. Not sure you can do this though.
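
For what it's worth, here is a rough sketch of what such a wrapper could look like. It is untested against a real never-ending stream and makes several assumptions: the end tag appears literally as </outer> (no extra whitespace, not inside a comment or CDATA section), the encoding is ASCII-compatible, and the simple matcher only works for tags like this one whose first byte does not repeat inside the tag. EndTagBoundedInputStream, nextDocument and readOneDocument are hypothetical names, not part of any library:

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Reports end-of-stream right after the given end tag has been read. */
final class EndTagBoundedInputStream extends FilterInputStream {
    private final byte[] endTag;
    private int matched;   // how many bytes of endTag have matched so far
    private boolean done;  // true once the full end tag has been passed through

    EndTagBoundedInputStream(InputStream in, String endTag) {
        super(in);
        this.endTag = endTag.getBytes(StandardCharsets.US_ASCII);
    }

    /** Re-arms the stream so the next parse can read the following document. */
    void nextDocument() {
        matched = 0;
        done = false;
    }

    @Override
    public int read() throws IOException {
        if (done) {
            return -1; // pretend the stream has ended
        }
        int b = in.read();
        if (b < 0) {
            return -1; // the underlying stream really ended
        }
        if (b == endTag[matched]) {
            matched++;
        } else {
            matched = (b == endTag[0]) ? 1 : 0; // naive restart of the match
        }
        if (matched == endTag.length) {
            done = true;
            matched = 0;
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        int n = 0;
        // Go byte by byte so the underlying stream is never read past the end tag.
        while (n < len && !done) {
            int b = read();
            if (b < 0) {
                break;
            }
            buf[off + n++] = (byte) b;
        }
        return n == 0 ? -1 : n;
    }

    @Override
    public void close() {
        // Deliberately left open, on the assumption that a parser may close
        // its input when a parse ends while more documents are still to come.
    }

    /** Demo helper: drains one document as text, then re-arms the stream. */
    String readOneDocument() {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            int b;
            while ((b = read()) >= 0) {
                out.write(b);
            }
            nextDocument();
            return new String(out.toByteArray(), StandardCharsets.US_ASCII);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The idea would be to pass one instance of this wrapper to parser.parse() on each loop iteration and call nextDocument() in between, so that every parse sees a single well-formed <outer> document. Reading one byte at a time is slow, so wrapping the underlying stream in a BufferedInputStream first is probably worthwhile.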
