4
votes

I am new to XML parsing in Java with a SAX parser. I have a very large XML file, and because of its size I was advised to use SAX. I have finished the parsing part of my task and it works as expected. One XML task remains: deleting or updating some nodes at the user's request.

I am able to find all tags by name, change their data attributes, etc. If I can do these with SAX, deleting should be possible too.

The sample XML describes some functionality under certain cases. The user's input is the name of a case (case1, case2).

<ruleset>
    <rule id="1">
        <condition>
            <case1>somefunctionality</case1>
            <allow>true</allow>
        </condition>
    </rule>
    <rule id="2">
        <condition>
            <case2>somefunctionality</case2>
            <allow>false</allow>
        </condition>
    </rule>
</ruleset>

If the user wants to delete one of these cases (for example case1), the complete rule element must be deleted, not just the case1 tag. If case1 is deleted, the XML becomes:

<ruleset>
    <rule id="2">
        <condition>
            <case2>somefunctionality</case2>
            <allow>false</allow>
        </condition>
    </rule>
</ruleset>

My question is: can this be done using SAX? I can't use DOM or any other parser at this point. The only other option is even worse: string search. How can it be done with a SAX parser?


3 Answers

6
votes

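Try wrapping the reader in an XMLFilterImpl that drops every event belonging to the unwanted rule, then pass the filtered stream through an identity Transformer: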

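    import javax.xml.transform.*;
    import javax.xml.transform.sax.SAXSource;
    import javax.xml.transform.stream.StreamResult;
    import org.xml.sax.*;
    import org.xml.sax.helpers.XMLFilterImpl;
    import org.xml.sax.helpers.XMLReaderFactory;

    // Note: XMLReaderFactory is deprecated since Java 9 but still works.
    XMLReader xr = new XMLFilterImpl(XMLReaderFactory.createXMLReader()) {
        private boolean skip; // true while events are being suppressed
        private int depth;    // nesting depth inside the rule being dropped

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts)
                throws SAXException {
            if (depth > 0) {   // still inside the dropped rule
                depth++;
                return;
            }
            // Constant-first equals avoids a NullPointerException when "id" is absent.
            if (qName.equals("rule") && "1".equals(atts.getValue("id"))) {
                depth = 1;     // drop this rule and everything inside it
                skip = true;
                return;
            }
            skip = false;
            super.startElement(uri, localName, qName, atts);
        }

        @Override
        public void endElement(String uri, String localName, String qName) throws SAXException {
            if (depth > 0) {   // closing tag inside (or of) the dropped rule
                depth--;
                return;
            }
            skip = false;      // so </ruleset> is still written when the last rule was dropped
            super.endElement(uri, localName, qName);
        }

        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            // skip stays true until the next tag, so the whitespace that
            // followed the dropped rule is removed as well
            if (!skip) {
                super.characters(ch, start, length);
            }
        }
    };
    // An identity transform serializes the filtered SAX events.
    Source src = new SAXSource(xr, new InputSource("test.xml"));
    Result res = new StreamResult(System.out);
    TransformerFactory.newInstance().newTransformer().transform(src, res);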

Output:

<?xml version="1.0" encoding="UTF-8"?><ruleset>
    <rule id="2">
        <condition>
            <case2>somefunctionality</case2>
            <allow>false</allow>
        </condition>
    </rule>
</ruleset>
0
votes

What you need to construct is a SAX event buffer.

When you come across a <rule> element, you need to save it (or the information required to regenerate it) along with all of the other events that occur between it and the 'case' you want to delete.

If the 'rule' you have saved is the same as the one that needs to be deleted, just throw out the info and continue.

If the 'rule' you saved is not the one that needs to be deleted, you should regenerate the SAX events that were saved and then continue.
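
The buffering idea above could be sketched roughly as follows. This is only an illustration under some assumptions: the class name RuleRemover, the filter CaseFilter and the method removeRulesWithCase are made up for this example, the input is passed as a String to keep the demo self-contained, and <rule> elements are assumed not to nest. Only one rule is held in memory at a time.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.XMLFilterImpl;

class RuleRemover {

    /** Hypothetical filter: buffers each <rule> and drops it if it contains <caseName>. */
    static class CaseFilter extends XMLFilterImpl {
        private interface Event { void replay() throws SAXException; }

        private final String caseName;
        private final List<Event> buffer = new ArrayList<>();
        private int depth;    // > 0 while inside a <rule>
        private boolean drop; // set once the buffered rule is known to contain the case

        CaseFilter(XMLReader parent, String caseName) {
            super(parent);
            this.caseName = caseName;
        }

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts)
                throws SAXException {
            if (depth == 0 && !qName.equals("rule")) {   // outside any rule: pass through
                super.startElement(uri, local, qName, atts);
                return;
            }
            depth++;
            if (qName.equals(caseName)) drop = true;
            Attributes copy = new AttributesImpl(atts);  // the parser may reuse atts
            buffer.add(() -> super.startElement(uri, local, qName, copy));
        }

        @Override
        public void endElement(String uri, String local, String qName) throws SAXException {
            if (depth == 0) {
                super.endElement(uri, local, qName);
                return;
            }
            buffer.add(() -> super.endElement(uri, local, qName));
            if (--depth == 0) {                          // the rule is complete: decide
                if (!drop) {
                    for (Event e : buffer) e.replay();
                }
                buffer.clear();
                drop = false;
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            if (depth == 0) {
                super.characters(ch, start, length);
                return;
            }
            char[] copy = Arrays.copyOfRange(ch, start, start + length); // ch is reused too
            buffer.add(() -> super.characters(copy, 0, copy.length));
        }
    }

    static String removeRulesWithCase(String xml, String caseName) throws Exception {
        SAXParserFactory pf = SAXParserFactory.newInstance();
        pf.setNamespaceAware(true);
        XMLReader filter = new CaseFilter(pf.newSAXParser().getXMLReader(), caseName);
        StringWriter out = new StringWriter();
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        identity.transform(new SAXSource(filter, new InputSource(new StringReader(xml))),
                new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<ruleset>"
                + "<rule id=\"1\"><condition><case1>f</case1><allow>true</allow></condition></rule>"
                + "<rule id=\"2\"><condition><case2>f</case2><allow>false</allow></condition></rule>"
                + "</ruleset>";
        System.out.println(removeRulesWithCase(xml, "case1"));
    }
}
```

For a large file you would hand the filter an InputSource for the file and a StreamResult on an output file instead of Strings. Whitespace around a removed rule is left in place in this sketch.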

0
votes

SAX is most commonly used for reading/parsing XML, but there is an article on how to use SAX to write files, and that chapter appears to be available online:

http://xmlwriter.net/sample_chapters/Professional_XML/31100604.shtml

[The article is dated 1999 so it's using an old version of SAX, but the concepts still apply]

The basic idea is to create a custom DocumentHandler/ContentHandler. Whenever it receives a SAX event, it serializes the event and writes it to a stream, a file, or some other destination. Your input document then acts as a source of SAX events, which you forward to the XMLOutputter.

The hard part is getting to the point where you can parse your XML document into a stream of SAX events, drive the XMLOutputter, and generate an exact copy of the input file. Once you get that working, you can move on to the editing logic, where you read your rules and use them to modify the output file.
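
As a rough sketch of that first "exact copy" step: instead of hand-writing the serializing handler, one option is to let the JDK's TransformerHandler play that role, since it is a ContentHandler that writes the events it receives to a Result. The class name SaxCopy and the method copy are invented for this example, and the input is a String only to keep the demo self-contained.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

class SaxCopy {

    /** Parses xml into SAX events and serializes those events straight back out. */
    static String copy(String xml) throws Exception {
        SAXParserFactory pf = SAXParserFactory.newInstance();
        pf.setNamespaceAware(true);
        XMLReader reader = pf.newSAXParser().getXMLReader();

        // A TransformerHandler created without a stylesheet is an identity
        // transform: every SAX event it receives is written to its Result.
        SAXTransformerFactory tf = (SAXTransformerFactory) TransformerFactory.newInstance();
        TransformerHandler serializer = tf.newTransformerHandler();
        serializer.getTransformer().setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        serializer.setResult(new StreamResult(out));

        // The parser is the source of events, the serializer the sink.
        reader.setContentHandler(serializer);
        reader.parse(new InputSource(new StringReader(xml)));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(copy("<a><b>text</b></a>"));
    }
}
```

Once the copy works, the editing logic can sit between the two ends, for example as an XMLFilterImpl placed between the reader and the serializer.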

It's a lot more work than DOM, JDOM, XSLT etc, but it may help in your situation because you never have to store the entire document in memory.