I have a 12 MB XML file which I am accessing from within an XQuery. The file is loaded something like this:

let $t := doc('file:///C:/foo/bar/file12mb.xml')

The code takes about 950 ms to execute.

How can the XML document be loaded faster? Once the XML file is loaded and parsed, the body of the XQuery takes only a few milliseconds to run, so I'm trying to speed up the initial loading and parsing, which accounts for most of the execution time.

Is there any way for Saxon to persist an XML document after it has been parsed? Ideally I would like to persist the XML data file somehow, but Saxon seems to be designed purely as an XML processor, not an XML database.

Would a schema help? The XML file does not currently have a schema associated with it. The Saxon documentation implies that having a schema speeds up query execution but slows down the initial loading and parsing of the XML data, so I haven't tried creating one.

Any suggestions gratefully received.

Versions

java version "1.6.0_26"
Saxon-B version 9.1.0.8
Probably you should consider using an XML database such as Sedna or eXist. As you mentioned, Saxon is a processor, not a data store. - Shcheklein

1 Answer

That sounds pretty fast for parsing a 12MB file. I don't think you can optimize that, and no, Saxon is not a database.

In MarkLogic the parsing of the XML only ever happens once: during ingest. In other databases, such as Oracle, that may or may not be the case, depending on how you load it.
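
If moving to a database is not an option, the same parse-once idea can be approximated inside a long-running JVM by keeping the parsed tree in memory and running every later query against it. The sketch below is only an illustration of that idea: it uses the JDK's built-in DOM parser and XPath engine rather than Saxon, and the class name ParsedDocumentCache and its methods are hypothetical. It only helps if the process stays alive between queries; a fresh JVM per query would still pay the parsing cost each time.

```java
import java.io.File;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

// Hypothetical helper: parses each file at most once per JVM and reuses the tree.
public class ParsedDocumentCache {
    // Parsed trees keyed by file path; guarded by the class lock.
    private static final Map<String, Document> CACHE = new HashMap<String, Document>();
    // Counts real parses, purely to make the caching effect observable.
    private static int parseCount = 0;

    // Returns the cached tree, parsing the file only on the first request.
    public static synchronized Document load(String path) {
        Document doc = CACHE.get(path);
        if (doc == null) {
            try {
                doc = DocumentBuilderFactory.newInstance()
                        .newDocumentBuilder()
                        .parse(new File(path));
            } catch (Exception e) {
                throw new IllegalStateException("Could not parse " + path, e);
            }
            parseCount++;
            CACHE.put(path, doc);
        }
        return doc;
    }

    public static synchronized int parseCount() {
        return parseCount;
    }

    // Evaluates an XPath expression against the cached tree and returns its string value.
    // Assumption: callers do not query the same DOM tree from several threads at once.
    public static String evaluate(String path, String xpath) {
        try {
            return XPathFactory.newInstance().newXPath().evaluate(xpath, load(path));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("Bad XPath: " + xpath, e);
        }
    }
}
```

Calling ParsedDocumentCache.evaluate(...) repeatedly with the same path triggers one parse; the later calls pay only for the query itself. The cache never evicts or refreshes entries, so a file that changes on disk would need an explicit invalidation step that this sketch leaves out.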