
I'm having difficulty using Saxon to transform multiple input files from hOCR to ALTO XML (I need one XML output for every input file). I've been able to transform one file at a time, but I get the error message ‘Source file collection(‘file:\c:?select=*.(hocr)) does not exist’ when I try to use collection() for multiple files. So I know it's an issue with the path I'm using, but I'm not sure what the correct use of collection() should be in this case. Any help would be appreciated. The full command I've been trying is:

java -cp saxon-he-10.1.jar net.sf.saxon.Transform -t -s:collection(‘file:///c:/?select=*.(hocr)) -xsl:hOCR-to-ALTO-master\hocr__alto2.0.xsl -o:SaxonHE10-1J

Also tried

java -cp saxon-he-10.1.jar net.sf.saxon.Transform -t -s:collection(‘file:///c:/?select=_*.hocr) -xsl:hOCR-to-ALTO-master\hocr__alto2.0.xsl -o:SaxonHE10-1J


1 Answer


The -s option on the command line expects a file name or URI, not an XPath expression.

If you want to call the collection() function, it has to be within an XPath expression, typically in the stylesheet (though it could also be in a stylesheet parameter set from the command line using ?param=collection('file:///c:/?select=*xml')).

Note also that the argument to the collection function is a URI, not a Windows file name, and URIs never contain a backslash. The select parameter is a "glob" typically in the form select=*.xml. I've no idea what you intended with select=*.(hocr).
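
To illustrate calling collection() from inside the stylesheet, here is a minimal sketch of a wrapper stylesheet, not a tested solution. The file name batch.xsl, the input directory file:///c:/hocr/, the relative path in xsl:import, and the .alto.xml output suffix are all assumptions made for the example. The content-type query parameter is something I believe Saxon accepts for directory collections so that files with an unfamiliar extension such as .hocr are parsed as XML; please check the Saxon documentation for your version before relying on it.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- batch.xsl: hypothetical wrapper around the existing hOCR-to-ALTO stylesheet -->
<xsl:stylesheet version="3.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumed location of the existing stylesheet, relative to this file -->
  <xsl:import href="hOCR-to-ALTO-master/hocr__alto2.0.xsl"/>

  <!-- Assumed input directory, written as a URI with forward slashes -->
  <xsl:param name="input-dir" select="'file:///c:/hocr/'"/>

  <!-- Entry point used when Saxon is run with -it and no -s option -->
  <xsl:template name="xsl:initial-template">
    <!-- select=*.hocr is the glob; content-type is an assumption, see above -->
    <xsl:for-each select="collection($input-dir || '?select=*.hocr;content-type=application/xml')">
      <!-- Write one output file next to each input, e.g. page1.hocr to page1.alto.xml -->
      <xsl:result-document href="{replace(base-uri(.), '\.hocr$', '.alto.xml')}">
        <!-- Hand each document to the imported stylesheet's templates -->
        <xsl:apply-templates select="."/>
      </xsl:result-document>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Under those assumptions it would be run without a -s option, using -it to start at the initial template: java -cp saxon-he-10.1.jar net.sf.saxon.Transform -t -it -xsl:batch.xsl. This only works as written if the imported stylesheet does its processing through template rules in the default mode; if it relies on a named entry template or on global variables bound to the source document, the wrapper would need adjusting.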