
I am trying to reference several entities such as ampersands, hyphens, etc. To do that, I have a URL where an .ent document is hosted. Unfortunately, I have no DTD document, and I assume that this is the cause of the problem (although all the information I need is provided in the .ent document).
The document at that URL looks like this:
<!-- 
Copyright and Table of Contents (several blocks)
-->
<!-- ==================================================================== -->
<!-- 
Comment: first block
-->
<!ENTITY xxxx            "xxxx" ><!--comment-->
(...)
<!-- 
Comment: second block
-->
<!ENTITY xxxx            "xxxx" ><!--comment-->
(...)

So, the entities are listed in that document. My XML refers to it like this:

<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE NAME_OF_THE_CREATOR_AND_VERSION [
<!ENTITY % NAME_OF_THE_CREATOR_AND_VERSION.ent SYSTEM "URL"> %NAME_OF_THE_CREATOR_AND_VERSION.ent; ]>
(...)

I have added the underscores to improve readability.

So far, I have

  • added standalone="no"
  • reduced the reference to <!DOCTYPE NAME SYSTEM "URL">
  • added the entities in the XML (although I would strongly prefer to find a way to keep an external reference since the .ent document comprises a lot of entities).
    The latter looks like this:
<?xml version="1.0" encoding="US-ASCII" standalone="no"?>
<!DOCTYPE entities [
   <!ENTITY hyphen "&#x02010;">
   <!ENTITY copy "&#x000A9;">
   <!ENTITY nbsp "&#x000A0;">
   <!ENTITY ndash "&#x02013;">
   <!ENTITY auml "&#x000E4;">
   <!ENTITY uuml "&#x000FC;">
   <!ENTITY deg "&#x000B0;">
   <!ENTITY Delta "&#x00394;">
   <!ENTITY minus "&#x02212;">
   <!ENTITY ensp "&#x02002;">
]>

I have checked the XML syntax in BaseX, Notepad++ and the XML Validator by w3schools, and every time (except with the internal declarations) it says that the entity (e.g. hyphen) was referenced but not declared, or that there is an error in the markup declaration. Meanwhile, when I checked the .ent file on its own, it said that there is content added at the end of the document, starting with the entity declarations, or that an invalid element name had been used. But this is probably because it is an .ent file?

Thanks in advance for any help and tips.
Best,
Eleonore

EDIT: I am using the BaseX GUI on Windows 10. I have enabled "Parse DTDs and Entities" as well as the internal XML parser and whitespace chopping.


1 Answer


I think you want

<!DOCTYPE root-element-name [
<!ENTITY % myentities SYSTEM
   "myentities.ent">
%myentities;
]>
<root-element-name>&foo; &bar;</root-element-name>

As for BaseX, I think its default setting disables DTD and external entity parsing (see https://docs.basex.org/wiki/Options#DTD), so you need to set the DTD option to true to be able to use the external entity, e.g. basex.bat -c "SET DTD true" -i input.xml query.xq.
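
If you want to check the DOCTYPE pattern above independently of BaseX, here is a minimal sketch using Python's standard xml.parsers.expat module. The function name parse_text, the dict-based lookup and the sample file contents are my own assumptions for illustration; a real resolver would read the file or URL named by the system identifier instead. The sketch also shows that an .ent file holding only entity declarations and comments is fine when it is pulled in as a parameter entity, even though it is not a well-formed XML document on its own.

```python
import xml.parsers.expat as expat


def parse_text(document: bytes, entity_sources: dict) -> str:
    """Return the character data of `document`, expanding entities that are
    declared in external parameter entities found in `entity_sources`."""
    chunks = []
    parser = expat.ParserCreate()
    # Ask expat to read external parameter entities such as %myentities;
    parser.SetParamEntityParsing(expat.XML_PARAM_ENTITY_PARSING_ALWAYS)

    def resolve_external(context, base, system_id, public_id):
        # Hypothetical resolver: look the system identifier up in a dict
        # instead of fetching a file or URL.
        child = parser.ExternalEntityParserCreate(context)
        child.Parse(entity_sources[system_id], True)
        return 1  # a non-zero value tells expat to carry on parsing

    parser.ExternalEntityRefHandler = resolve_external
    parser.CharacterDataHandler = chunks.append
    parser.Parse(document, True)
    return "".join(chunks)


# Assumed stand-in for the .ent file: only declarations and comments.
ENT_FILE = b"""<!-- first block -->
<!ENTITY hyphen "&#x02010;"><!--comment-->
<!ENTITY copy   "&#x000A9;"><!--comment-->
"""

DOCUMENT = b"""<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE root [
<!ENTITY % myentities SYSTEM "myentities.ent">
%myentities;
]>
<root>a&hyphen;b &copy;</root>"""

text = parse_text(DOCUMENT, {"myentities.ent": ENT_FILE})
print(text)
```

If the call to SetParamEntityParsing is left out, expat skips the external parameter entity and the entity references in the content are no longer expanded. That looks like the same underlying situation as a parser with DTD parsing switched off reporting an entity as referenced but not declared.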