There are quite a lot of posts on removing namespaces in Python, but nearly all use the lxml package, which seems very nice, but I've had trouble getting it to work on Windows.
What I want to achieve with my tags is similar to: removing namespace aliases from xml, but that answer is oriented toward JSON and doesn't seem to be Python-based.
Similarly, I'm unclear on how to implement this: https://stackoverflow.com/a/61786754/9249533
This older post is somewhat helpful: https://stackoverflow.com/a/18160058/9249533
But I'm wondering what the options are as of July 2021.
My aim is to just access the data. I do not care to move it back to Excel.
Presently, if I run:
from xml.etree import ElementTree as ET
tree = ET.parse(in_path + 'myfile.xml')
root = tree.getroot()
for child in root.iter():
    print(child.tag)
it returns this for the 'Data' tag:
{urn:schemas-microsoft-com:office:spreadsheet}Data
I'd like it to just be:
Data
The prefix/namespace is hampering my very modest XML interpretation skills. Any guidance for doing this with packages that are more readily Windows-compatible (or better still, available as conda installs) would be much appreciated.
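For illustration, here is a minimal sketch of one way this might be done with only the standard library's ElementTree, so nothing extra has to be installed. It rewrites each tag (and attribute name) in place, dropping the leading `{...}` part. The `SAMPLE` document and the `strip_namespaces` helper are hypothetical stand-ins, not part of my real file, and the approach assumes that no two elements differ only by namespace.

```python
from xml.etree import ElementTree as ET

# Hypothetical sample standing in for myfile.xml (assumed SpreadsheetML-like layout).
SAMPLE = """<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
 xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
  <Worksheet ss:Name="Sheet1">
    <Table><Row><Cell><Data ss:Type="String">hello</Data></Cell></Row></Table>
  </Worksheet>
</Workbook>"""


def strip_namespaces(root):
    """Remove the '{uri}' prefix from every tag and attribute name, in place."""
    for elem in root.iter():
        # Tags look like '{uri}Local'; keep only the part after the closing brace.
        if isinstance(elem.tag, str) and elem.tag.startswith('{'):
            elem.tag = elem.tag.split('}', 1)[1]
        # Namespaced attributes use the same '{uri}name' form.
        for key in list(elem.attrib):
            if key.startswith('{'):
                elem.attrib[key.split('}', 1)[1]] = elem.attrib.pop(key)
    return root


root = strip_namespaces(ET.fromstring(SAMPLE))
tags = [child.tag for child in root.iter()]
print(tags)
```

With a real file, `ET.parse(...).getroot()` would take the place of `ET.fromstring(SAMPLE)`. If two namespaces happen to define the same local name, this flattening would merge them, so it is only safe when the namespaces do not need to stay distinct.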
xmlns=
namespaces, and a handful of other namespaces. I really don't know if they are necessary to be distinctive. So removing all of them may be considered a risk. But it could easily be achieved with XSLT. Even with version 1.0. – zx485
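If removing the namespaces is considered too risky, a possible alternative is to leave them in place and search around them. The sketch below assumes Python 3.8 or later, where ElementTree's `find`/`findall` accept the `{*}` wildcard, and also shows the older approach of passing a prefix-to-URI mapping. The `SAMPLE` document and the `ss` prefix choice are hypothetical.

```python
from xml.etree import ElementTree as ET

# Hypothetical sample; the real file may declare several more namespaces.
SAMPLE = """<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet">
  <Worksheet><Table>
    <Row><Cell><Data>first</Data></Cell></Row>
    <Row><Cell><Data>second</Data></Cell></Row>
  </Table></Worksheet>
</Workbook>"""

root = ET.fromstring(SAMPLE)

# Option 1 (Python 3.8+): '{*}' matches the local name in any namespace.
wildcard_values = [d.text for d in root.findall('.//{*}Data')]

# Option 2: map a prefix of your choosing to the namespace URI.
ns = {'ss': 'urn:schemas-microsoft-com:office:spreadsheet'}
mapped_values = [d.text for d in root.findall('.//ss:Data', ns)]

print(wildcard_values, mapped_values)
```

Both searches leave the tree untouched, so the namespaces remain available if they later turn out to matter.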