0
votes

How can I select the tag <Tagwith.dot> and replace its text using BeautifulSoup? If that is not possible with BeautifulSoup, what is the next best library for editing and creating XML documents? Would that be lxml?

from bs4 import BeautifulSoup as bs

stra = """
<body>
<Tagwith.dot>Text inside tag with dot</Tagwith.dot>
</body>"""
soup = bs(stra)

Desired XML:

<body>
<Tagwith.dot>Edited text</Tagwith.dot>
</body>

2 Answers

2
votes

When BS4 uses an HTML parser such as html.parser, it converts all tag names to lower case, so provide the tag name in lower case when searching. The code below works fine.

from bs4 import BeautifulSoup as bs

stra = """
<body>
<Tagwith.dot>Text inside tag with dot</Tagwith.dot>
</body>"""
soup = bs(stra, 'html.parser')

print(soup.find_all('tagwith.dot'))

Output:

[<tagwith.dot>Text inside tag with dot</tagwith.dot>]
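To actually edit the text rather than just find the tag, a minimal sketch (assuming the same html.parser setup as above) could assign to the tag's .string attribute. Note that the serialized result keeps the lower-cased tag name, so it will not reproduce <Tagwith.dot> exactly; if the original case matters, an XML-aware parser is likely the better fit.

```python
from bs4 import BeautifulSoup as bs

stra = """
<body>
<Tagwith.dot>Text inside tag with dot</Tagwith.dot>
</body>"""
soup = bs(stra, 'html.parser')

# html.parser lower-cases tag names, so search with the lower-case name
tag = soup.find('tagwith.dot')

# Assigning to .string replaces the tag's text content
tag.string = 'Edited text'

# The tag name stays lower case in the serialized document
result = str(soup)
print(result)
```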
1
votes

You can use xml.etree.ElementTree from the standard library to achieve what you want, as follows:

import xml.etree.ElementTree as ET

stra = """
<body>
<Tagwith.dot>Text inside tag with dot</Tagwith.dot>
</body>"""

# Parse the XML string into an Element
xml_obj = ET.fromstring(stra)

# Iterate over the direct children of the root element
for elem in xml_obj:
    # If the tag is found, modify its text
    if elem.tag == 'Tagwith.dot':
        elem.text = 'Edited text'

# Print the updated XML as a string
print(ET.tostring(xml_obj).decode())

The output will be

<body>
<Tagwith.dot>Edited text</Tagwith.dot>
</body>
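One caveat: the loop above only visits direct children of the root. If the tag might sit deeper in the tree, a sketch using Element.iter() could look like the following. The <section> wrapper is a hypothetical addition to the sample, there only to show the nested case.

```python
import xml.etree.ElementTree as ET

# Hypothetical variant of the sample with the tag nested one level deeper
xml_text = """
<body>
<section><Tagwith.dot>Text inside tag with dot</Tagwith.dot></section>
</body>"""

root = ET.fromstring(xml_text)

# iter() walks the whole subtree and yields only elements with this tag name
for elem in root.iter('Tagwith.dot'):
    elem.text = 'Edited text'

# encoding='unicode' makes tostring() return a str instead of bytes
result = ET.tostring(root, encoding='unicode')
print(result)
```

ElementTree keeps the tag name's original case, so <Tagwith.dot> survives the round trip unchanged.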