4
votes

I want to create a RSS Feed generator. The generator is supposed to read data from my database and generate RSS feeds for them. I want to write a script that would automatically generate the xml file for validation.

So if a typical xml file for RSS feed looks like this:

  <item>
  <title>Entry Title</title>
  <link>Link to the entry</link>
  <guid>http://example.com/item/123</guid>
  <pubDate>Sat, 9 Jan 2010 16:23:41 GMT</pubDate>
  <description>[CDATA[ This is the description. ]]</description>
  </item>

I want the script to replace automatically the field between the <item> and the </item> tag. Similarly for all the tags. The value for the tags will be fetched from the database. So I will query for it. My app has been designed in Django.

I am looking for suggestions as to how to do this in python. I am also open to any other alternatives if my idea is vague.

3
Please note this is not the valid syntax for CDATA section. - Sylvain Leroux
What would be the correct syntax? - Indradhanush Gupta
If you look carefully at my comment, you will see a link. Just click. - Sylvain Leroux
Why the votedown? It would help if an explanation was present. - Indradhanush Gupta

3 Answers

10
votes

Since this is python. Its good to be using PyRSS2Gen. Its easy to use and generates the xml really nicely. https://pypi.python.org/pypi/PyRSS2Gen

3
votes

What have you tried so far?

A very basic approach would be to use the Python DB API to query the database and perform some simple formatting using str.format (I type directly in SO -- so beware of typos):

>>> RSS_HEADER = """
<whatever><you><need>
"""
>>> RSS_FOOTER = """
</need></you></whatever>
"""

>>> RSS_ITEM_TEMPLATE = """
<item>
    <title>{0}</title>
    <link>{1}</link>
</item>
"""


>>> print RSS_HEADER

>>> for row in c.execute('SELECT title, link FROM MyThings ORDER BY TheDate'):
        print RSS_ITEM_TEMPLATE.format(*row)

>>> print RSS_FOOTER
1
votes

This rough code uses a very useful library called BeautifulSoup. It serves as a parser for HTML/XML markup languages providing a python object model. It can be used to extract/change information from XML or create XML by building up the object model.

This code below works by changing XML tags from a the template on pastebin. It does this by copying the template "item" tag, clearing its child tags and filling them with the suitable dictionary entries which will be presumably sourced from a database.

# Import System libraries
from copy import copy
from copy import deepcopy

# Import Custom libraries
from BeautifulSoup import BeautifulSoup, BeautifulStoneSoup, Tag, NavigableString, CData

def gen_xml(record):

    description_blank_str = \
    '''
    <item>
    <title>Entry Title</title>
    <link>Link to the entry</link>
    <guid>http://example.com/item/123</guid>
    <pubDate>Sat, 9 Jan 2010 16:23:41 GMT</pubDate>
    <description>[CDATA[ This is the description. ]]</description>
    </item>
    '''
    description_xml_tag = BeautifulStoneSoup(description_blank_str)

    key_pair_locations = \
    [
        ("title", lambda x: x.name == u"title"),
        ("link", lambda x: x.name == u"link"),
        ("guid", lambda x: x.name == u"guid"),
        ("pubDate", lambda x: x.name == u"pubdate"),
        ("description", lambda x: x.name == u"description")
    ]

    tmp_description_tag_handle = deepcopy(description_xml_tag)

    for (key, location) in key_pair_locations:
        search_list = tmp_description_tag_handle.findAll(location)
        if(not search_list):
            continue
        tag_handle = search_list[0]
        tag_handle.clear()
        if(key == "description"):
            tag_handle.insert(0, CData(record[key]))
        else:
            tag_handle.insert(0, record[key])

    return tmp_description_tag_handle

test_dict = \
{
    "title" : "TEST",
    "link" : "TEST",
    "guid" : "TEST",
    "pubDate" : "TEST",
    "description" : "TEST"
}
print gen_xml(test_dict)