1
votes

I'm parsing various site feeds, and putting together a small library to help me do it.

Looking at the Atom RFC and RSS 2.0 specification, feeds from Twitter seem to be a combination. Twitter specifies an Atom namespace in an RSS 2.0 structure?

GitHub uses Atom, whereas Flickr appears to use RSS 2.0 (it offers multiple formats, but that is what the default 'Latest' feed from user profiles uses).

How can Twitter specify an Atom namespace and then use RSS?

This makes parsing feeds a little ambiguous, unless I ignore any specified namespace and just examine the document structure.

2 Answers

3
votes

Twitter's feed is RSS. It doesn't declare a namespace for RSS at all, because RSS 2.0 elements don't belong to one. It declares the Atom namespace only because it uses some Atom elements internally (specifically, an atom:link that refers back to the URL of the feed).

Notice that the xmlns declaration for Atom has a prefix (xmlns:atom), which means that only the elements in the document carrying the atom: prefix are from that namespace. Everything else is plain RSS.
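To see the prefix rule in action, here is a minimal sketch using Python's standard xml.etree.ElementTree. The sample document is made up for illustration (it only imitates the structure described above, it is not Twitter's actual output), and the example.com URLs are placeholders.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 feed that borrows one element from Atom.
SAMPLE_RSS = """<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Example timeline</title>
    <link>http://example.com/</link>
    <atom:link href="http://example.com/feed.rss" rel="self" type="application/rss+xml"/>
    <item><title>First post</title></item>
  </channel>
</rss>"""

root = ET.fromstring(SAMPLE_RSS)
channel = root.find("channel")

# ElementTree expands prefixes to "{namespace-uri}localname".
# Unprefixed elements stay bare because no default namespace is declared.
tags = [child.tag for child in channel]
for tag in tags:
    print(tag)
```

Only the atom:link child comes back with the Atom namespace URI attached; channel, title, link and item have no namespace at all.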

In practice, the simplest way to disambiguate is to look at the root element. If it's Atom, the root will be feed. If it's RSS, the root will be rss.
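A rough sketch of that root-element check, again with the standard library. The function name detect_feed_type and the two tiny sample documents are my own inventions for illustration; the Atom namespace URI is the one defined by the Atom RFC.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"

def detect_feed_type(xml_text):
    """Return "rss", "atom" or "unknown" based only on the root element."""
    root = ET.fromstring(xml_text)
    if root.tag == "rss":
        # RSS 2.0 elements are not in any namespace.
        return "rss"
    if root.tag == "{%s}feed" % ATOM_NS:
        # Atom's root is <feed> in the Atom namespace.
        return "atom"
    return "unknown"

# Hypothetical minimal documents, just enough to exercise the check.
rss_doc = (
    '<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">'
    "<channel><title>t</title></channel></rss>"
)
atom_doc = '<feed xmlns="http://www.w3.org/2005/Atom"><title>t</title></feed>'

print(detect_feed_type(rss_doc))
print(detect_feed_type(atom_doc))
```

Note that the RSS sample declares the Atom namespace and is still reported as RSS, since the declaration alone doesn't change what the root element is.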

1
votes

It's definitely RSS. For one, Atom feeds don't use channel or item, so going by the specification you can rule out Atom. And it's not against the RSS spec to declare a namespace: RSS 2.0 allows elements from other namespaces as extensions.

Regardless, you shouldn't have to worry about parsing feeds yourself; use an existing feed parser to do the work for you.