Part of an app I'm building needs to check RSS feeds for updates. I'm looking for a reliable way to know if a feed has new entries.
I know that people sometimes publish posts dated in the future and then publish posts dated at the present time, which could cause a date-based check to miss some entries. It seems like there could be more complications than that as well. I also know that hashing the title or content would perform poorly and give unreliable results, since those can change without indicating a new entry. And I know that a few years ago when I was maintaining a podcast RSS feed manually I never changed the item.
So, I need some way to reliably check RSS, Atom, etc. feeds for new entries since they were last checked.
Specifically, this application will be written in Python for Google App Engine using Universal Feed Parser, but I doubt that matters too much in this case.
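
To make the problem concrete, here is a rough sketch of one approach that avoids dates and hashing: remember the identifiers of entries already seen and treat anything else as new. It is only an illustration under assumptions. The names `entry_key` and `find_new_entries` are hypothetical helpers, not part of Universal Feed Parser, and it assumes each entry exposes a stable `id` (RSS `guid` / Atom `id`) or at least a `link`.

```python
def entry_key(entry):
    """Return a stable identifier for an entry, or None if it has none.

    Assumption: the feed-supplied id is stable across edits; the link is
    used only as a fallback when no id is present.
    """
    return entry.get("id") or entry.get("link")


def find_new_entries(entries, seen_keys):
    """Return the entries whose key is not yet in seen_keys.

    seen_keys is a set that is updated in place, so the next call with the
    same set only reports entries that appeared since this one.
    Entries with neither id nor link are skipped, since they cannot be tracked.
    """
    new_entries = []
    for entry in entries:
        key = entry_key(entry)
        if key is None or key in seen_keys:
            continue
        seen_keys.add(key)
        new_entries.append(entry)
    return new_entries


# Plain dicts stand in for parsed feed entries here.
seen = set()
first_check = find_new_entries(
    [{"id": "a"}, {"link": "http://example.com/b"}], seen
)
second_check = find_new_entries(
    [{"id": "a", "title": "edited title"}, {"id": "c"}], seen
)
```

In this sketch an edited title does not make entry `a` look new, and publication dates play no part, so future-dated posts would not hide later ones. With Universal Feed Parser the `entries` list returned by `feedparser.parse(...)` could presumably be passed in directly, since its entries support `.get()`; the set of seen keys would have to be stored somewhere between checks.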