I'm using Python to build an application that works much like an RSS aggregator, with the feedparser library handling the parsing. However, I'm struggling to get the program to correctly detect when there is new content.
I'm mainly concerned with news-related feeds. Besides detecting when a new item has been added to a feed, I also want to detect when an existing article has been updated. Does anybody know how I can use feedparser to do this, bearing in mind that an RSS item is only required to contain either a title or a description? I'm willing to assume that the link element will always be present as well.
feedparser's "id" attribute for each item seems to simply be the link to the article, so it may help with detecting new articles in the feed, but not with detecting updates to existing articles, since their "id" will not have changed.
I've looked at previous threads on Stack Overflow, and some people have suggested hashing the content or hashing title + URL, but I'm not really sure what that means or how one would go about it (if indeed it is the right approach).
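From what I can gather, the hashing suggestion amounts to something like the minimal sketch below: build a fingerprint from each item's text, remember it under the item's "id" (or link), and compare on the next fetch. The names `entry_fingerprint`, `classify_entry` and `seen` are my own placeholders, and I'm assuming each entry behaves like a dictionary with `title`, `link` and `summary` keys (feedparser entries appear to work this way, with the description exposed as `summary`).

```python
import hashlib


def entry_fingerprint(entry):
    """Return a SHA-256 hex digest of an entry's title, link and summary.

    Missing fields are treated as empty strings, since only a title or a
    description is guaranteed to be present.
    """
    parts = [
        entry.get("title") or "",
        entry.get("link") or "",
        entry.get("summary") or "",
    ]
    # Join with a separator that is unlikely to occur in the text, so that
    # ("ab", "c") and ("a", "bc") do not produce the same digest.
    joined = "\x1f".join(parts)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def classify_entry(entry, seen):
    """Classify an entry as 'new', 'updated' or 'unchanged'.

    `seen` is a hypothetical dict mapping an entry key (id, else link) to
    the fingerprint stored on the previous run; it is updated in place.
    """
    key = entry.get("id") or entry.get("link")
    fingerprint = entry_fingerprint(entry)
    previous = seen.get(key)
    seen[key] = fingerprint
    if previous is None:
        return "new"
    return "updated" if previous != fingerprint else "unchanged"
```

Presumably I would then call `classify_entry(entry, seen)` for each entry in `feedparser.parse(url).entries` and persist `seen` between runs (in a file or database), but I don't know whether this is robust enough for real news feeds, for example when a feed changes only whitespace or markup in the description.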