I've got an issue with an Atom feed that I'm generating: the entries are being duplicated in readers.

I've developed a C# class for creating Atom feed entries, and ultimately a full feed object, from my data elements. It conforms to RFC 4287, the Atom Syndication Format.
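Roughly, the shape is something like the sketch below, built on .NET's System.ServiceModel.Syndication types; this is not my actual class, and the titles, URLs, and dates are placeholders.

    using System;
    using System.ServiceModel.Syndication;
    using System.Xml;

    class FeedBuilderSketch
    {
        static void Main()
        {
            // Placeholder title/URL/date values; the real ones come from my data elements.
            var entry = new SyndicationItem(
                "Post title",
                "Post summary",
                new Uri("http://www.oldtownhome.com/example-post/"),   // alternate link
                "http://www.oldtownhome.com/example-post/",            // atom:id
                new DateTimeOffset(2012, 5, 1, 12, 0, 0, TimeSpan.Zero));

            var feed = new SyndicationFeed(
                "Old Town Home",
                "Main feed",
                new Uri("http://www.oldtownhome.com/"),
                "http://www.oldtownhome.com/index.atom",
                DateTimeOffset.UtcNow)
            {
                Items = new[] { entry }
            };

            // Serialize as Atom 1.0 (RFC 4287).
            using (var writer = XmlWriter.Create(Console.Out))
            {
                new Atom10FeedFormatter(feed).WriteTo(writer);
            }
        }
    }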

The feed is generated properly. It doesn't currently validate, since I'm adding two non-standard extension elements and haven't yet created a proper namespace for them, but this issue exists even when the feed validates.
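For what it's worth, the two extension elements get attached roughly like this; the "ext" prefix and the example.com namespace URI are placeholders, since the proper namespace doesn't exist yet.

    using System.ServiceModel.Syndication;
    using System.Xml;
    using System.Xml.Linq;

    static class ExtensionSketch
    {
        // Placeholder prefix/URI for the not-yet-created namespace.
        static readonly XNamespace Ns = "http://www.example.com/atom-extensions";

        public static void AddCustomElements(SyndicationFeed feed, SyndicationItem entry)
        {
            // Declare the prefix on the <feed> element so the extension elements can validate.
            feed.AttributeExtensions.Add(
                new XmlQualifiedName("ext", "http://www.w3.org/2000/xmlns/"),
                Ns.NamespaceName);

            // Attach a namespaced extension element to an entry.
            entry.ElementExtensions.Add(new XElement(Ns + "customElement", "value").CreateReader());
        }
    }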

I'm also seeing two different behaviors between my direct Atom feed and the FeedBurner feed.

Here are the two feeds:

http://feeds.feedburner.com/oldtownhome/
http://www.oldtownhome.com/index.atom

I've subscribed to both feeds in Google Reader, and other users report seeing the same issue, but when they see it doesn't line up with when I do.

Issue with feed #1: Items that are currently in the feed (a 25-entry feed) duplicate at random. This can be a single duplicate in one day, duplicates of several items spread over several days, or sometimes the entire feed contents appearing to be "republished" within a single day, even though the same entries already appeared on previous days.

Issue with feed #2: Posts seem to duplicate at random, even after the entries have fallen off the main feed (though they may still be available via other Atom feeds on the site, such as individual category feeds).

I've run through everything I can think of. I've ensured the pubDate never changes, that each link to the document stays the same from the time of publish, and I've added a node with the post's actual, unchanging guid as its value, but nothing seems to help. I even forced FeedBurner to use my XML in the hope that the issue might have been on FeedBurner's side.
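To be concrete, the invariants I'm describing look roughly like this; the names are illustrative, and the Post type is a hypothetical stand-in for my actual data elements.

    using System;
    using System.ServiceModel.Syndication;

    // Hypothetical post record standing in for my actual data elements.
    class Post
    {
        public string PermanentGuid;       // stable identifier (the post's guid)
        public string Permalink;           // the unchanging link to the document
        public string Title;
        public DateTimeOffset PublishedUtc;
        public DateTimeOffset LastEditedUtc;
    }

    static class EntrySketch
    {
        // Regenerating the feed for an unchanged post always yields the same id,
        // link, and dates -- the values readers are supposed to key on for de-duplication.
        public static SyndicationItem ToEntry(Post post)
        {
            var entry = new SyndicationItem
            {
                Id = post.PermanentGuid,              // atom:id, never changes
                Title = new TextSyndicationContent(post.Title),
                PublishDate = post.PublishedUtc,      // set once at publish time
                LastUpdatedTime = post.LastEditedUtc  // only moves on a real edit
            };
            entry.Links.Add(SyndicationLink.CreateAlternateLink(new Uri(post.Permalink)));
            return entry;
        }
    }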

I'm at a loss and am hoping others have had a similar situation and have some advice.

Update: A possibly related item: the FeedBurner-delivered "Latest Posts" email that came through yesterday reported 25 new posts (the total number in the main feed, which is where it derives this information). Of these 25 posts, 24 were not new and had already been delivered one at a time over the previous 30 days or so. Only 1 post was actually new, and it was at the top, lumped in with the rest of the messages.

Is it possible this has to do with some connectivity issue where FeedBurner isn't able to access my feed (because it's down or something), and then when the feed is back online FeedBurner thinks the entire contents are new? I've not had any extended outages with my server for over a year, but I have had issues that possibly lasted anywhere from 30 seconds to 5 minutes.

This is the most frustrating kind of issue because FeedBurner and Google Reader are both such black boxes.

1 Answer

If anyone stumbles on this random post while looking for a solution to duplicate posts in Google Reader, I think I may have discovered the root cause of the issue, and it's annoying.

The blog had many, many Atom feeds, but only one primary feed. The primary feed listed the 25 most recent posts at any given time, but beyond that, other discoverable feeds were listed in the metadata of the content: category feeds, comment feeds, page-specific feeds, popular-posts feeds, etc. It seems that Google Reader, in all of its wisdom, was crawling all of these feeds and treating them as completely distinct feeds and items, even though the posts all carried the same unique id (the post's URL). Once I removed all of these as discoverable feeds, and also redirected the main non-www feed to the www one so that all URLs are unique and not duplicated, all seems to be well with the world and Google Reader is no longer duplicating the content.
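The discoverable-feed cleanup was just a template change (dropping the extra <link rel="alternate" type="application/atom+xml"> tags from the page head). The non-www redirect can be done a number of ways; here's a rough sketch of the redirect in ASP.NET, assuming a Global.asax and using my host names as an example:

    using System;
    using System.Web;

    // Global.asax.cs sketch (assumes an ASP.NET site): 301 the bare domain to www
    // so there is only one canonical URL for the feed and for each post.
    public class Global : HttpApplication
    {
        protected void Application_BeginRequest(object sender, EventArgs e)
        {
            var url = Request.Url;
            if (url.Host.Equals("oldtownhome.com", StringComparison.OrdinalIgnoreCase))
            {
                Response.StatusCode = 301;
                Response.AddHeader("Location", "http://www.oldtownhome.com" + url.PathAndQuery);
                Response.End();
            }
        }
    }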

Well, that was many months of annoying and frustrating work trying to troubleshoot a service that has absolutely no ability to debug or provide useful information to a developer.

I hope this helps someone...someday.