0
votes

I'm trying to use RSS::Parser from the Ruby standard library to parse an Atom feed, which sort of works.

When I access an extracted field such as .title, it returns <title>The title</title> rather than just The title. If you parse an RSS feed instead, .channel.title returns The title.

Is there any way to use the standard RSS::Parser for Atom feeds, or is this a bug?

I know there are alternatives like Feedzirra, but I would rather use the standard lib.

A quick test to see the problem in Ruby 1.9.3 and 2.0:

require "rss"
require "open-uri" # makes open() accept URLs
feed = RSS::Parser.parse(open("http://casadelkrogh.dk/atom.xml").read)
feed.title.to_s #=> "<title>CasaDelKrogh</title>"

2 Answers

3
votes

To get the content of the title, call .content on it instead of .to_s:

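require "rss"
require "open-uri" # makes open() accept URLs
feed = RSS::Parser.parse(open("http://casadelkrogh.dk/atom.xml").read)
# to_s serializes the whole element, tags included
feed.title.to_s
# => "<title>CasaDelKrogh</title>"
# content returns only the text inside the element
feed.title.content
# => "CasaDelKrogh"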
2
votes

It's not a bug.

The to_s method of RSS::Atom::Feed::Title is close to an inspection of the element: it serializes the whole element, tags included, back to XML.

Use feed.title.content if you want the title without the tags.
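
If you need one code path for both feed types, here is a minimal offline sketch. It assumes that an Atom title is an element object (so you need .content), while an RSS channel.title is already a plain String, as described in the question. The feed_title helper and the two inline sample feeds are hypothetical, made up for illustration; validation is turned off (second argument false) so that the minimal samples are accepted.

```ruby
require "rss"

# Hypothetical sample feeds, inlined so no network access is needed.
ATOM_XML = <<XML
<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <id>urn:example:feed</id>
  <title>CasaDelKrogh</title>
  <updated>2013-01-01T00:00:00Z</updated>
  <author><name>Example</name></author>
</feed>
XML

RSS_XML = <<XML
<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
  <channel>
    <title>The title</title>
    <link>http://example.com/</link>
    <description>Sample channel</description>
  </channel>
</rss>
XML

# Hypothetical helper (not part of the rss library): returns the feed
# title as plain text for Atom as well as RSS feeds.
def feed_title(feed)
  case feed
  when RSS::Atom::Feed
    feed.title.content   # Atom: title is an element object
  else
    feed.channel.title   # RSS: title is already a String
  end
end

# Second argument false skips validation of the sample feeds.
atom = RSS::Parser.parse(ATOM_XML, false)
rss  = RSS::Parser.parse(RSS_XML, false)

puts atom.title.to_s     # the serialized element, tags included
puts atom.title.content  # CasaDelKrogh
puts feed_title(atom)    # CasaDelKrogh
puts feed_title(rss)     # The title
```

Checking the class keeps the helper explicit about which feed type it handles; any other feed object falls through to the channel.title branch.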