Ruby, Ruby on Rails, Artificial Intelligence and something more ...

Ver y Cocinar, descubre la cocina y disfrútala a simple vista

RailsConf Europe 2007 Speaker

RailsConf Europe 2007 Speaker
http://www.railsconfeurope.com/

Spejman Thumblog!

Monday, 14 May 2007

Backup of Bloglines keep new items

If you are a Bloglines user, you probably save the interesting posts using "keep new" feature.

If you have use this reader for a time, the quantity of post saved could be large... Some day I thought that won't be funny to loose all this data, then I build a ruby script to backup bloglines "keep new" items into a xml file.

With this script I learned better Mechanize and Hpricot Ruby libraries.
Este script me ha servido para probar dos librerías muy útiles de Ruby: mechanize y hpricot.

You need these libraries in order to use the script:


gem install json
gem install activesupport
gem install hpricot
gem install mechanize


And here you have the script. I think is very understandable. I hope it helps you to backup your bloglines account (remember to change EMAIL and PASSWORD values with yours) or to understand better how mechanize and hpricot works.

require "rubygems"
require "hpricot"
require "json"
require "mechanize"
require "active_support"

# Reads a bloglines javascript tree structure that has all
# feeds data.
def read_tree( tree_base, label = "" )
tree_base.each do |tree|
if tree["kids"]
read_tree tree["kids"], label + "/" + tree["n"]
else
@feeds << [tree["n"], label, tree["kn"], "http://www.bloglines.com/myblogs_display?sub=#{tree["id"]}&site=#{tree["sid"]}"]
end
end
end

# Add more memory to hpricot otherwise couldn't load some webs.
Hpricot.buffer_size = 262144

agent = WWW::Mechanize.new
page = agent.get 'http://www.bloglines.com/login'

form = page.forms[1]
form.email = 'EMAIL'
form.password = 'PASSWORD'

page = agent.submit form

# Get the bloglines sindicated feeds
menu_page = agent.get "http://www.bloglines.com/myblogs_subs"
start_text = "var initTreeData = "
end_text = "\n;\n"
js_feeds_tree_str = menu_page.content[menu_page.content.index(start_text)+start_text.size..menu_page.content.index(end_text)]
feeds_tree = JSON.parse js_feeds_tree_str.gsub("\\","")
@feeds = []
read_tree(feeds_tree["kids"])

puts "<bloglines_saves>"
@feeds.each do |feed|

page = agent.get feed[3]
doc= Hpricot(page.content)

# get the content of all saved feed posts
content = ((doc/"body")/"td.article")
next if content.empty?
puts "<feed name=\"#{feed[0].strip}\" folder=\"#{feed[1].strip}\">"

# Iterate each saved feed post
((doc/"body")/"a.bl_itemtitle").each_with_index do |title, index|
puts "<feed_save title=\"#{title.inner_html.strip}\" href=\"#{title.attributes["href"]}\">"
puts content[index].inner_html.to_xs
puts "</feed_save>"
end
puts "</feed>"

end
puts "</bloglines_saves>"

Más información:

4 comments:

admin said...

Thanks, great solution, works like a charm!

mikehedge said...

I want to export my "keep New" items from Bloglines.

what do I do? do I install Ruby?

Mcafee activate said...

www.mcafee.com/activate
mcafee livesafe retail card
mcafee activation code
www.mcafee.com/activate
roku.com/link

Roku & Echo Device said...

If your echo won't connect to wifi Or having any problem to setup Alexa to Wifi then Don't Worry we are here to help you just follow the simple steps which are given on our website. We'll help you to connect Alexa to wifi, connect echo to wifi and amazon echo not connecting to wifi and other problems. For instant help, call us at our amazon echo help number +1-888-745-1666.
Setting Up Amazon Dot
Connect Echo to Wi-Fi

About Me

Barcelona, Barcelona, Spain