I came across Archive Flagged Items from NetNewsWire into Yojimbo (via) and found it to be a really neat tool. I've started using it to keep a more permanent record of interesting news articles.
I have lots of news articles flagged, and the import has crashed on me a few times for various reasons, which left some duplicate web archives behind. The duplicates usually appear because Yojimbo followed a redirect to the actual article, so the URL in NetNewsWire doesn't match the URL stored in Yojimbo and the check for existing items on load fails.
This was a great opportunity for me to explore RubyOSA and continue learning Ruby, so I wrote a script that detects duplicate web archives by name and marks them with a "Duplicate" label.

#!/usr/local/bin/ruby

require 'rubygems'
require 'rbosa'

yojimbo = OSA.app('Yojimbo')

seen = Hash.new(false)

# Duplicate label details
duplicateLabelName = 'Duplicate'
# An almost painful red
duplicateLabelColor = [65535, 1536, 4628]
duplicateLabel = nil

# See if we already have a label named 'Duplicate' and save it
yojimbo.labels.map { |l| l.name == duplicateLabelName && duplicateLabel = l }

# Make a label if we don't have it already
if (duplicateLabel.nil?)
  duplicateLabel = yojimbo.make(OSA::Yojimbo::Label,
                                nil,
                                :color => duplicateLabelColor,
                                :name => duplicateLabelName)
end

yojimbo.web_archive_items.each do |f|
  if (seen.include?(f.name))
    puts "Found a duplicate: #{f.name}"
    f.label = duplicateLabel
    seen[f.name].label = duplicateLabel
  else
    seen[f.name] = f
  end
end
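
If you want to try the name-matching idea without Yojimbo or RubyOSA installed, here is a minimal sketch in plain Ruby. The Item struct, the find_duplicates and mark_duplicates helpers, and the example URLs are all made up for illustration; they only stand in for the objects RubyOSA returns. It also shows why comparing URLs would miss the duplicates: the two "Ruby news" entries share a name but not a URL.

```ruby
# Hypothetical stand-in for a Yojimbo web archive item. The real script
# gets these objects from RubyOSA; this struct is only for illustration.
Item = Struct.new(:name, :url, :label)

# Returns every item whose name is shared with at least one other item.
def find_duplicates(items)
  items.group_by(&:name).values.select { |group| group.size > 1 }.flatten
end

# Sets the label on every duplicate and returns the labelled items.
def mark_duplicates(items, label = 'Duplicate')
  find_duplicates(items).each { |item| item.label = label }
end

# Example data (made up): the first two entries are the same article,
# archived once from the feed URL and once from the redirected URL.
items = [
  Item.new('Ruby news', 'http://feeds.example.com/1'),
  Item.new('Ruby news', 'http://example.com/ruby-news'),
  Item.new('Other story', 'http://example.com/other')
]

mark_duplicates(items).each do |item|
  puts "Found a duplicate: #{item.name} (#{item.url})"
end
```

Grouping by name first means every copy gets labelled in a single pass, which is the same end result as the script above, where the first copy is labelled again each time another match turns up.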
