Posts tagged “autonomous”.

Open Data Day Hackathon – CitySpender

Today was the International Open Data Day Hackathon and I helped coordinate a group of people in Ann Arbor, MI to participate pretty last minute. There was a post to the School of Information’s (my alma mater) mailing list saying, effectively, “Short notice but this looks cool.” I replied with “Yes, short notice, but that doesn’t mean we can’t do anything!”

With a quick rallying of the troops, notably Eli Neiburger of the Ann Arbor District Library and Ryan Burns of a2geeks, we had a free space for the event with free wifi and electricity (what else do you need?). I sent out the announcement that same day (Nov. 29th, 5 days ago). Even with the last minuteness, end of semesterness (Ann Arbor is a college town), and the holidays right around the corner, we had a really good turnout. Including myself, there were 8 people in attendance.

What we did:

This project has A) a name (CitySpender), B) a git repository, C) a license, and D) code that does stuff that isn’t yet checked into the repository :)

All in all I think this was a big success. Met some great new people who I’ll continue to work with on this project (we’ll be at the next Coffee House Coders in Ann Arbor this Wednesday night).

Expect more updates…

Other Launchpad installs in use?

I’ve been wondering about this for a while: Are there any other installations of (the wonderfully AGPL’d) Launchpad in use by a project (or group of projects)? And, if so, are they doing any fancy distributed bug tracking awesomeness like the bug comment import feature with Bugzilla? It would be great to see how other groups use the power of Launchpad in their own way.

What might even be more interesting is if a development company was using Launchpad as their internal dev platform (because they either didn’t want to pay Canonical to let them host a proprietary project or they simply wanted all their development to be hosted on their own servers); how are the features of LP either used or not used in that case?

Flickr Backup

As some of you are probably (way too) aware, I like to back up my social data from across the web (see what I do for backing up my Google Calendars). I actually dream of the day when there is an Ubuntu package I can install, give my credentials for a few websites (which it saves in my keyring), and then watch it create an initial backup of all my data across all my services. Why do this? Well, aside from the mantra of “keep your own backups!” in case of a service malfunction (remember when Gmail went down for a few hours? I do; people went crazy), there is also the personal desire to be able to migrate to a new service should I wish to in the future. If I find a better photo sharing service for some reason, I want to migrate my data/photos to it easily.

Now, backing up flickr.

There are a few very important pieces of information to back up from flickr, and two of them I can do right now: my photos and my stats (views/referrals of my photos).

Photos
Tool: FlickrTouchr
Why #1: Do you back up your ~/Photos directory? If your answer is “No” or “Infrequently” you might really like this when your hard drive crashes and you don’t have local copies of those awesome photos from your awesome vacation.

Why #2: Do you take photos with your cell phone and upload them directly to flickr? Do you then clear them off your phone because they take up valuable space? This will make sure you have a copy of those on your own machine for easy editing/backup (see #1).

What it does: This one does what it does very well. It authorizes itself with your flickr account and then proceeds to download all of your photos (including your private ones, hence needing to authorize). Also, if you use the Sets feature of flickr, it keeps those associations by creating directories with the sets’ names. So, my directory structure that flickrtouchr creates for my account looks like this:

greg@rose:~/backup/flickr/photos$ ls -1
Bike Lock Fail Blog
Bike Ride – 20090628
Botanical Garden, July 4th, 2009
Bug Jam
Favourites
Gettysburg Trip
Jaunty Release Party
Mackinac Island Trip
No Set
SF – 2008
touchr.frob.cache
Traverse City – December ’09
UDS Karmic

You’ll see the “No Set” directory, which is where all the photos that are NOT part of any set end up.

How:
If you are going to run this script manually on your local machine with a web browser, you’ll be just fine; just follow the instructions it gives you. If, however, you are like me and want to run this via a cron job on a server on a regular basis, you’ll need to take a few extra steps.
1. Start it on your local computer and it will authorize itself via your browser.
2. Kill it (CTRL+C) so you don’t have to sit there and wait for it to finish downloading all of your photos.
3. Copy the touchr.frob.cache file to your server and put it in the folder you’re going to backup your photos to.

Now when it runs it will pick up your credential information from that file and run as expected. Put this in your crontab and you will always have a backup of your photos:
5 23 * * * python /home/greg/src/scripts/flickrtouchr.py
Don’t worry about running it every night; if a photo is already downloaded it just skips it (i.e., it does The Right Thing®).
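If the cache file never made it to the server, the script would presumably try to authorize through a browser again, which is not much use under cron. Here is a minimal sketch of a guard for that case. The helper name run_if_authorized is hypothetical (it is not part of flickrtouchr), and the backup command is whatever you pass after the directory.

```shell
#!/bin/sh
# Hypothetical cron wrapper: only run the backup command when the cached
# authorization file (touchr.frob.cache) is already in the backup directory.

run_if_authorized() {
    backup_dir=${1%/}
    shift
    if [ ! -f "$backup_dir/touchr.frob.cache" ]; then
        echo "no touchr.frob.cache in $backup_dir; authorize locally first" >&2
        return 1
    fi
    # Run the backup command that was passed in after the directory.
    "$@"
}

# Example call from a wrapper script (paths are placeholders):
# run_if_authorized /home/greg/backup/flickr/photos \
#     python /home/greg/src/scripts/flickrtouchr.py
```

The non-zero return when the file is missing means cron's mail (if you have it set up) tells you to redo the local authorization step.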

COOL! Now you have your photos backed up!

Statistics
Tool: My flickr-stats-export.sh based on this unnamed Github Gist
Why #1: I like numbers and these are the “raw” CSV files that flickr is producing for your photos. It tells you how many times your photos are viewed and what the referrer was.
Why #2: The stats are going away!

What it does: Pretty simply, it goes to your stats download page and downloads all the CSV files linked from it. You can see that page by going to this url: http://www.flickr.com/photos/YOURUSERNAME/stats/downloads/ (fill in your username). It then makes a tar.gz of these to save space.
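To give a feel for the "find the CSV links" step, here is a rough sketch. It is not the code from flickr-stats-export.sh. The function name extract_csv_urls is made up for illustration, and it assumes the page links to the files with plain double-quoted href attributes ending in .csv.

```shell
#!/bin/sh
# Sketch: print every double-quoted href that ends in .csv, one per line,
# from HTML read on standard input. The page markup is an assumption.

extract_csv_urls() {
    grep -o 'href="[^"]*\.csv"' | sed -e 's/^href="//' -e 's/"$//'
}

# Possible usage (curl's -s and -b flags; fill in your own username):
# curl -s -b cookies.txt \
#     "http://www.flickr.com/photos/YOURUSERNAME/stats/downloads/" |
#     extract_csv_urls
```

Each URL that comes out could then be fetched with curl using the same cookies.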

How:
How about we let it tell you:

greg@zen:~/src/scripts$ ./flickr-stats-export.sh --help

Usage: ./flickr-stats-export.sh DIRECTORY USERNAME COOKIES

DIRECTORY:
Directory to save the flickr-stats.tar.gz file of stats .CSVs

USERNAME:
Your flickr.com username

COOKIE:
See the -b flag from the CURL manpage.
It can be the contents of a cookie file or the full filename of the cookie file.
I recommend getting the cookie file from flickr using Firebug, then saving that
in the directory you plan to save the stats files.

If there is already a cookiejar.txt file in the download directory,
we will use that instead and this can be left blank.
See the -c flag from the CURL manpage for more on cookiejars.

As you can see, it needs your flickr cookies to run, so: 1) Install Firebug and Firecookie. 2) Log in to flickr. 3) Go to the cookie tab in Firebug, then the Cookies dropdown, and select “Export Cookies For This Site.” 4) Save that file somewhere.

I run this from my server, so I copied that cookies.txt file to the ~/backup/flickr/stats/ directory and then ran
./flickr-stats-export.sh ~/backup/flickr/stats/ grggrssmr ~/backup/flickr/stats/cookies.txt

I would suggest running this automatically so you don’t miss any stats. But you only need to do it monthly, as the stats CSV files are only updated on the first of each month. So, I have this in my crontab:
0 12 1 * * /home/greg/src/scripts/flickr-stats-export.sh /home/greg/backup/flickr/stats/ grggrssmr

Notice that I left off the cookies.txt? That is because after the first time it runs it saves the cookies in a “cookiejar.txt” file in the stats directory, and if that file is there, it uses it.
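The "use cookiejar.txt if it is there, otherwise use the argument" rule can be sketched roughly like this. This is an illustration of the idea only, not the actual code in flickr-stats-export.sh, and pick_cookie_source is a hypothetical name.

```shell
#!/bin/sh
# Sketch: print the cookie source to hand to curl's -b flag.
# Prefers DIR/cookiejar.txt when it exists, else falls back to the
# COOKIES argument, else fails.

pick_cookie_source() {
    dir=${1%/}
    cookie_arg=${2:-}
    if [ -f "$dir/cookiejar.txt" ]; then
        printf '%s\n' "$dir/cookiejar.txt"
    elif [ -n "$cookie_arg" ]; then
        printf '%s\n' "$cookie_arg"
    else
        echo "no cookiejar.txt in $dir and no COOKIES argument given" >&2
        return 1
    fi
}

# Possible usage with curl (-b reads cookies, -c writes the jar back out):
# cookies=$(pick_cookie_source "$DIRECTORY" "$COOKIES") || exit 1
# curl -s -b "$cookies" -c "$DIRECTORY/cookiejar.txt" "$URL"
```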

That cron job runs at noon (Eastern time zone, where my server is) on the 1st day of every month. Why? This data will only be available until June 1st, 2010 at noon PDT (Pacific time zone). So, I picked a time 3 hours before the data disappears so that A) I won’t miss it and B) flickr has time to generate my data for the month of May. After June, you can remove this from your crontab as it won’t do much after the files are gone.
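If you want to double check the 3 hour margin, the arithmetic is small enough to do in the shell. This sketch assumes both coasts are on daylight time on June 1st (EDT is UTC-4, PDT is UTC-7); the function name is just for illustration.

```shell
#!/bin/sh
# Convert an hour of the day from US Eastern daylight time (UTC-4)
# to US Pacific daylight time (UTC-7). Pass the hour without a leading zero.

eastern_to_pacific_hour() {
    echo $(( ($1 + 4 - 7 + 24) % 24 ))
}

eastern_to_pacific_hour 12   # prints 9
```

Noon Eastern is 9am Pacific, which is 3 hours before the noon PDT cutoff.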

Luckily, if you forget to remove the script’s entry from your crontab after that date, it will just exit if it doesn’t have any .csv urls to download. So, it won’t try to make a tar.gz of empty files and save empty data over your last good flickr-stats.tar.gz.
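The "don't clobber the last good archive" idea looks roughly like this as a sketch. It is not the script's actual code: the real script checks for .csv urls before downloading, while this version checks for .csv files already in the directory, and archive_stats is a made-up name. It assumes the directory only holds the CSVs you want archived.

```shell
#!/bin/sh
# Sketch: build DIR/flickr-stats.tar.gz from the .csv files in DIR, but only
# when there is at least one, so an empty run keeps the previous archive.

archive_stats() {
    dir=${1%/}
    # Expand the glob; if nothing matched, $1 is the literal pattern
    # and does not exist as a file.
    set -- "$dir"/*.csv
    if [ ! -e "$1" ]; then
        echo "no .csv files in $dir; keeping existing archive" >&2
        return 0
    fi
    (
        cd "$dir" || exit 1
        # Write under a temporary name first, then rename into place.
        tar -czf flickr-stats.tar.gz.tmp ./*.csv &&
            mv flickr-stats.tar.gz.tmp flickr-stats.tar.gz
    )
}

# Possible usage:
# archive_stats /home/greg/backup/flickr/stats
```

Writing to a temporary name and renaming afterwards also means a tar failure halfway through leaves the old archive alone.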

Future Research
So, with those two things you have the photos and the statistics from your flickr account. However, that isn’t everything. I am working on extending flickrtouchr to also download the photo metadata (title, description, tags, comments, license) which it doesn’t save. With that metadata I will either create a metadata xml file associated with the jpg or embed the info INTO the jpg using the XMP standard (see: python-xmp-toolkit). You can see what I’m doing at this launchpad branch. Please feel free to branch it and help out!

A quick backup script for you tonight

I just got back from a great day at the Ubuntu Michigan LoCo edition of the Global Jam, where we tested Lucid on a ton of different hardware. It was a great time. See the photos.

But what I want to share with you right now is a quick script I whipped up to back up my Google Calendars nightly. This is one of the steps in my ongoing process of making sure all of my personal data is backed up by me on machines I control, with an eye to migrating to self (or friend) hosted services. Yes, I want services I use to follow the Franklin Street Statement.

Until the day that all of the services I use follow the Franklin Street Statement recommendations, I will just have to make sure I make personal backups of my information. So tonight, I finally did that for Google Calendars. It was pretty simple, really:

#!/bin/sh
# Back up my Google Calendars

BACKUP_DIR="/home/greg/backup/google"
TODAY=$(date +%F)

# One dated .ics file per calendar
WORK="$BACKUP_DIR/work-$TODAY.ics"
PERSONAL="$BACKUP_DIR/personal-$TODAY.ics"
OPENMICHIGAN="$BACKUP_DIR/open_michigan-$TODAY.ics"
MILOCO="$BACKUP_DIR/miloco-$TODAY.ics"

# Fetch each calendar (replace the placeholders with your private .ics URLs)
wget private_url_for_work_calendar -O "$WORK"
wget private_url_for_personal -O "$PERSONAL"
wget private_url_for_otherwork -O "$OPENMICHIGAN"
wget private_url_for_the_loco -O "$MILOCO"

# Remove .ics files that are older than 1 week
find "$BACKUP_DIR" -name '*.ics' -mtime +7 -exec rm -f {} \;

That’s it. Create the filenames for the various calendars I’m backing up, including today’s date. Then wget them. Then, delete any .ics file that is older than a week. Not sure why I need 7 days of backup, but better safe than sorry, I guess.
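If you want to convince yourself the cleanup step only touches old .ics files before trusting it with real backups, here is a small sketch you can point at a scratch directory. The function name prune_old_ics is hypothetical. One detail worth knowing: find counts age in whole 24-hour periods, so -mtime +7 only matches files that are at least 8 days old.

```shell
#!/bin/sh
# Sketch: delete .ics files under DIR whose age, in whole days, is greater
# than DAYS. Files with other extensions are left alone.

prune_old_ics() {
    find "$1" -name '*.ics' -type f -mtime +"$2" -exec rm -f {} \;
}

# Possible usage:
# prune_old_ics /home/greg/backup/google 7
```

To try it out, make a couple of empty files in a temporary directory, backdate one with touch -t, and run the function on that directory.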