Somebody asked for an RSS feed because they love the Advent calendar so much, yet being a Perl hacker they also exhibited the virtue of Laziness and didn't feel like checking every 20 minutes to see if I'd released the day's writeup yet. Besides, Mark used to have one. I, on the other hand, did not feel it was worthwhile for such an ephemeral urgency, particularly since there was no question of whether there would be an update, only when. Our previous guest author, William 'N1VUX' Ricker, however, was interested in learning a little about RSS and felt it a worthy endeavor. We therefore proudly present you with an RSS feed. Enjoy!
P.S. If you're at a loss for uses/relevance of RSS you might try something like this, creating a feed for your favorite comics.
Adding RSS to the Calendar isn't as bad as I thought: there's a "Simple" module, from Sean Burke of course, XML::RSS::SimpleGen.
After the usual painless install and a little cargo-cult hacking of the POD example, we're already half done ...
$ cat yapac.rss # first version
<?xml version="1.0"?>
<?xml-stylesheet title="CSS_formatting" type="text/css"
href="http://www.interglacial.com/rss/rss.css"?>
<rss version="2.0" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/">
<channel>
<!-- Generated with Perl's XML::RSS::SimpleGen v11.11 -->
<link>http://web.mit.edu/belg4mit/www/</link>
<title>YAPAC</title>
<description>Yet Another Perl (Advent) Calendar</description>
<language>en</language>
<lastBuildDate>Thu, 15 Dec 2005 02:31:27 GMT</lastBuildDate>
<skipHours><hour>0</hour><hour>1</hour><hour>3</hour>...<hour>23</hour></skipHours>
<sy:updateFrequency>1</sy:updateFrequency>
<sy:updatePeriod>daily</sy:updatePeriod>
<sy:updateBase>1970-01-01T02:40+00:00</sy:updateBase>
<ttl>1440</ttl>
<webMaster>jpierce@cpan.org</webMaster>
<docs>http://www.interglacial.com/rss/about.html</docs>
<item>
<title>1..5</title>
<link>http://web.mit.edu/belg4mit/www/5/</link>
<description>On day 5/, my true language gave to me</description>
</item>
<item>
<title>6</title>
<link>http://web.mit.edu/belg4mit/www/6/</link>
<description>On day 6/, my true language gave to me</description>
</item>
etc...
</channel></rss>
Since the main page doesn't have descriptions or titles, there's a slight problem with reusing the POD example's main-page scrape to generate descriptions. So we either have to fill in a default as above, or fetch each page (or just the header of the page) and grab its <title> to fill in the <description>. Fetching a title was already discussed on Catchup Day 1..5, so it shouldn't be too hard.
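The "title if we can get one, default otherwise" choice can be sketched roughly as below. This is only an illustration: title_from_html and description_for are hypothetical helper names, and a plain regex stands in for HTML::HeadParser purely to keep the sketch free of non-core modules. It is not robust against arbitrary HTML.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: pull the text of <title> out of an HTML string.
# A regex is an assumption made for brevity; the real script uses
# HTML::HeadParser, which copes with far messier markup.
sub title_from_html {
    my ($html) = @_;
    return unless defined $html;
    return unless $html =~ m{<title[^>]*>\s*(.*?)\s*</title>}is;
    my $title = $1;
    $title =~ s/\s+/ /g;    # collapse newlines and runs of spaces
    return length($title) ? $title : undef;
}

# Hypothetical helper: use the page's own title, else a canned default.
sub description_for {
    my ($html, $default) = @_;
    my $title = title_from_html($html);
    return defined $title ? $title : $default;
}

my $default = 'Advent Calendar Page - No Title';
my $page    = '<html><head><title>YA Perl Advent Calendar 2005: '
            . 'Buzzword Bingo</title></head></html>';

print description_for($page, $default), "\n";
print description_for('<html><body>no head</body></html>', $default), "\n";
```

The first call prints the page's title and the second falls back to the default, which is the same shape as the `||` chain in the full script.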
So what's left to do after the quick hack?
Let's add a little trace output so we know what it's doing ...
$ perl -I XML-RSS-SimpleGen-11.11: modXRS.pl
5/ 1..5 'YA Perl Advent Calendar 2005: Catchup'
6/ 6 'YA Perl Advent Calendar 2005: On the ordinate(6) day of X-Mas'
7/ 7 'YA Perl Advent Calendar 2005-12-07'
8/ 8 'YA Perl Advent Calendar 2005: On the 8E00000000 day of Advent my True Language brought to me...'
9/ 9 'YA Perl Advent Calendar 2005: Buzzword Bingo'
10/ 10 'YA Perl Advent Calendar 2005: Tarball Toolbelt'
11/ 11 'YA Perl Advent Calendar 2005: Conjunction Junction'
12/ 12 'YA Perl Advent Calendar 2005: re-run'
13/ 13 'YA Perl Advent Calendar 2005: A penny saved is a penny earned'
14/ 14 'YA Perl Advent Calendar 2005: Keeping it clean'
15/ 15 'YA Perl Advent Calendar 2005: SCALAR(0xdeadbeef)'
Note: It won't recreate a file if the results don't differ, so for testing
I'm rm-oving the file each time so I can see if it actually creates something.
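The general "don't rewrite an unchanged file" idea looks something like the sketch below. It is not the code inside XML::RSS::SimpleGen; save_if_changed and the scratch file name demo-feed.xml are made up for illustration, and it only shows why deleting the file first forces a visible write.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper, not part of XML::RSS::SimpleGen: write $content to
# $path only when it differs from what is already on disk.
# Returns 1 if the file was (re)written, 0 if it was left alone.
sub save_if_changed {
    my ($path, $content) = @_;

    if (-e $path) {
        open my $in, '<', $path or die "Cannot read $path: $!";
        my $old = do { local $/; <$in> };    # slurp the whole file
        close $in;
        return 0 if defined $old && $old eq $content;
    }

    open my $out, '>', $path or die "Cannot write $path: $!";
    print {$out} $content;
    close $out or die "Cannot close $path: $!";
    return 1;
}

my $file = 'demo-feed.xml';    # assumed scratch file name
unlink $file;                  # the "rm" step: start from nothing
print save_if_changed($file, "<rss/>\n") ? "written\n" : "unchanged\n";
print save_if_changed($file, "<rss/>\n") ? "written\n" : "unchanged\n";
unlink $file;                  # tidy up after the demo
```

With the file removed first, the first call writes it and the second call, given identical content, leaves it alone.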
To let browsers and aggregators discover the feed, a link like this goes in a page's <head>:
<link rel="alternate" type="application/rss+xml" title="RSS" href="../yapac-rss.xml">
#!/usr/bin/env perl
use warnings;
use strict;
use Carp;

sub utility::get_title;

# A complete screen-scraper and RSS generator
# adapted from XML::RSS::SimpleGen POD

use strict;
use XML::RSS::SimpleGen;
my $url = q<http://web.mit.edu/belg4mit/www/>;

rss_new( $url, "YAPAC", "Yet Another Perl (Advent) Calendar" );
rss_language( 'en' );
rss_webmaster( 'jpierce@cpan.org' );
# image is not supposed to be a favicon, but a GIF, skip for now.
# rss_image("http://yourpath.com/icon.gif",32,32);
rss_daily();

get_url( $url );
my @pages; # List of things to process

while(
  # was
  # m{<h4>\s*<a href='/(.*?)'.*?>(.*?)</a>\s*</h4>\s*<p.*?>(.*?)<a href='/}sg
  # now must match
  # <br><div><a href="10/" style="left: 375px; top: 255px;">10</a></div>

  m{<div> \s* <a \s href="(\d+/)" [^>]* > ([^<>]*) </a> \s* </div> }xisg

) {

  my ($page, $linkText, $title)=($1,$2, undef); #$3 is empty

  # Defer with agenda
  push @pages, {page=>$page, link=>$linkText, title=>$title};
}

# now work the agenda, once we've finished the previous parse.
for my $pageRef (@pages) {
  my ($page, $link, $title)=(@$pageRef{qw{page link title}});
  $title ||= utility::get_title($page) || "Advent Calendar Page - No Title";
  print "$page $link '$title' \n";
  rss_item("$url$page", $link, $title ) ;
}


croak "No items in this content?! {{\n$_\n}}\nAborting"
  unless rss_item_count();

rss_save( 'yapac-rss.xml', 45 );
print "success\n";

exit;

### Reuse HTML::HeadParser from day 5
package utility;
use LWP::Simple;
use HTML::HeadParser;
use Carp;

# not safe for mod_perl ...

sub get_title {
  my $header = HTML::HeadParser->new();
  my $date = shift || croak "get_title requires arg of page name";

  my $content = get( $_ = "http://web.mit.edu/belg4mit/www/$date");

  unless( $content ) {
    warn("No content for: $_\n");
    return;
  }

  $header->parse($content);
  return $header->header('Title');
}
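To see what the link-matching pattern in the script picks up, it can be run against a small string instead of the live page. In this sketch the first line of sample markup is the one quoted in the script's comment; the other two lines are invented to show a second match and a link that gets skipped.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Sample markup: line 1 is from the script's comment, lines 2 and 3 are
# made-up test data (coordinates and the About link are assumptions).
my $html = <<'HTML';
<br><div><a href="10/" style="left: 375px; top: 255px;">10</a></div>
<br><div><a href="5/" style="left: 10px; top: 20px;">1..5</a></div>
<div><a href="about.html">About</a></div>
HTML

my @pages;

# Same pattern as the script: only hrefs of the form "digits/" qualify,
# and the link text is captured alongside the page directory.
while ( $html =~ m{<div> \s* <a \s href="(\d+/)" [^>]* > ([^<>]*) </a> \s* </div>}xisg ) {
    push @pages, { page => $1, link => $2 };
}

printf "%-4s %s\n", $_->{page}, $_->{link} for @pages;
```

Only the two day links are collected; the About link is ignored because its href is not a number followed by a slash.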