Sunday, February 1, 2009

MediaWiki Stubbing

I stubbed about 18,000 entries in our StimulusWatch.org wiki so that people would have a little framework to work within when adding data, which keeps the site relatively uniform in appearance while still being a wiki. This took far longer than I think it should have. I didn't actually time it, but we're talking on the order of 12-18 hours. I used the MediaWiki Perl module. The code is incredibly simple, since you get to do something like this:
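use MediaWiki;

my $c = MediaWiki->new;

# Log in as the bot account and point the client at the wiki.
$c->setup({
    'bot' => {
        'user' => 'accountabot',
        'pass' => 'PASSWORD',
    },
    'wiki' => {
        'host' => 'stimuluswatch.org',
        'path' => 'mediawiki',
    },
}) or die 'Error connecting to wiki';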

...

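# One wiki page per project row, named "State:Name:ProjectID".
while (($name, $state, $project_id, $cost, $jobs) = $sth->fetchrow_array()) {
    $page_name = "$state:$name:$project_id";

    # Stub body: the project-stub template plus empty sections to fill in.
    $page_content = "== General Description == \n{{project-stub}}\n\n== Points in Favor ==\n\n== Points Against ==\n";
    print "$page_name ";
    print "\n";

    # Keep retrying the write until the module reports success (returns 1).
    do {
        $rv = $c->text($page_name, $page_content);
        if ($rv != 1) {
            print "$rv\n";
            print "error: $c->{error}\n";
        }
    } while ($rv != 1);
}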
And you just let that rip. I don't know whether the module uses the MediaWiki API or screen scraping (I'm hoping for the former, but wouldn't be surprised by the latter). In any case, writes to MediaWiki through this interface are dismally slow. There must be a quicker way, but one night of processing was just under the threshold where I care, because it's done. If I do have to run this again, I'm going to need to find a better way.
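One thing worth tightening before any rerun is the retry loop above, which spins forever if a write never returns 1. Here is a minimal sketch of a capped retry, not what the script actually did. The names write_with_retry and $fake_write are hypothetical and not part of the MediaWiki module; the stand-in closure takes the place of the $c->text(...) call so the sketch runs without a wiki.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the MediaWiki module): call $write up to
# $max_tries times, pausing $delay seconds between failed attempts.
# Returns the attempt number that succeeded, or 0 if every attempt failed.
sub write_with_retry {
    my ($write, $max_tries, $delay) = @_;
    for my $attempt (1 .. $max_tries) {
        my $rv = $write->();
        # Same success test as the original loop: a return value of 1.
        return $attempt if defined $rv && $rv == 1;
        sleep $delay if $delay && $attempt < $max_tries;
    }
    return 0;
}

# Stand-in for $c->text($page_name, $page_content): fails twice, then succeeds.
my $calls = 0;
my $fake_write = sub { return ++$calls >= 3 ? 1 : 0 };

my $attempts = write_with_retry($fake_write, 5, 0);
print $attempts ? "ok after $attempts attempts\n" : "gave up\n";
```

In the real loop the closure would be something like sub { $c->text($page_name, $page_content) }, and a page that still fails after the cap could be logged and skipped rather than blocking the other 18,000.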
