Friday, January 30, 2009

StimulusWatch.org

Stimulus efforts such as the one proposed by the Obama administration almost always include far too much pork due to the sheer volume of requests. We decided to take at least one part of it and break it down into an easy-to-visualize -- and easy-to-vote-on -- format. Help us vet the shovel-ready projects.

I'd like to point out that as of right now, our data is derived only from the US Conference of Mayors' report. The mayors identified projects that they believe would stimulate the economy and create jobs. I think it would be great to expand the site to include even more potential stimulus projects, but there are currently no plans to do that.

6 comments:

inkdroid.org said...

Just curious: how easy did you find it to scrape the data out of the PDF? Or did you find it elsewhere?

Kevin said...

I saw the PDF first and was definitely intrigued by the prospect of scraping PDF data, but then noticed that you could get all of the projects in HTML form if you could generate the right URLs. I used lxml to parse the HTML and extract the data. So all in all, a pretty straightforward scrape. The code I used is on my code page.
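For readers curious what that kind of lxml scrape looks like, here is a minimal sketch. It is not the code from the code page: the sample markup, the `projects` table id, the column order, and the `parse_projects` name are all assumptions made up for illustration, and the real report pages were almost certainly laid out differently.

```python
from lxml import html

# Hypothetical page markup, invented for this sketch. The real pages
# would be fetched from generated URLs and would look different.
SAMPLE_PAGE = """
<html><body>
<table id="projects">
  <tr><th>City</th><th>Project</th><th>Cost</th><th>Jobs</th></tr>
  <tr><td>Springfield</td><td>Repave Main Street</td><td>$1,200,000</td><td>15</td></tr>
  <tr><td>Shelbyville</td><td>Replace water main</td><td>$850,000</td><td>9</td></tr>
</table>
</body></html>
"""


def parse_projects(page_source):
    """Return one dict per data row of the (assumed) projects table."""
    tree = html.fromstring(page_source)
    projects = []
    # tr[td] keeps only rows that contain data cells, skipping the header row.
    for row in tree.xpath('//table[@id="projects"]//tr[td]'):
        cells = [cell.text_content().strip() for cell in row.xpath("./td")]
        city, description, cost, jobs = cells
        projects.append({
            "city": city,
            "description": description,
            # Strip the currency formatting so the cost can be stored as a number.
            "cost": int(cost.replace("$", "").replace(",", "")),
            "jobs": int(jobs),
        })
    return projects


if __name__ == "__main__":
    for project in parse_projects(SAMPLE_PAGE):
        print(project)
```

In a real run, the page source would come from fetching each generated URL rather than from a string constant, and the rows would be written to a database instead of printed.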

inkdroid.org said...

Nice little bit of Python, that. It doesn't look like it took you too long.

It would be nice if these gov't sites made their data easy to download without scraping, wouldn't it? You might want to consider having StimulusWatch distribute the data in machine-readable form for downstream apps to use.

Kevin said...

Yeah, it probably took only an hour to write and shake out the bugs. Python is power! :-)

I totally agree about making the data easy to reuse and remix. To that end, you can download the original sqlite database and an export of our database in Excel format.
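For downstream use, a SQLite file like this can be explored without knowing its schema in advance. The sketch below uses only Python's standard library. The helper names `list_tables` and `export_table` are made up for illustration, and nothing here assumes what tables the StimulusWatch database actually contains.

```python
import csv
import sqlite3


def list_tables(conn):
    """Return the names of all tables in the database, sorted by name."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in rows]


def export_table(conn, table, csv_path):
    """Write every row of one table to a CSV file, with a header row."""
    # Table names cannot be bound as SQL parameters, so quote the identifier.
    quoted = '"{}"'.format(table.replace('"', '""'))
    cursor = conn.execute("SELECT * FROM {}".format(quoted))
    with open(csv_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        # cursor.description holds one tuple per column; item 0 is the name.
        writer.writerow([column[0] for column in cursor.description])
        writer.writerows(cursor)
```

Usage would look something like `conn = sqlite3.connect("stimulus.db")`, where the filename is a placeholder for wherever the download was saved, followed by a loop over `list_tables(conn)` that calls `export_table` for each name.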

inkdroid.org said...

Wow, very nice. Might be nice to advertise it on the front page of stimuluswatch.org? Or from the FAQ?

Kevin said...

I actually moved the Excel file over to the sites.pheared.net page and there is a link in the FAQ on StimulusWatch.