GSOC 2009
From Sunlabs wiki
This page is essentially a fork of the Project Ideas page, tailored a little more to the 2009 Google Summer of Code. Feel free to contribute ideas here if you are hoping to participate in the Summer of Code; otherwise, ideas probably belong on Project Ideas.
Potential Student Projects
Fifty State Project
The Fifty State Project is our ambitious effort to develop scrapers and parsers for all fifty states' legislative pages. A number of individuals are already contributing basic parsers for their states. Two possible opportunities exist for students:
- Take a chunk of states and lead development on their parsers. About 40 states still have no work at all, and a single state isn't much work. The barrier to entry here is very low: a student with scraping experience and dedication to this project should be able to make substantial progress toward our goal by covering a large number of states.
- Help design and develop the backend. At the moment all of the data is being parsed into CSV files or databases; the eventual goal is for all of the data to reside in one place. This is a more advanced project and arguably more ambitious than just scraping the data. A student who chose this project would work very closely with a member of the Sunlight Labs staff.
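As a rough illustration of the first opportunity, a minimal state parser might look like the sketch below. It assumes a hypothetical bill listing published as a plain HTML table with a bill identifier and a title in each row; real state sites vary widely, and the column names (`state`, `bill_id`, `title`) and the `XX` state code are placeholders rather than an agreed project schema.

```python
import csv
import io
from html.parser import HTMLParser


class BillTableParser(HTMLParser):
    """Collects the text of each <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # header rows made of <th> cells stay empty and are skipped
                self.rows.append(self._row)
            self._row = None


def bills_to_csv(html, state):
    """Turn one (hypothetical) bill listing page into CSV text."""
    parser = BillTableParser()
    parser.feed(html)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["state", "bill_id", "title"])
    for row in parser.rows:
        if len(row) >= 2:
            writer.writerow([state, row[0], row[1]])
    return out.getvalue()


# Made-up sample page; "XX" is a placeholder state code.
SAMPLE_HTML = """
<table>
  <tr><th>Bill</th><th>Title</th></tr>
  <tr><td>HB 1</td><td>An act concerning open records</td></tr>
  <tr><td>SB 12</td><td>An act concerning campaign finance</td></tr>
</table>
"""

csv_text = bills_to_csv(SAMPLE_HTML, "XX")
```

Emitting the same columns for every state is one way the per-state CSV output could later be loaded into a single backend, which is the goal of the second opportunity.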
Details
Enhancements to existing Sunlight Projects
New Project Proposals
We need to use the Internet to make communicating with Congress easy and effective, whether you're an activist, lobbyist, advocacy group, or just John Q. Public. The service should aggregate ideas, allow people to publicly send messages to Congress, and allow the public to read those messages and see what their neighbors are telling their representatives. Get Represented takes a look at the GetSatisfaction.com model and tries to apply it to Congress.
Build an open, IMDB-like service for all 300,000+ elected officials in the country, with a biography for each of them in the style of IMDB. Allow for community participation, submission, and vetting.
Build blog plugins for WordPress, MovableType, etc. that allow bloggers to pull data about lawmakers into blog posts (for instance, widgets from sunlightmediaservices.com). The plugins could also pull data from other sources like Watchdog.net, OpenSecrets, etc.
Create a bookmarklet for a web browser that scrapes the page the user is on, looks for members of Congress on that page, and then provides information about those members and the relationships between them and the other subjects in the article. (It could use something like Open Calais.)
Usable with or without TransparencyCorps, this would be a simple site where a user could create a "task" for others (in the corps) to accomplish. A task could be applied to every member of Congress or to a subset (chamber, committee, etc.). Example: "What was Senator (insert member name)'s net worth in 2007?" This would create 100 discrete tasks, one per Senator, instead of requiring the user to enter each Senator's name individually. It would be useful for acquiring data that is locked in PDFs, etc., and could also be used to work on name standardization.
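The task expansion described here could be sketched as follows. The `Member` record, the `{name}` placeholder syntax, and the sample names are all assumptions made for illustration; a real site would presumably draw its member list from an existing data source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    name: str
    chamber: str  # "senate" or "house" (hypothetical labels)


def expand_tasks(template, members, chamber=None):
    """Create one task per member, optionally limited to one chamber."""
    selected = [m for m in members if chamber is None or m.chamber == chamber]
    return [template.format(name=m.name) for m in selected]


# Made-up members, used only to show the expansion.
MEMBERS = [
    Member("Jane Example", "senate"),
    Member("John Sample", "senate"),
    Member("Pat Placeholder", "house"),
]

tasks = expand_tasks(
    "What was Senator {name}'s net worth in 2007?",
    MEMBERS,
    chamber="senate",
)
```

With the full Senate loaded, the same call would yield the 100 discrete tasks mentioned above.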
A simple API call that checks a given string against the Sunlight Labs API Lobbyist Namespace and tells you whether or not it is the name of a registered federal lobbyist.
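A minimal sketch of the matching logic behind such a call is shown below. It does not talk to the Sunlight Labs API; the in-memory `REGISTRY` set, the sample names, and the normalization rules are assumptions standing in for whatever the lobbyist data actually provides.

```python
def normalize(name):
    """Lowercase, drop punctuation, and collapse runs of whitespace."""
    cleaned = "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace())
    return " ".join(cleaned.split())


def is_registered_lobbyist(name, registry):
    """Return True if the normalized name appears in the registry."""
    return normalize(name) in registry


# Made-up registry entries; real data would come from lobbying disclosures.
REGISTRY = {normalize(n) for n in ["Jane Q. Example", "John Sample Jr."]}

found = is_registered_lobbyist("  JANE Q EXAMPLE ", REGISTRY)
missing = is_registered_lobbyist("Pat Placeholder", REGISTRY)
```

Exact matching after normalization will miss name variants (middle initials, suffixes, married names), so a real service would likely need fuzzier comparison.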
ReCaptcha for Federal Form Data
Lots of federal data (for instance, FARAdb) comes in handwritten or otherwise non-digitizable formats. Blech! We could create a project like ReCaptcha that serves as a human validation test service while digitizing this data.
Set up a centralized bug tracker for the Federal Government to track "known issues" with technology problems (e.g. the clerk's office's buggy XML feed, erroneous FEC contribution data, etc.) so that developers in our community don't have to discover the problems over and over again.
Really simple: upload a spreadsheet that contains the names of members of Congress (along with whatever other information is in there). Allow the user to select which column holds those names, parse the spreadsheet, and return it with the relevant information from the Sunlight Labs API appended.
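One way to sketch this flow is shown below, assuming the spreadsheet is uploaded as CSV. The `MEMBER_INFO` dictionary is a made-up stand-in for data that would come from the Sunlight Labs API, and the appended column names (`member_party`, `member_state`) are hypothetical.

```python
import csv
import io

# Made-up lookup table keyed by lowercased name; a stand-in for API data.
MEMBER_INFO = {
    "jane example": {"member_party": "I", "member_state": "XX"},
}

EXTRA_COLUMNS = ["member_party", "member_state"]


def append_member_info(csv_text, name_column, lookup):
    """Return the sheet with extra columns filled in for recognized names."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=list(reader.fieldnames) + EXTRA_COLUMNS,
        lineterminator="\n",
    )
    writer.writeheader()
    for row in reader:
        info = lookup.get(row[name_column].strip().lower(), {})
        for column in EXTRA_COLUMNS:
            # Unrecognized names keep their row but get blank extra cells.
            row[column] = info.get(column, "")
        writer.writerow(row)
    return out.getvalue()


SHEET = "Name,Amount\nJane Example,100\nUnknown Person,5\n"
result = append_member_info(SHEET, "Name", MEMBER_INFO)
```

Rows whose names are not recognized pass through unchanged apart from the blank appended cells, so the user gets back every row they uploaded.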
A read/write API containing nicknames matched to known entities with some intelligent search associated with it. The API would take nicknames for people, corporations, and other nouns.
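A minimal in-memory sketch of the read/write and search sides of such an API follows. The `NicknameIndex` class and its method names are hypothetical, and the fuzzy matching from Python's standard `difflib` module is only one possible reading of "intelligent search".

```python
import difflib


class NicknameIndex:
    """Maps nicknames to the known entities they may refer to."""

    def __init__(self):
        self._by_nickname = {}

    @staticmethod
    def _key(text):
        # Case-insensitive, whitespace-insensitive lookup key.
        return " ".join(text.lower().split())

    def add(self, nickname, entity):
        """Write side: record that a nickname refers to an entity."""
        self._by_nickname.setdefault(self._key(nickname), set()).add(entity)

    def lookup(self, nickname):
        """Read side: exact match on the normalized nickname."""
        return sorted(self._by_nickname.get(self._key(nickname), set()))

    def search(self, query, cutoff=0.8):
        """Fuzzy match that tolerates small typos in the query."""
        keys = difflib.get_close_matches(
            self._key(query), list(self._by_nickname), n=3, cutoff=cutoff
        )
        entities = set()
        for key in keys:
            entities.update(self._by_nickname[key])
        return sorted(entities)


index = NicknameIndex()
index.add("Big Blue", "IBM")
exact = index.lookup("big  blue")
fuzzy = index.search("Big Bleu")
```

Storing a set per nickname allows one nickname to point at several entities, which seems likely once people, corporations, and other nouns share the same namespace.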
People's Agenda and Ping the President
Create an open-source, people-driven web utility to propose, discuss, and come to consensus on the agenda our government should put into action in our name. Replace the ineffective and opaque contact form on whitehouse.gov with a utility where people are guided to sign on to existing proposals, communications, and requests, add their own comments if needed, and so unite to lobby the President more effectively and transparently, and where the President can actually reply to their input.
