
Screen Scraping Tools for SEC's EDGAR Database

Update: Continuing my theme of working backwards, and getting things done faster than if I had done them the "right way," here is a tool developed by Joshua Tauberer that exposes the entire SEC EDGAR database the way we need it and outputs RDF. Now it is a matter of linking to this output and creating a visualization. One more step!

Adam Perer, who created Social Action, an implementation of Prefuse, has gotten in touch, and we are talking about ways to get our RDF dataset from the SEC database into a workable format, GraphML or this format, for use with Social Action. I am still scratching my head over how to plug this dataset into the software: it has to be transmogrified into one of these formats before Social Action can read it. Thanks to the Internet Archive for hosting that (rather large) file.
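To make the "transmogrify" step a little more concrete, here is a minimal Python sketch of turning RDF-style triples into GraphML. It assumes the triples have already been read out of the dump by some RDF parser and arrive as plain (subject, predicate, object) strings. The function name, the sample identifiers, and the "owns" predicate are all made up for illustration, and I have not checked whether Social Action accepts exactly this flavor of GraphML.

```python
import xml.etree.ElementTree as ET

GRAPHML_NS = "http://graphml.graphdrawing.org/xmlns"


def triples_to_graphml(triples):
    """Return a GraphML string for (subject, predicate, object) triples.

    Every distinct subject or object becomes a node, and every triple
    becomes a directed edge labelled with its predicate.
    """
    triples = list(triples)
    root = ET.Element("graphml", {"xmlns": GRAPHML_NS})
    # Declare the data fields carried by nodes and edges.
    ET.SubElement(root, "key", {"id": "uri", "for": "node",
                                "attr.name": "uri", "attr.type": "string"})
    ET.SubElement(root, "key", {"id": "relation", "for": "edge",
                                "attr.name": "relation", "attr.type": "string"})
    graph = ET.SubElement(root, "graph", {"id": "G", "edgedefault": "directed"})

    # First pass: give each distinct resource a short node id.
    node_ids = {}
    for subject, _, obj in triples:
        for uri in (subject, obj):
            if uri not in node_ids:
                node_ids[uri] = f"n{len(node_ids)}"
                node = ET.SubElement(graph, "node", {"id": node_ids[uri]})
                ET.SubElement(node, "data", {"key": "uri"}).text = uri

    # Second pass: one edge per triple, pointing from subject to object.
    for index, (subject, predicate, obj) in enumerate(triples):
        edge = ET.SubElement(graph, "edge", {"id": f"e{index}",
                                             "source": node_ids[subject],
                                             "target": node_ids[obj]})
        ET.SubElement(edge, "data", {"key": "relation"}).text = predicate

    return ET.tostring(root, encoding="unicode")


# Made-up sample data: the identifiers and the "owns" predicate are placeholders.
sample = [
    ("person:alice", "owns", "company:acme"),
    ("person:bob", "owns", "company:acme"),
]
graphml = triples_to_graphml(sample)
```

Note that this treats every object as a node, so literal values (names, dates) in the real data would need to be filtered out or attached as node attributes instead.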

This should complement CorpWatch's corporate malfeasance wiki, www.crocodyl.org, rather well, so I am excited about developing one more tool for citizen journalists to use.

More technical details you might be interested in if you are contributing:

Scrapers for getting the SEC EDGAR info are on this page, contributed by Josh Tauberer, but we will need someone to help set up a program to run them regularly (maybe monthly) so we can keep our company information up to date. The scrapers download the SGML (iirc) corporate ownership filings (Forms 3, 4, and 5) and store them on disk. There's also a C# program that processes the files and turns them into RDF. The SEC assigns a unique identifier to each corporation, which makes this data easier to link up than most.

Here is the RDF dump of that information:

http://www.archive.org/download/govtrack_sec_dump/sec.n3
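As for running the scrapers regularly, a cron job is probably the simplest answer, but here is a rough Python sketch of the "has a month gone by?" logic in case someone wants to wrap the scrapers in a script. Everything here is hypothetical: the stamp file, the function names, and the `job` callable that stands in for whatever actually launches the scrapers and the RDF conversion.

```python
import tempfile
from datetime import datetime, timedelta
from pathlib import Path


def update_is_due(stamp_file, now, interval=timedelta(days=30)):
    """True if no run has been recorded or the last one is older than interval."""
    path = Path(stamp_file)
    if not path.exists():
        return True
    last_run = datetime.fromisoformat(path.read_text().strip())
    return now - last_run >= interval


def run_if_due(stamp_file, job, now, interval=timedelta(days=30)):
    """Call job() only when an update is due, then record the run time."""
    if not update_is_due(stamp_file, now, interval):
        return False
    job()  # placeholder for launching the scrapers and the RDF conversion
    Path(stamp_file).write_text(now.isoformat())
    return True


# Demo with a throwaway stamp file and a fake job that only records calls.
with tempfile.TemporaryDirectory() as tmp:
    stamp = Path(tmp) / "last_run.txt"
    calls = []
    start = datetime(2024, 1, 1)
    first = run_if_due(stamp, lambda: calls.append("run"), start)
    second = run_if_due(stamp, lambda: calls.append("run"), start + timedelta(days=10))
    third = run_if_due(stamp, lambda: calls.append("run"), start + timedelta(days=31))
```

In the demo the job runs on the first call, is skipped ten days later, and runs again once more than thirty days have passed.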

 

Of course, there is also resource.org, which is a great, well, resource. It is described in technical detail on Ckan.net and contains older information from the same database. That might be moot, though, because the point of this mashup is to expose the current power relationships among corporations, not the historic ones. It should be a live mashup, showing who owns whom today.

Call to action: If you can help with all or part of this project, contact:
