The math behind some of our stats

If you’re interested in the math behind some of our statistics (the ideology/leadership charts and the bill prognosis scores), you might like a talk I gave last week. I had the opportunity to kick off the application development track at the Law Via the Internet (LVI) 2012 conference at Cornell Law School with my presentation “Observing the Unobservables in the United States Congress” [slides | video].

The political reality we know today is entirely manufactured. Can Big Data help us cut through the spin to see what is really going on? Yes it can. This talk will present several statistical techniques used to quantify what is really going on in the U.S. Congress, including applying Google’s PageRank algorithm to Members of Congress, principal components analysis on bill sponsorship, and logistic regression on the success of bills.

The slides have Python code samples for computing the statistics.
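To give a flavor of the first technique, here is a minimal sketch of PageRank by power iteration. It is not the code from the slides. The member names and cosponsorship pairs are made up, and treating each cosponsorship as a "link" from the cosponsor to the bill's sponsor is an assumption made for illustration.

```python
def pagerank(edges, damping=0.85, iterations=100):
    """Power-iteration PageRank over a directed graph of (source, target) pairs."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}  # start from a uniform distribution

    for _ in range(iterations):
        # Every node receives an equal "teleport" share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}

        # Nodes with no outgoing links spread their rank evenly over all nodes.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        for node in nodes:
            new_rank[node] += damping * dangling / n

        # Each node splits its rank equally among the nodes it links to.
        for src in nodes:
            targets = out_links[src]
            for dst in targets:
                new_rank[dst] += damping * rank[src] / len(targets)

        rank = new_rank
    return rank


# Hypothetical data: (cosponsor, sponsor) pairs, one per cosponsored bill.
cosponsorships = [
    ("Rep. A", "Rep. C"),
    ("Rep. B", "Rep. C"),
    ("Rep. C", "Rep. D"),
    ("Rep. A", "Rep. D"),
    ("Rep. E", "Rep. C"),
]

scores = pagerank(cosponsorships)
for member, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{member}: {score:.3f}")
```

The scores sum to 1, and a member whose bills attract cosponsors ends up with a higher score than one who only cosponsors other people's bills. That is the intuition behind reading the result as a rough leadership measure.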

I previously blogged about leadership/ideology and bill prognosis.