This is the first chapter of my dissertation. I propose the use of query expansion and active learning to improve data collection from large online databases such as Twitter.
We propose the use of text-sequencing algorithms, applied to legislative text, to identify bills that introduce similar policy proposals.
We introduce and make publicly available a large corpus of digitized primary-source human rights documents.
An R package for exploratory data analysis with random forests. Most of the software was written by Zach Jones. A working paper introduces more of the underlying theory for a social science audience.