Ongoing Projects

Online Public Opinion and Refugee Allocation

Using Twitter data, I study how Germans react when refugees are allocated to facilities in close geographic proximity to them. I find that German Twitter users show more interest in the topic before a nearby facility opens.

Improved Data Collection from Online Sources with Query Expansion and Active Learning

This is the first chapter of my dissertation. I propose the use of query expansion and active learning to improve data collection from large online databases such as Twitter.

Privacy Protection for Natural Language
with Alexander Ororbia and Joshua Snoke

We propose using character-level neural networks to generate synthetic text, allowing data to be released while protecting the privacy of the individuals who produced the text.

Published Projects

Text as Policy: Measuring Policy Similarity Through Bill Text Reuse
with Bruce Desmarais, Matthew Burgess, and Eugenia Giraudy

We propose the use of text-sequencing algorithms, applied to legislative text, to identify bills that introduce similar policy proposals.

Human Rights Text as Data
with Chris Fariss, Charles Crabtree, Zachary M. Jones, Megan Biek, Taranamoll Kaur, Ana Ross, and Michael Tsai

We introduce and make publicly available a large corpus of digitized primary source human rights documents.

Exploratory Data Analysis with Random Forest
Journal of Open Source Software
with Zachary M. Jones

An R package for using Random Forest in exploratory data analysis. Most of the software was written by Zach Jones. There is also a working paper that introduces a bit more of the theory for a social science audience.