Registration has closed


10:30am – 12:30pm Morning Session
Instructor: David Armstrong
Tutorial: Building UIs and evaluating measurement models in R
01:30pm – 04:10pm Afternoon Session
Instructors: Monojit Choudhury and Anshul Bawa
Tutorial: Analyzing human language data using Hindi film dialogues
04:10pm – 04:30pm Coffee/tea break
04:30pm – 06:30pm Evening Session
Instructor: Jule Krüger
Tutorial: Interpreting and visualizing data on violence

Morning Session

Measurement error is a pervasive problem across the social sciences. There are several ways to mitigate the negative effects that measurement error can have on our models. These range from unidimensional models (such as the summated rating model, Mokken scaling, and most of item response theory) and their multidimensional counterparts (such as factor analysis and multidimensional scaling) to newer methods such as over-imputation. In the tutorial, I will walk participants through both the theoretical concerns and how to implement and evaluate the models in R.

Afternoon Session

Design challenges in computational social research

Social science research has become increasingly driven by data. But with gigabytes of social media data being generated every day, it is often a challenge to isolate the signal from the noise. This 90-minute workshop will illustrate some of the common problems that arise when asking socially relevant questions using human-generated language data, be it social media, news, or interpersonal conversations. Using a dataset of dialogues from recent Hindi movies, we will demonstrate the process of extracting social and interpersonal insights about the movie characters. For example: which characters have the most influence over the others, how does social connectedness affect language usage, and how do characters use linguistic style to convey social signals? Through this workshop, you will gain insight into:

How to deal with some of the linguistic ambiguities inherent in all language data.
How to pick features or models that best suit your research question.
How to design your own study and ask the right questions.
How to quantify a sociolinguistic phenomenon and see how different quantifications lead to different outcomes.

Evening Session

Conflict-related sexual violence has only recently been recognized as an international security problem and is sometimes used as a weapon of war. Yet it is perhaps one of the most hidden forms of conflict violence, and therefore a hard-to-observe phenomenon for conflict scholars. In this workshop, we will discuss why a latent variable model (LVM) presents a promising avenue for advancing our understanding of sexual violence during armed conflict. Based on ongoing research with Ragnhild Nordås, Jule Krüger will guide participants through the estimation of a latent variable model of the perpetration of sexual violence by government troops, based on human rights reports by Amnesty International, Human Rights Watch, and the United States State Department. Participants will obtain an overview and sample code (R, Stan) of the estimation procedure, and will learn how to visualize country-year estimates and examine model fit.


Dave Armstrong
I am currently Canada Research Chair in Political Methodology in the Department of Political Science and, by courtesy, Statistics at Western University in London, Ontario. I teach courses on political methodology. Much of my substantive research focuses on the link between state repression and democratic institutions/behavior. Methodologically, my interests have taken me in a number of different directions on collaborative projects, though measurement and scaling seem to be common themes.

Anshul Bawa
Anshul Bawa is a research assistant with Microsoft Research India. He completed his Bachelor's and Master's degrees in Computer Science at IIT Delhi in 2017. He works in the areas of natural language processing and cognitive science, and is passionate about socially relevant computational research. He is also working on assistive technologies for children with autism.

Monojit Choudhury
Dr. Monojit Choudhury has been a researcher at Microsoft Research Lab India since 2007. His research spans many areas of artificial intelligence, cognitive science, and linguistics. In particular, Dr. Choudhury has been working on technologies for low-resource languages, code-switching (mixing of multiple languages in a single conversation), computational sociolinguistics, and conversational AI. He has more than 100 publications in international conferences and refereed journals. Dr. Choudhury is an adjunct faculty member at the International Institute of Information Technology Hyderabad, and has previously taught courses at IIT Kharagpur. He also organizes the Panini Linguistics Olympiad for high school children in India, and is the founding chair of the Asia-Pacific Linguistics Olympiad. Dr. Choudhury holds B.Tech and PhD degrees in Computer Science and Engineering from IIT Kharagpur.

Jule Krüger
Jule Krüger is a Program Manager for Big Data and Data Science Research in the Center for Political Studies at the Institute for Social Research at the University of Michigan. Jule is a conflict and violence scholar who contributes to the fields of Comparative Politics, International Relations, and Methodology. Curious about causes, dynamics, and intervention efforts, Jule is particularly interested in how data on political violence is generated. She studies how journalists and human rights defenders report and document abuses, in order to understand the effects this has on the validity and reliability of empirical analysis. She holds a doctoral degree (2014) from the Department of Government at the University of Essex, UK, and a Master of Arts (2008) in Political Sciences, Law and History from the University of Konstanz, Germany.