Want to host a partner location of SICSS 2020 at your university, company, NGO, or governmental organization? Learn more

Want to learn or teach computational social science? We’ve open-sourced all our teaching and learning materials.

June 14 to June 26, 2020 | Duke University

Sponsored by the Russell Sage Foundation


From the evening of Sunday, June 14 to the morning of Saturday, June 27, 2020, the Russell Sage Foundation will sponsor the Summer Institute in Computational Social Science, to be held at Duke University. The purpose of the Summer Institute is to bring together graduate students, postdoctoral researchers, and beginning faculty interested in computational social science. The Summer Institute is for both social scientists (broadly conceived) and data scientists (broadly conceived). The co-organizers and principal faculty of the Summer Institute are Christopher Bail and Matthew Salganik. In addition to the event at Duke, there will also be a number of partner locations run by alumni of the 2017, 2018, and 2019 Summer Institutes.

The instructional program will involve lectures, group problem sets, and participant-led research projects. There will also be outside speakers who conduct computational social science research in a variety of settings, such as academia, industry, and government. Topics covered include text as data, website scraping, digital field experiments, non-probability sampling, mass collaboration, and ethics. There will be ample opportunities for students to discuss their ideas and research with the organizers, other participants, and visiting speakers. Because we are committed to open and reproducible research, all materials created by faculty and students for the Summer Institute will be released open source.

Participation is restricted to Ph.D. students, postdoctoral researchers, and untenured faculty within 7 years of their Ph.D. Most participant costs during the workshop, including housing and most meals, will be covered, and most travel expenses will be reimbursed up to a set cap. We welcome applicants from all backgrounds and fields of study, especially applicants from groups currently under-represented in computational social science. About thirty participants will be invited, and participants are expected to fully attend and participate in the entire two-week program.

Application materials are due Tuesday, February 25, 2020.

Faculty

Chris Bail

Chris Bail is Professor of Sociology and Public Policy at Duke University, where he directs the Polarization Lab. He is also affiliated with the Interdisciplinary Data Science Program, the Duke Network Analysis Center, and the Duke Population Research Institute. His research examines political polarization, culture, and social psychology using tools from the field of computational social science. He is the author of Terrified: How Anti-Muslim Fringe Organizations Became Mainstream.

Matthew Salganik

Matthew Salganik is Professor of Sociology at Princeton University, where he is also affiliated with several of Princeton’s interdisciplinary research centers, including the Office of Population Research, the Center for Health and Wellbeing, the Center for Information Technology Policy, and the Center for Statistics and Machine Learning. His research interests include social networks and computational social science. He is the author of Bit by Bit: Social Research in the Digital Age.

Speakers

Teaching Assistants

Participants

Pre-arrival

As we discussed in our call for applications, we have arranged two types of training prior to the event this summer, because participants arrive with different backgrounds: some have more sophisticated coding skills but little exposure to social science, while others have significant exposure to social science but lack coding skills.

Coding

The majority of the coding work presented at the 2020 SICSS will use R. However, you are welcome to use a language of your choice, such as Python, Julia, or another language commonly used by computational social scientists. If you would like to work in R, we recommend that you complete the free RStudio Primers, which can be supplemented by the open access book R for Data Science by Garrett Grolemund and Hadley Wickham. The RStudio Primers cover six topics: The Basics, Working with Data, Visualize Data, Tidy Your Data, Iterate, and Write Functions. If you already feel comfortable with these topics (either in R or some other language), then you do not need to complete these Primers.

If you would like more practice after completing the RStudio Primers, some other materials that we can recommend are:

Reading List

The Summer Institute will bring together people from many fields, and therefore we think that asking you to do some reading before you arrive will help us use our time together more effectively. First, we ask you to read Matt’s book, Bit by Bit: Social Research in the Digital Age (Read online or purchase from Amazon, Barnes & Noble, IndieBound, or Princeton University Press), which is a broad introduction to computational social science. Parts of this book will be review for most of you, but if we all read this book ahead of time, then we can use our time together for more advanced topics.

Also, for students with little or no exposure to sociology, economics, or political science, we have assembled a collection of exemplary papers in the core areas addressed by the Russell Sage Foundation. Neither your work nor the work we develop together at the institute need map neatly onto these categories, but if those with less exposure to social science read these, we will increase the chances of interdisciplinary cross-pollination, which we view as critical to the future of computational social science.

Future of Work

Behavioral Economics

Race, Ethnicity, and Immigration

Social Inequality

Schedule and materials

Sunday June 14, 2020

  • 6:00 - 8:00 Opening Dinner (Not open to public/No livestream)

Monday June 15, 2020 - Introduction and Ethics

  • 9:00 - 9:15 Logistics (Not open to public/No livestream)

  • 9:15 - 9:30 Introductions (Not open to public/No livestream)

  • 9:30 - 10:00 Introduction to computational social science

  • 10:00 - 10:30 Why SICSS?

  • 10:30 - 10:45 Coffee Break

  • 10:45 - 11:30 Ethics: Principles-based approach

  • 11:30 - 12:15 Four areas of difficulty: informed consent, informational risk, privacy, and making decisions in the face of uncertainty

  • 12:15 - 12:30 Introduction to the group exercise

  • 12:30 - 1:30 Lunch (Not open to public/No livestream)

  • 1:30 - 3:30 Group exercise (Not open to public/No livestream)

  • 3:30 - 3:45 Discuss group exercise (Not open to public/No livestream)

  • 3:45 - 4:00 Break

  • 4:00 - 5:30 Possible guest speaker

  • 6:00 - 7:30 Dinner & Discussion (Not open to public/No livestream)

Tuesday June 16, 2020 - Collecting Digital Trace Data

  • 9:00 - 9:15 Logistics (Not open to public/No livestream)

  • 9:15 - 9:30 What is digital trace data?

  • 9:30 - 9:45 Strengths and weaknesses of digital trace data

  • 9:45 - 10:15 Screen-Scraping

  • 10:15 - 10:30 Coffee Break

  • 10:30 - 11:00 Application Programming Interfaces

  • 11:00 - 12:30 Building Apps and Bots for Social Science Research

  • 12:30 - 1:30 Lunch (Not open to public/No livestream)

  • 1:30 - 3:45 Group exercise

  • 3:45 - 4:00 Break

  • 4:00 - 5:30 Possible guest speaker

  • 6:00 - 7:30 Dinner & Discussion (Not open to public/No livestream)

Wednesday June 17, 2020 - Automated Text Analysis

  • 9:00 - 9:15 Logistics (Not open to public/No livestream)

  • 9:15 - 9:30 History of quantitative text analysis

  • 9:30 - 9:45 Basic Text Analysis/GREP

  • 10:15 - 11:15 Topic models/Structural Topic Models

  • 11:15 - 11:30 Break

  • 11:30 - 12:00 Dictionary-Based Text Analysis

  • 12:00 - 12:30 Text Networks

  • 12:30 - 1:30 Lunch (Not open to public/No livestream)

  • 1:30 - 3:45 Group exercise

  • 3:45 - 4:00 Break

  • 4:00 - 5:30 Possible guest speaker

  • 6:00 - 7:30 Dinner & Discussion (Not open to public/No livestream)

Thursday June 18, 2020 - Surveys in the Digital Age

  • 9:00 - 9:15 Logistics (Not open to public/No livestream)

  • 9:15 - 9:35 Survey research in the digital age

  • 9:35 - 9:55 Probability and non-probability sampling

  • 9:55 - 10:15 Computer-administered interviews and wiki surveys

  • 10:15 - 10:35 Combining surveys and big data

  • 10:35 - 10:45 Coffee break

  • 10:45 - 11:15 Group exercise introduction

  • 11:15 - 12:30 Begin group exercise (Not open to public/No livestream)

  • 12:30 - 1:30 Lunch

  • 1:30 - 3:15 Continue group exercise (Not open to public/No livestream)

  • 3:15 - 3:45 Discuss activity and open-source data (Not open to public/No livestream)

  • 3:45 - 4:00 Break

  • 4:00 - 5:30 Possible guest speaker

  • 6:00 - 7:30 Dinner & Discussion (Not open to public/No livestream)

Friday June 19, 2020 - Mass Collaboration

  • 9:00 - 9:15 Logistics (Not open to public/No livestream)

  • 9:15 - 9:45 Mass collaboration

  • 9:45 - 10:15 The Fragile Families Challenge

  • 10:15 - 10:30 Coffee break

  • 10:30 - 11:30 Participating in the Fragile Families Challenge Activity

  • 11:30 - 12:30 Working on the Fragile Families Challenge (Not open to public/No livestream)

  • 12:30 - 1:30 Lunch

  • 1:30 - 3:30 Fragile Families Challenge (Not open to public/No livestream)

  • 3:30 - 3:45 Discussion of the Fragile Families Challenge (Not open to public/No livestream)

  • 3:45 - 4:00 Break

  • 4:00 - 5:30 Possible guest speaker

  • 6:00 - 7:30 Dinner & Discussion (Not open to public/No livestream)

Saturday June 20, 2020 - Experiments

  • 9:00 - 9:15 Logistics (Not open to public/No livestream)

  • 9:15 - 9:30 What, why, and which experiments?

  • 9:30 - 9:45 Moving beyond simple experiments

  • 9:45 - 10:15 Four strategies for making experiments happen

  • 10:15 - 10:30 Coffee break

  • 10:30 - 11:00 Zero variable cost data and MusicLab

  • 11:00 - 11:15 Break

  • 11:15 - 12:15 Possible guest lecture on experiments

  • 12:15 - 12:30 Logistics (Not open to public/No livestream)

  • 12:30 - 1:30 Lunch (Not open to public/No livestream)

  • Afternoon off

Sunday June 21, 2020 - Day off

Monday June 22, 2020 - Work on group projects

  • 9:00 - 9:15 Logistics (Not open to public/No livestream)

  • 9:15 - 12:30 Research Speed Dating (Not open to public/No livestream)

  • 4:00 - 5:30 Possible guest speaker

  • 6:00 - 7:30 Dinner & Discussion (Not open to public/No livestream)

Tuesday June 23, 2020 - Work on group projects

  • 12:30 - 2:00 Participant flash talks (Not open to public/No livestream)

  • 4:00 - 5:30 Possible guest speaker

  • 6:00 - 7:30 Dinner & Discussion (Not open to public/No livestream)

Wednesday June 24, 2020 - Work on group projects

  • 12:30 - 2:00 Participant flash talks (Not open to public/No livestream)

  • 4:00 - 5:30 Possible guest speaker

  • 6:00 - 7:30 Dinner & Discussion (Not open to public/No livestream)

Thursday June 25, 2020 - Work on group projects

  • 12:30 - 2:00 Participant flash talks (Not open to public/No livestream)

  • 6:00 - 7:30 Dinner & Discussion (Not open to public/No livestream)

Friday June 26, 2020 - Present group projects

  • 12:30 - 5:30 Final project presentation and discussion (Not open to public/No livestream)

  • 5:30 - 5:45 Evaluation (Not open to public/No livestream)

  • 5:45 - 6:00 Break

  • 6:00 Closing dinner and Group Picture (Not open to public/No livestream)

Saturday June 27, 2020

  • Participants depart