University of Colorado at Boulder
Summer Institute in Computational Social Science Partner Site

August 13, 2018 - August 17, 2018

Sponsored by The Russell Sage Foundation & The Alfred P. Sloan Foundation

From the morning of Monday, August 13 to the evening of Friday, August 17, 2018, the University of Colorado Boulder will host a satellite of the Summer Institute in Computational Social Science. The purpose of the Summer Institute is to bring together graduate students, postdoctoral researchers, and beginning faculty interested in computational social science. The Summer Institute is for both social scientists (broadly conceived) and data scientists (broadly conceived). This satellite program is co-organized by Brian Keegan and Allie Morgan.

The instructional program will involve lectures, group problem sets, and participant-led research projects. There will also be outside speakers who conduct computational social science research in academia, industry, and government. Topics covered include text as data, website scraping, digital field experiments, non-probability sampling, mass collaboration, and ethics. There will be ample opportunities for students to discuss their ideas and research with the organizers, other participants, and visiting speakers. Because we are committed to open and reproducible research, all materials created by faculty and students for the Summer Institute will be released open source.

Participation will be open to all students, faculty, and staff of universities in Colorado. We are supported by the Russell Sage Foundation, the Sloan Foundation, the Center to Advance Research and Training in the Social Sciences (CARTSS), the Institute of Behavioral Science (IBS), and the Center for Research Data and Digital Scholarship (CRDDS).

How to Apply

If you are interested in attending SICSS Boulder 2018, please complete this application. Space for this inaugural summer institute will be limited to approximately 30 people. Meals and beverages during the institute will be provided, but we unfortunately cannot provide funding for travel or accommodation. Accepted participants should already have or be willing to develop basic proficiency with computational reasoning/programming skills in Python and/or R before attending the program. We very strongly encourage applications from people traditionally under-represented in computing.

Kindly send any inquiries to the organizers. Applications will be due by June 1st.


Brian Keegan

Brian C. Keegan is a computational social scientist whose research is at the intersection of human-computer interaction, network science and data science. His research explores the structure and dynamics of large-scale online communication and collaboration using socio-technical system log data. Brian is developing new methods, theories and tools to help people make better sense of bursts of information and design better responses to them. Before joining CU-Boulder as an Assistant Professor, Keegan was a research associate at the Harvard Business School’s HBX online learning platform and a postdoctoral researcher in computational social science at Northeastern University. He received his PhD in media, technology and society from Northwestern University’s School of Communication. He also earned SB degrees in Mechanical Engineering and Science, Technology and Society from the Massachusetts Institute of Technology.

Allison Morgan

Allison Morgan is pursuing her Ph.D. in computer science at the University of Colorado, Boulder. She is interested in using data mining, machine learning, and social network analysis to develop and test hypotheses about the origins and effects of gender imbalance within academia. She attended last year’s SICSS and is excited to build a computational social science community at CU Boulder via this satellite. She was recently awarded the National Science Foundation’s Graduate Research Fellowship. Prior to graduate school, Allison worked as a data scientist for two years at a small tech start-up in Portland, OR. She earned her B.A. in physics from Reed College.

Local Speakers

Aaron Clauset

Aaron Clauset is an Assistant Professor in the Department of Computer Science and the BioFrontiers Institute at the University of Colorado Boulder, and is External Faculty at the Santa Fe Institute. He received a PhD in Computer Science, with distinction, from the University of New Mexico and a BS in Physics, with honors, from Haverford College, and he was an Omidyar Fellow at the prestigious Santa Fe Institute. In 2016, he was awarded the Erdős-Rényi Prize in Network Science. Clauset is an internationally recognized expert on network science, computational social science, and machine learning for complex systems. His work has appeared in many prestigious scientific venues, including Nature, Science, PNAS, JACM, WWW, ICWSM, STOC, SIAM Review, and Physical Review Letters. His work has also been covered in the popular press by the Wall Street Journal, The Economist, Discover Magazine, New Scientist, Wired, Miller-McCune, the Boston Globe, and The Guardian.

Daniel Larremore

Daniel Larremore is an Assistant Professor in the Department of Computer Science and the BioFrontiers Institute at the University of Colorado at Boulder. His research develops statistical and inferential methods for analyzing large-scale network data, and uses those methods to solve applied problems in diverse domains, including public health and academic labor markets. In particular, his work focuses on generative models for networks, the ongoing evolution of the malaria parasite, and the origins of social inequalities in academic hiring and careers. Prior to joining the University of Colorado faculty, he was an Omidyar Fellow at the Santa Fe Institute from 2015 to 2017 and a post-doctoral fellow at the Harvard T.H. Chan School of Public Health from 2012 to 2015. He obtained his Ph.D. in Applied Mathematics from the University of Colorado at Boulder in 2012, and holds an undergraduate degree from Washington University in St. Louis.

Yotam Shmargad

Yotam Shmargad is a computational social scientist with an interest in political networks and privacy. In his research, he runs experiments, links and analyzes large datasets, and uses natural experiments to study how digital media augment the patterns of connectivity between people (the size, density, and diversity of our social networks) and the implications that these bigger networks have for our social and political lives. Shmargad’s recent projects look at how political candidates can overcome financial shortcomings with Twitter, and how the partisan composition of one’s social network influences the information they choose to share online. Before joining the University of Arizona as an Assistant Professor, Shmargad received his PhD in Marketing from Northwestern University’s Kellogg School of Management. He holds an MS in Operations Management from Columbia University and a BS in Mathematics from UCLA.

Amanda Stevenson

Amanda Jean Stevenson is a sociologist trained in demographic and computer science methods. She studies the impacts of and responses to abortion and family planning policy. She is an Assistant Professor of Sociology at the University of Colorado Boulder. In her current research, she uses demographic methods to study the impacts of reproductive health policies, and computational and qualitative methods to study social responses to these policies. At Boulder, she leads a team using massive administrative data at the Census Bureau to evaluate the life course consequences of access to (as opposed to use of) highly effective contraception. She also contributes to a variety of ongoing evaluations of reproductive health policies and develops new strategies for measuring fertility with administrative data. Another line of research examines the social responses to reproductive health policies. In a current project, she uses Twitter responses, website content, media coverage, and in-depth interviews to examine the social movement response to Texas’ 2013 abortion restrictions. The case provides an opportunity to investigate how social movements negotiate intersectional critiques from within their ranks.

Chenhao Tan

Chenhao Tan is an assistant professor of computer science at the University of Colorado Boulder. He obtained his PhD in the Department of Computer Science at Cornell University and bachelor’s degrees in computer science and in economics from Tsinghua University. Prior to joining CU Boulder, he spent a year at the University of Washington as a postdoc. His research interests include natural language processing and computational social science. He has published papers primarily at ACL and WWW, and also at KDD, WSDM, ICWSM, and other venues. His work has been covered by many news media outlets, such as the New York Times and the Washington Post. He also won a Facebook fellowship and a Yahoo! Key Scientific Challenges award.

More speakers coming soon!

Schedule and materials

Monday, August 13, 2018 - Introduction and Ethics

  • 9:00-9:15 Logistics

  • 9:15-9:30 Introduction to computational social science

  • 9:30-9:45 Why SICSS?

  • 9:45-10:00 Introductions

  • 10:00-10:45 Ethics: Principles-based approach

  • 10:45-11:00 Coffee Break

  • 11:00-12:00 Four areas of difficulty: informed consent, informational risk, privacy, and making decisions in the face of uncertainty

  • 12:00-1:00 Lunch

  • 1:00-4:00 Group Exercise

  • 4:00-5:30 Guest Speaker

Tuesday, August 14, 2018 - Collecting Digital Trace Data / Mass Collaboration

  • 9:00-9:15 What is digital trace data?

  • 9:15-9:30 Strengths and weaknesses of digital trace data

  • 9:30-10:00 Screen-Scraping

  • 10:00-10:15 Break

  • 10:15-11:00 Application Programming Interfaces

  • 11:00-12:00 Apps for Social Science Research

  • 12:00-1:00 Lunch

  • 1:00-1:30 Mass collaboration

  • 1:30-1:40 Human computation

  • 1:40-1:50 Open call

  • 1:50-2:00 Distributed data collection

  • 2:00-4:00 Group Exercise

  • 4:00-5:30 Guest Speaker

Wednesday, August 15, 2018 - Network Analysis

  • TBD

Thursday, August 16, 2018 - Automated Text Analysis

  • 9:00-9:15 History of quantitative text analysis

  • 9:15-9:30 Strengths and weaknesses of quantitative text analysis

  • 9:30-9:45 Basic Text Analysis/GREP

  • 9:45-10:00 Dictionary-Based Text Analysis

  • 10:00-10:15 Break

  • 10:15-11:00 Topic models and Beyond

  • 11:00-12:00 Ngram Networks

  • 12:00-1:00 Lunch

  • 1:00-4:00 Group Exercise

  • 4:00-5:30 Guest Speaker

Friday, August 17, 2018 - Experiments / Causal Inference

  • 9:00-9:15 Welcome and schedule

  • 9:15-9:45 What, why, and which experiments?

  • 9:45-10:15 Moving beyond simple experiments

  • 10:15-10:30 Coffee break

  • 10:30-11:15 Four strategies for experiments

  • 11:15-11:45 Zero variable cost data and MusicLab

  • 11:45-12:00 3 Rs

  • 12:00-1:00 Lunch

  • TBD