From the evening of Sunday, June 18 to the morning of Saturday, July 1, 2017, the Russell Sage Foundation will sponsor the first Summer Institute in Computational Social Science, to be held at Princeton University. The purpose of the Summer Institute is to introduce graduate students, postdoctoral researchers, and beginning faculty to computational social science. The Summer Institute is for both social scientists (broadly conceived) and data scientists (broadly conceived). The co-organizers and principal faculty of the Summer Institute are Christopher Bail and Matthew Salganik.
The instructional program will involve lectures, group problem sets, and participant-led group research projects. There will also be outside speakers who conduct computational social science research in academia, industry, and government. Topics covered include text as data, website scraping, digital field experiments, non-probability sampling, mass collaboration, and ethics. There will be ample opportunities for students to discuss their ideas and research with the organizers, other participants, and visiting speakers. Because we are committed to open and reproducible research, all materials created by faculty and students for the Summer Institute will be released open source.
Participation is restricted to Ph.D. students, postdoctoral researchers, and untenured faculty within 7 years of their Ph.D. Most participant costs during the workshop, including housing and most meals, will be covered, and most travel expenses will be reimbursed up to a set cap. About thirty participants will be invited. Participants with less experience with social science research will be expected to complete additional readings in advance of the Institute, and participants with less experience coding will be expected to complete a set of online learning modules on the R programming language. Students doing this preparatory work will be supported by a teaching assistant who will hold online office hours during the two months before the Institute.
We are no longer accepting applications for the 2017 Summer Institute. However, we plan to have the Summer Institute in future years as well. We will circulate the call for applications widely, and both Chris and Matt will announce it on Twitter.
Matthew Salganik is Professor of Sociology at Princeton University, and he is affiliated with several of Princeton’s interdisciplinary research centers: the Office of Population Research, the Center for Information Technology Policy, the Center for Health and Wellbeing, and the Center for Statistics and Machine Learning. His research interests include social networks and computational social science. He is the author of the forthcoming book Bit by Bit: Social Research in the Digital Age.
Chris Bail is the Douglas and Ellen Lowey Associate Professor of Sociology and Public Policy at Duke University and a member of the Interdisciplinary Program on Data Science, the Duke Network Analysis Center, and the Duke Population Research Institute. His research examines how non-profit organizations and other political actors shape social media discourse using large text-based datasets and apps for social science research. He is the author of Terrified: How Anti-Muslim Fringe Organizations Became Mainstream.
Sandra González-Bailón is an Assistant Professor at the Annenberg School for Communication, and affiliated faculty at the Warren Center for Network and Data Sciences. Prior to joining Penn, she was a Research Fellow at the Oxford Internet Institute, where she is now a Research Associate. Her research lies at the intersection of network science, data mining, computational tools, and political communication. She leads the research group DiMeNet (Digital Media, Networks, and Political Communication).
Deborah Estrin is Associate Dean and Professor of Computer Science at Cornell Tech in New York City and a Professor of Public Health at Weill Cornell Medical College. She is founder of the Health Tech Hub and directs the Small Data Lab at Cornell Tech, which develops new personal data APIs and applications for individuals to harvest the small data traces they generate daily. Estrin is also co-founder of the non-profit startup, Open mHealth.
Gary King is the Albert J. Weatherhead III University Professor at Harvard University, based in the Department of Government (in the Faculty of Arts and Sciences). He also serves as Director of the Institute for Quantitative Social Science. King and his research group develop and apply empirical methods in many areas of social science research, focusing on innovations that span the range from statistical theory to practical application.
Michael Macy is the Goldwin Smith Professor of Arts and Sciences in Sociology and Director of the Social Dynamics Laboratory at Cornell. With support from the National Science Foundation, the Department of Defense, and Google, his research team has used computational models, online laboratory experiments, and digital traces of device-mediated interaction to explore a variety of familiar but enigmatic social patterns such as critical mass and mobilization, network-based contagion, and political polarization.
Winter Mason is a Data Scientist at Facebook. He studies social networks, social media, crowdsourcing, and group dynamics. His research combines traditional psychological methods such as lab experiments with new methods such as online data collection with crowdsourcing and machine learning. His research has appeared in the Proceedings of the National Academy of Sciences and the Journal of Personality and Social Psychology, among other leading journals. He received his PhD in Social Psychology and Cognitive Science from Indiana University in 2007.
Markus Mobius is a Principal Researcher at Microsoft Research who studies the economics of social networks. He builds models of learning, coordination, and cooperation within social networks, with a particular focus on trust. His research employs lab and field experiments to study social networks in real settings. His work has been funded by the National Science Foundation and the Sloan Foundation, and published in leading journals such as the American Economic Review and the Quarterly Journal of Economics. He completed his PhD in economics from the Massachusetts Institute of Technology in 2000.
Brandon Stewart is Assistant Professor of Sociology at Princeton University where he is also affiliated with the Politics Department, the Office of Population Research, the Princeton Institute for Computational Science and Engineering, and the Center for the Digital Humanities. His work develops new quantitative statistical methods for applications across computational social science, including new tools for automated text analysis. He completed his PhD in Government at Harvard in 2015.
Taylor Brown is a doctoral student in the Duke Sociology department, and is associated with the Duke Network Analysis Center. She has a general fascination with computational methods and the issues that arise with social media and other found data. She holds an MA in sociology from UNC-Chapel Hill and an MSc in evidence-based social intervention from the University of Oxford. Prior to beginning her PhD, Taylor worked on issues of intercountry adoption abuse and for a non-profit in Ghana. She also fulfilled an appointment at the National Science Foundation in the division of Social and Economic Sciences.
Shuang (Yo-Yo) Chen is a doctoral student in demography and social policy at Princeton University. Previously, she worked as a consultant at Oxford Policy Management and a program officer for the International Household Survey Network/Accelerated Data Program, providing technical assistance to statistical offices in developing countries. She has also consulted for the World Bank on education projects. She holds a master’s degree in international education policy analysis and a bachelor’s degree in mathematics with honors in education from Stanford University.
Vissého Adjiwanou is a Senior Lecturer in Demography and Quantitative Methods at the University of Cape Town (South Africa), and adjunct professor at the Université de Montréal (Canada). His research interests include maternal and reproductive health, family dynamics, and female employment in sub-Saharan Africa. He will chair a session on Family transformation in SSA at the 28th International Population Conference (IPC) in Cape Town. Vissého is also interested in computational science where he tries to document discussions on gender and family formation of African immigrants in the West (Europe and North America) on social media, and their effects on their peers in Africa.
Kat Albrecht is pursuing a PhD in Sociology at Northwestern University. Her research focuses on investigating how the structure of data shapes research conclusions and broader sociological theory. Using machine learning methods, quantitative causal inference, and mapping techniques she primarily builds and analyzes large criminal justice datasets. She is especially concerned with the economics of fear, the working definition of homicide, and the general state of crime data. She received her bachelor’s degree from the University of Minnesota where she first began exploring the junction of computational methods and the social sciences.
Abdullah Almaatouq is currently a Research Assistant at the Human Dynamics group and pursuing a PhD in Computational Science at MIT. He received dual master’s degrees in Computational Engineering and Media Arts & Sciences from MIT, and a bachelor’s degree from the School of Electronics & Computer Science at Southampton University in the UK. Abdullah’s work includes conducting theoretical and empirical research on human behavior using innovative approaches and tools ranging from complex systems theory and agent-based modeling, to network analysis, econometric techniques, and behavioral and experimental methods. Abdullah is passionate about people, their stories, and how they can be understood computationally.
Lisa Argyle is a postdoctoral researcher in the Politics Department at Princeton University. She received her Ph.D. in Political Science from the University of California, Santa Barbara in 2016. Her research is in political psychology and political behavior, where she uses a combination of survey data, experiments, and computational methods to examine how people form their political opinions and express those opinions to others. She is particularly interested in understanding the role of interpersonal persuasion in democratic participation.
Elliott Ash is Assistant Professor of Economics at University of Warwick and Visiting Scholar at Princeton University’s Center for Study of Democratic Politics. Elliott earned a PhD in economics from Columbia University. Elliott’s research combines techniques from applied microeconometrics and machine learning for empirical analysis of law and politics, with a focus on text as data. Before obtaining his PhD, Elliott received a BA (Plan II) from University of Texas at Austin, a JD from Columbia Law School, and an LLM (international criminal law) from University of Amsterdam. He also provided consulting work for the Department of Justice investigation of discriminatory practices at Ferguson Police Department.
Joshua Becker is a PhD candidate with the Network Dynamics Group at the University of Pennsylvania with a professional background in facilitation and decision-making. His research on collective intelligence uses formal models and experimental tests to examine how social network structure shapes the quality of group decisions. His current research focuses on how communication networks can be harnessed to tap the wisdom of crowds and improve estimation accuracy on tasks such as financial forecasting and medical diagnoses.
Anjali Bhatt is a PhD student in Organizational Behavior at the Stanford Graduate School of Business. Prior to graduate school, she received her bachelor’s degree in physics from Harvard and spent several years consulting with organizations on their social impact strategy. Anjali’s research focuses on using computational techniques (such as agent-based modeling and text analysis) to understand organizational culture and how it emerges, evolves, and diffuses, as well as ways in which it affects and is affected by diversity and inequality in organizations.
Moritz Büchi is a Senior Research and Teaching Associate in the Media Change & Innovation Division, Institute of Mass Communication and Media Research (IPMZ), University of Zurich, Switzerland. His research examines new media use, digital inclusion and inequality, online privacy, digital well-being and overuse, and comparative and computational research methods.
Bo Cowgill is an Assistant Professor at Columbia Business School. He received his Ph.D. from UC Berkeley, and won the Kauffman Dissertation Fellowship and Robert Beyster Fellowship. His research interests are in applied microeconomics and strategy, particularly productivity, technological innovation, organizational economics and platforms.
Anna Filippova is a postdoctoral researcher with the Institute for Software Research at Carnegie Mellon University, where she works towards supporting sustainable open collaborative community development, particularly in the context of Free/Open Source Software and Wikipedia communities. She received her Ph.D. from the National University of Singapore. Her research interests include social norms and conflict in virtual environments, inclusive group processes in diverse teams, and the role of face-to-face events in supporting the development of online peer-production communities. She has also been involved in organizing Free/Open Source community events, such as the Abstractions conference and Ruby monthly meet-ups.
Connor Gilroy is a PhD student in sociology at the University of Washington. He studies LGBTQ communities and populations to understand social processes of visibility, acceptance, and assimilation. His current research investigates patterns of sociodemographic change in gay neighborhoods. Additionally, he has projects on improving demographic estimates of queer populations with social media data and on using agent-based models to explore the macro-level impacts of the interpersonal process of coming out as LGBTQ.
Ian Gray is pursuing a PhD in the Department of Sociology at the University of California Los Angeles. He was previously a Research Fellow at the Medialab of Sciences Po, in Paris, and received a Master in City Planning from the Department of Urban Studies and Planning at the Massachusetts Institute of Technology. He is interested in how environmental problems become economic problems and his current research is focused on the politics of calculating and preparing for the risks and impacts from climate change.
Jeffrey Jacobs is a PhD student in Political Science at Columbia University. His research aims to utilize natural language processing, network analysis, and machine learning techniques to gain new insights into the history of political thought, labor and community organizing, economic inequality, and online labor markets. Before coming to Columbia he received an MS in Computer Science from Stanford University and Bachelor’s degrees in Mathematics, Computer Science, and Economics from the University of Maryland.
Ridhi Kashyap is a postdoctoral research fellow at Nuffield College at the University of Oxford. Her educational career has spanned four countries – after an undergraduate at Harvard, she did a master’s degree between Germany and Spain, and recently finished her DPhil (PhD) in demography and sociology jointly affiliated with the University of Oxford, UK and the Max Planck Institute for Demographic Research, Germany. Her research projects span a number of substantive areas in demography, including gender and other social inequalities in demographic processes, marriage and fertility change, mortality and health, and ethnicity and migration. She is also interested in methodological innovations in population studies including agent-based, microsimulation and ‘big data’ approaches.
Antje Kirchner is a Research Survey Methodologist at RTI International and an Adjunct Research Assistant Professor at the University of Nebraska-Lincoln. Her research addresses challenges in survey methodology, including ways to examine nonresponse bias using machine learning techniques, adaptive/responsive design, assessing the quality of survey and administrative data, eliciting and analyzing answers to sensitive questions, detecting problems in the respondent-interviewer interaction, and how to improve response quality in web surveys using paradata. Her research has been published in journals such as Public Opinion Quarterly, Journal of Survey Statistics and Methodology, and Journal of the American Statistical Association.
Peter Krafft is a graduating PhD student at MIT co-advised by Sandy Pentland and Josh Tenenbaum, soon to be living the bohemian life of an itinerant postdoc. His main formal training is in statistics, machine learning, and computer science, but he now studies computational social science and collective intelligence, often from the perspective of cognitive science. His current research focuses on understanding how people form beliefs about the world through their own exploration and through interaction with each other.
Molly Lewis is a postdoctoral researcher at the University of Chicago and the University of Wisconsin-Madison. Her research focuses on understanding how linguistic meaning varies across development and across speakers of different languages. She is also interested in issues related to scientific replicability and reproducibility. She received her PhD in Developmental Psychology from Stanford University and her BA in Linguistics from Reed College.
Charlotte Lloyd is a PhD candidate in Sociology with a secondary field certificate in Computational Science and Engineering. Her mixed methods research, including new computational methods for social science, focuses on how symbolic and cultural boundaries are related to structural inequality within organizations and communities. Charlotte received her B.A. in Comparative Literature and Political Science from the University of North Carolina at Chapel Hill in 2011.
Allison Morgan is pursuing her Ph.D. in computer science at the University of Colorado, Boulder. She is interested in using data mining, machine learning, social network analysis and causal inference to develop and test hypotheses about the origins and effects of gender imbalance within academia. Prior to graduate school, Allison worked as a data scientist for two years at a small tech start-up in Portland, OR. She earned her B.A. in physics from Reed College.
Matti Nelimarkka is a PhD Candidate at the University of Helsinki and Aalto University with a background in both political science and computer science. His research interests include supporting participation in various contexts (e.g., classroom, political participation) and studying online political communication. His work spans from human-computer interaction to political science and is often interdisciplinary in nature. He’s also the cofounder of the computational social science study program at the University of Helsinki.
Kivan Polimis is an incoming postdoctoral research fellow at Bocconi University’s Dondena Centre for Research on Social Dynamics and Public Policy. He received his Ph.D. in Sociology from the University of Washington (UW) in 2017. Recently, Kivan has enjoyed facilitating public-private partnerships as the program coordinator for UW’s Data Science for Social Good, a Civic Technology and Engagement Fellow with Microsoft, and a Big Data-Scientist Training Enhancement Program (BD-STEP) Predoctoral Fellow with the Department of Veterans Affairs. His research focuses on health development and applying statistical techniques to investigate disparities in health care, transportation, and the legal system.
Ethan Porter is an assistant professor at George Washington University in the School of Media and Public Affairs. He received his PhD in Political Science from the University of Chicago in 2016. His dissertation, The Consumer Citizen, investigates the ways in which everyday consumer decision-making affects political attitudes and behavior. His research interests include public opinion, political communication, political psychology, and experimental design. He has received grants from the National Science Foundation and the Omidyar Network. His research has appeared in Political Communication, and he has written for The New York Times, The Washington Post and other publications.
Maria Y. Rodriguez is an Assistant Professor at the Silberman School of Social Work at the City University of New York’s Hunter College. She received her Ph.D. from the University of Washington (Seattle). Her research interests intersect demography, data science, housing policy and social welfare. Currently, she has three active areas of research: (1) identifying the impacts of the U.S. foreclosure crisis on Latinos; (2) exploring how supervised machine learning can be used to scale up theoretically driven qualitative coding; and (3) using Twitter to understand the lived experience of marginalized communities in the United States.
Hirokazu Shirado is a researcher in the field of social networks and human-machine interactions. He is completing his doctorate in the Department of Sociology at Yale University, where he has also studied at the Human Nature Lab at the Yale Institute for Network Science. His current research focuses on the experimental analysis of the emergence of cooperative action in social networks. His goal is to engineer social systems with more affordable participation. His work has been published in Nature, Nature Communications, and other journals.
Rochelle Terman is a post-doc at the Center for International Security and Cooperation at Stanford University. She received her Ph.D. in Political Science with a designated emphasis in Gender & Women’s Studies at the University of California, Berkeley. Her research examines international norms, gender and advocacy, with a focus on the Muslim world, using a mix of quantitative, qualitative and computational methods. She also teaches computational social science in a variety of capacities.
Adaner Usmani is a postdoctoral fellow at the Watson Institute for International and Public Affairs at Brown University. His dissertation examines the rise and fall of labor movements over the 20th and early 21st centuries, and considers the effects of these facts on politics and public opinion. In other work, he has written about American mass incarceration, with an eye on the racial politics of its origins and reproduction.
Tong Wang is an Assistant Professor of Management Sciences at the Tippie College of Business, University of Iowa. She received her Ph.D. in Computer Science from the Massachusetts Institute of Technology in 2016. Her general research interests include interpretable machine learning and applied data mining, with its application in computational criminology, healthcare, social marketing, etc. Her research on crime data mining is the second place winner in ‘Doing Good with Good OR’ at INFORMS 2015. Her work on crime data mining has been reported in multiple media including Wikipedia.
Michael Yeomans is a post-doctoral fellow at Harvard University. He studies the Behavioral Science of Big Data: how new datasets and algorithms are changing our daily life and expanding the researcher toolbox in social science. Michael completed his undergraduate degree in Psychology at the University of Toronto, and in 2014 completed a PhD and an MBA in Behavioral Science at the University of Chicago Booth School of Business.
As we discussed in our call for applications, we have arranged two types of training prior to the event this summer. Some students have more sophisticated coding skills but little exposure to social science; other students have significant exposure to social science but lack strong coding skills.
The majority of the coding work presented at the 2017 SICSS will employ R. However, you are welcome to employ a language of your choice, such as Python, Julia, or another language commonly used by computational social scientists. If you would like to work in R, we recommend that you complete the following courses within DataCamp, a website that teaches people how to code. You only need to complete the classes with material that you would like to learn.
If you cannot afford DataCamp, check out Chris Bail’s Intro to R slides at http://www.chrisbail.net/p/learn-comp-soc.html
Our institute will bring together people in more than 10 different scholarly fields, some of which are closer to social science than others. For those students with little or no exposure to sociology, economics, or political science, we have assembled a reading list which we ask that you complete prior to the event. This list includes readings in each of the core areas addressed by the Russell Sage Foundation. Neither your work nor the work we develop together at the institute need map neatly onto these categories, but we think that if those with less exposure to social science read these, we will increase the chances of interdisciplinary cross-pollination, which we view as critical to the future of computational social science.
In addition, we also ask that you read Matt’s book, Bit by Bit: Social Research in the Digital Age. Much of this book will be review for most of you, but if we all read this book ahead of time, then we can use our time together for more advanced topics.
9:00-9:15 Logistics (No livestream)
10:45-11:00 Coffee Break
1:00-4:00 Group Exercise (No livestream) (activity)
4:00-5:30 Lecture by Michael Macy (No livestream)
6:00-7:30 Dinner & discussion
1:00-4:00 Group Exercise (No livestream)
4:00-5:30 Lecture by Gary King (video)
Dinner & Discussion
1:00-4:00 Group Exercise (No livestream)
4:00-5:30 Lecture by Brandon Stewart (video)
Dinner & Discussion
10:00-10:15 Coffee break
11:00-12:30 Begin group exercise (No livestream) (activity)
1:30-3:45 Continue group exercise (No livestream)
4:00-5:30 Lecture by Sandra González-Bailón (video)
Dinner & Discussion
9:00-9:10 Welcome and schedule (slides)
10:15-10:30 Coffee break
4:00-5:30 Lecture by Markus Mobius (video)
Dinner & Discussion
9:00-9:15 Welcome and schedule (slides)
10:15-10:30 Coffee break
12:00-12:20 Presentation - Demo / Agent Based Models and Collective Intelligence, Joshua Becker (No live-stream)
12:20-12:40 Presentation - The Wisdom of the Dynamic Network, Abdullah Almaatouq (No live-stream)
12:40-1:00 Presentation - Is the Public Polarized? Ideal Point Estimation using Real-time Debate Reactions, Lisa Argyle (No live-stream)
1:20-1:40 Tutorial - Word2vec, Jeff Jacobs (No live-stream)
1:40-2:00 Tutorial - Dependency Parsing, Elliott Ash (No live-stream)
2:00-2:30 Tutorial - Multi-Wave Field Experiments Over Mechanical Turk, Ethan Porter (No live-stream)
4:00-5:30 Lecture by Winter Mason (video)
12:00-12:20 Presentation - Multiplayer game / Survey on the future of computational social science, Peter Krafft (No live-stream)
12:20-12:40 Presentation - Social network experiment using Mturk and bots, Hirokazu Shirado (No live-stream)
12:40-1:00 Presentation - Using MTurk to Gauge Opinion on Crime and Punishment, Adaner Usmani (No live-stream)
1:30-2:30 Tutorial - How to get started with Python, Charlotte Lloyd (No live-stream)
2:30-3:00 Tutorial - Git and Github, Rochelle Terman (No live-stream)
3:00-3:30 Tutorial - Pre-registration, Peter Krafft (No live-stream)
4:00-5:00 Discussion - Designing a new challenge, Vissého Adjiwanou and Matt Salganik (No live-stream)
6:00 Dinner Discussion group - Teaching Computational Social Science
12:00-12:20 Presentation - An Overview of Interpretable Machine Learning, Tong Wang (No live-stream)
12:20-12:40 Presentation - O Interpretable Machine Learning, Where Art Thou?, Bo Cowgill (No live-stream)
12:40-1:00 Presentation - Planning Prompts Increase and Forecast Course Completion in Massive Open Online Courses, Michael Yeomans (No live-stream)
1:00-1:20 Presentation - Asking Sensitive Questions in Surveys, Antje Kirchner (No live-stream)
1:30-1:45 Tutorial - Open Review Toolkit, Matt Salganik (No live-stream)
1:45-2:30 Tutorial - How to get started with STM, Taylor Brown (No live-stream)
4:00-5:30 Lecture by Deborah Estrin (video)
6:00 Dinner Discussion groups - (1) Creating Tools for Social Scientists, (2) Issues in Open Source Development and Collaboration, (3) Interpretable Machine Learning
The Cultural Roots of Islamophobia, Rochelle Terman, Lisa Argyle, Ian Gray, and Matti Nelimarkka
Fragile Families- Expert and Lay Opinions, Allie Morgan, Kivan Polimis, Adaner Usmani, Ridhi Kashyap
Fragile Families- Data Processing, Antje Kirchner, Anna Filippova, Connor Gilroy
Fragile Families- Modeling Strategies, Mike Yeomans, Tong Wang
Long Term, Causal Evidence on Evictions (Bo Cowgill) and Getting off Facebook (with Ethan Porter)
In search of human nature- Evidence for multilevel in-group preference amongst MTurk workers, Abdullah Almaatouq and Peter Krafft
The Future of CSS, Reflecting on Friendship, and Modeling Big Decisions, Peter Krafft
Social Dynamics in Scientific Community Networks: The Structure-Content Coevolution of Research on CRISPR, Hirokazu Shirado, Moritz Büchi, and Vissého Adjiwanou
Ngram2Vec- Constructing Variable-Length Phrase Spaces, Elliott Ash and Jeff Jacobs
Highways and Words, Anjali Bhatt and Molly Lewis
Can We Reduce Political Polarization Without Communication, Joshua Becker and Ethan Porter
Dinner or a Movie? The Discourse of Emotional Labor in the Market for Escort Services, Charlotte Lloyd and Elliott Ash
TBD, Kat Albrecht and Maria Rodriguez
An important element of the Summer Institute is the participant-led group projects that begin in the second week and often continue long after the Summer Institute ends. Here are the results from some of the 2017 projects.
For those unable to attend in Princeton, we will be live-streaming each day from approximately 9:00am to 5:30pm EST. Group exercises and some of the visiting speakers’ lectures will not be live-streamed. Follow this link (https://mediacentrallive.princeton.edu) to join. At the moment, the live-stream will not work in Chrome. If paused, reload the page to rejoin the live-stream. Update: All videos have been posted to the Summer Institute in Computational Social Science YouTube channel.