Chris Bail
Duke University
Website: https://www.chrisbail.net
Twitter: https://www.twitter.com/chris_bail
Github: https://github.com/cbail
Incomplete
Inaccessible
Non-Representative
Drifting
Algorithmic Confounding
Dirty
Sensitive
-Survey response rates continue to drop
-Many of the most important questions require longitudinal/relational/qualitative data
-Digital trace data have a number of major advantages that conventional sources do not (big, always on, non-reactive)
A web or mobile-based tool built by a researcher in order to:
a) collect public and/or private data produced by social media users from an API;
b) collect supplemental information from such users (e.g. demographics) using more conventional survey methods;
c) offer something back to the user as an incentive to share their data (e.g. analysis or financial incentives)
See: Bail 2015
Significant coding skills required (HTML, CSS, cloud computing, reactive programming)
Competitive environment for attention (apps are no longer “new”)
Concerns about data sharing/privacy
Compelling incentives are hard to identify, and particularly challenging for studies of sensitive topics. But financial incentives may be an important option going forward.
Shiny is a (relatively) new R package that enables people to build, run, and host interactive apps directly from RStudio
Check out the R code here.
The user interface (ui): the “face” of the app. It determines what the user will see (e.g. what types of visualizations, check boxes or text boxes, fonts, etc.). It can also load images, logos, etc. to improve the overall appeal of the app.
The server: the “brains” of the app. It runs the analysis you want to show the user, but can also store data generated by the user, or expose different users to different types of information (good for experimentation).
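To make the division of labor concrete, here is a minimal sketch of a Shiny app, assuming the shiny package is installed. The input and output names (n_bins, message, histogram), the two conditions, and the use of R's built-in faithful dataset are illustrative choices, not taken from any particular research app.

```r
library(shiny)

# The "face": what the user sees (a title, one input, two outputs).
ui <- fluidPage(
  titlePanel("Example research app"),
  sliderInput("n_bins", "Number of bins:", min = 5, max = 50, value = 20),
  textOutput("message"),
  plotOutput("histogram")
)

# The "brains": runs the analysis and decides what each user is shown.
server <- function(input, output, session) {
  # Randomly assign each session to a (hypothetical) experimental condition.
  condition <- sample(c("control", "treatment"), 1)

  # Show different text depending on the assigned condition.
  output$message <- renderText({
    if (condition == "treatment") {
      "Thanks for taking part! Here is your personalized plot."
    } else {
      "Here is your plot."
    }
  })

  # Redraw the histogram whenever the slider changes (reactive programming).
  output$histogram <- renderPlot({
    hist(faithful$waiting, breaks = input$n_bins,
         main = "Waiting times", xlab = "Minutes")
  })
}

# Bundle the two halves into an app object; start it with runApp(app).
app <- shinyApp(ui = ui, server = server)
```

Because the condition is drawn inside the server function, each new session gets its own random assignment, which is one simple way an app can double as an experiment.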
Check out the Shiny Gallery here.
-Check out the googledrive package for loading and storing data.
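As a rough sketch of what storing data might look like, the helper below writes responses to a local CSV file and, if asked, copies it to Google Drive with googledrive::drive_upload(). The helper name save_responses, the column names, and the example values are hypothetical, and the upload step assumes you have already authenticated (for example with googledrive::drive_auth()).

```r
# Hypothetical helper: save app responses locally and optionally copy to Drive.
save_responses <- function(responses, upload = FALSE) {
  # Write the responses to a temporary CSV file.
  local_path <- tempfile(pattern = "responses_", fileext = ".csv")
  write.csv(responses, local_path, row.names = FALSE)

  if (upload) {
    # Requires the googledrive package and prior authentication.
    googledrive::drive_upload(media = local_path, name = basename(local_path))
  }

  # Return the local path so the caller can find the file.
  local_path
}

# Made-up example data; the upload is skipped so this runs without credentials.
responses <- data.frame(user_id = c("u1", "u2"), age = c(34, 27))
saved_path <- save_responses(responses, upload = FALSE)
```

Keeping the upload behind a flag makes it easy to test the app locally before connecting it to a real Drive account.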
-High-volume app hosting is available via RStudio.
library(rtweet)

# Post one tweet per hour for 24 hours
for (i in 1:24) {
  # Search for 50 recent tweets about computational social science
  css_tweets <- search_tweets("Computational Social Science", n = 50, include_rts = FALSE)
  # Randomly pick one tweet from the `text` variable of `css_tweets`
  lucky_tweet <- sample(css_tweets$text, 1)
  # Post the selected text from the authenticated account
  post_tweet(lucky_tweet)
  # Pause for 1 hour
  Sys.sleep(3600)
}
From Munger, 2017
Check out my tutorial on running RStudio in the cloud here.