About the Observatory on Social Media

This project aimed to study the diffusion of information online and to discriminate among the mechanisms that drive the spread of memes on social media. We collected big data from public micro-blogging streams and analyzed information sharing using tools and models from network science.

Our research followed several directions of investigation. First, we explored the correlations between online and offline events. Examples include analyses of geographic and temporal patterns in movements like Occupy Wall Street, societal unrest in Turkey, polarized communication in online discourse, partisan asymmetries in political engagement, geographic diffusion of trending topics, and the use of social media data to predict various outcomes, such as elections, fashion trends, and key market indicators. The interdisciplinary nature of these efforts is illustrated by collaborations among computer scientists, physicists, journalists, political scientists, and sociologists. We also joined forces with neuroscientists to uncover connections between patterns of information diffusion in social networks and in the brain; this work was a cover feature of the journal Neuron.

A major milestone of our project was the release of a public Observatory on Social Media (OSoMe) to share and explore data derived from our meme diffusion analytics, making big social data more easily accessible to social scientists, reporters, and the general public. OSoMe comprises hardware and software infrastructure with Web tools that provide end-users with the power to analyze online trends and visualize temporal, geographic, and network patterns of spreading memes and bursts of viral activity. We also provide an API to help other researchers expand upon the tools, or create "mash-ups" with other data sources. For example, we released a mash-up allowing others to study how social bots manipulate online discourse on any topic. The OSoMe applications and APIs provide an easy way to access insights about meme diffusion in social media from a growing collection of 70+ billion public tweets to date.

Another research goal was to understand how social media can be abused to manipulate public opinion. We were the first group to uncover evidence of systematic, orchestrated, and widely spread misinformation campaigns based on "astroturf" (fake grassroots movements) and social bots. Some social bots are created to deceive and harm social media users. They have been used to infiltrate political discourse, manipulate the stock market, steal personal information, and spread misinformation.

Our study of 1,200+ features characterizing online information sharing behaviors allowed us to develop accurate machine learning algorithms to classify content and its producers. Applications include a social bot detection framework and public API called BotOrNot, now widely used to scrutinize online campaigns. We were among the top three teams in a bot detection challenge organized by DARPA. In June and July 2016, our work on social bots was featured on the covers of the two top computing publications: IEEE Computer and Communications of the ACM. This research contributes to raising public awareness about how easily online discourse can be manipulated, thus mitigating the risks of abuse.

Techniques based on agent-based models allowed us to explore theories of meme diffusion by generating predictions that could be validated against empirical data collected from social media. We used these methods to study how several factors affect the manner in which information is disseminated and why some ideas cause viral explosions while others are quickly forgotten. We analyzed key factors including network communities, user interests, competition, finite attention, sentiment, and mutual interactions between traffic and network structure. This work led us to investigate how the structure of social communities can predict which memes will go viral.

The project had significant scientific and societal impact. Our software and data are used in courses on network science and social media. We trained undergraduate students from underrepresented minorities in STEM, as well as many graduate and postdoctoral students. Several former students are now employees at Facebook, Google, Amazon, and LinkedIn. In addition to OSoMe tools and data, the project resulted in several open-source software libraries and a patent application. IU R&T Corporation is in negotiation to license our BotOrNot software. Our visualization software won the WICI Data Challenge from the University of Waterloo. Additional recognition includes a best paper award at the Web Science Conference, a best poster award at the Conference on Complex Systems, and a best presentation award at the World Wide Web Conference. Our findings were disseminated through 60+ peer-reviewed publications. The venues include prestigious journals (CACM, Computer, Nature Physics, Neuron, PRL, Nature Scientific Reports) and top international conferences, including KDD, WWW, and ICWSM. Our work even inspired pop culture: we worked with the television writers of The Good Wife on an episode about deception by social bots. Finally, research from this project received worldwide coverage in hundreds of articles in popular media, including the Wall Street Journal, New York Times, Washington Post, USA Today, CNN, BBC, NPR, The Economist, Newsweek, The Atlantic, Politico, New Scientist, Wired, Science, and Nature.

Support

We gratefully acknowledge support from National Science Foundation award CCF-1101743 (ICES proposal on Meme Diffusion Through Mass Social Media) and a James S. McDonnell Foundation complex systems grant on Contagion of Ideas in Online Social Networks, as well as a seed Data to Insight grant from the Lilly Endowment. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the funding agencies.

Team

Filippo Menczer and Alessandro Flammini coordinated this research project at Indiana University and were the Principal Investigators on the NSF, JSMF, and Lilly grants.

The project contributed to the support and training of postdoctoral fellows Diego Fregolente, Ruby Wang, and Giovanni Luca Ciampaglia; and of many graduate students who were involved in various aspects of the research: Clayton A Davis, Karissa McKelvey, Mark Meiss, Jacob Ratkiewicz, Michael Conover, Lilian Weng, Qian Zhang, Huina Mao, Onur Varol, Azadeh Nematzadeh, Pablo Moriano, Alex Rudnick, Jiayi Zhu, Rachael Filper, Jasleen Kaur, Prashant Shiralkar, Xiaoming Gao, Andrew Younge, Tak-Lon Wu, Pik-Mai Hui, and Zeyao Yang, as well as undergraduate students Bryce Lewis, Kehontas Rowe, Keychul Chung, and Alex Hong.

We acknowledge the collaboration of many researchers. Alessandro Vespignani and Johan Bollen were Co-PIs on the NSF grant. Several other key collaborators at IU and other institutions contributed to various research thrusts of this project: Emilio Ferrara, Bruno Gonçalves, Przemyslaw Grabowicz, Luca Aiello, and Judy Qiu. Other collaborators include Nicola Perra, Marton Karsai, Fabio Rojas, Joseph DiGrazia, Chato Castillo, Francesco Bonchi, Rossano Schifanella, Snehal Patil, Emily Metzgar, Luis Rocha, YY Ahn, Geoffrey Fox, and Chris Ogan.

Finally, we wish to acknowledge the support of the technical staff of the IU Network Science Institute: Valentin Pentchev, Scott McCaulay, Chathuri Peli Kankanamalage, and Ben Serrette.