Researchers have developed and studied an algorithm to help predict online protest through the popular social media platform Twitter. Though governments could potentially use the tool to better gauge support for or opposition to public projects and policies, some contend that it is not yet mature enough for use in matters of national security.
By looking at the online relationships and past activity of social media users, researchers at Arizona State University have determined that it's possible to predict who will get involved in online protests and discussions.
“We want to predict the characteristics of the user's next status message given his past interactions. The use case for this is we can predict if a user will protest in his next status message using his past information. The past information we use are his status messages, his interactions …” said university researcher Suhas Ranganath. “What we find here is that, if we see what the user has interacted with, who they have interacted with in the past and what the interactions were, we can model that to predict what the next status will be.”
Fred Morstatter, another ASU researcher involved in the study, titled “Predicting Online Protest Participation of Social Media Users” (PDF), said their work does not translate to how many participants might take part in a physical protest, but it can paint an accurate picture of people’s level of involvement in certain topics online.
“Basically we’ve built an algorithm that will detect whether or not a user will engage in a topic. And, as kind of a case study for the paper, we chose protest,” he said. “It’s actually not protest, it’s discussion of protest. It’s really hard to tell who is actually going to put boots on the ground and go protest. As we’ve seen in different examples like the elections, who actually goes and votes versus who talks about it online; that’s a whole other problem.”
The wealth of data from the digital activity centered on the Arab Spring in 2010 gave the team the perfect starting point to analyze online protest and discussion.
“The reason that we chose protest in this paper is we have a system which collects tweets from around the world pertaining to different topics,” Morstatter said, noting the team built the algorithm right as the Arab Spring was starting to kick off. “And because of that, we have a lot of data pertaining to Arab Spring activity, so that’s why we chose protest for the paper.”
Joe Yun, leader of Social Media Analytics at Technology Services at the University of Illinois at Urbana-Champaign, said the ability to identify trends, key influencers and online topics would provide obvious value to companies and media networks, but the approach is perhaps not yet mature enough for government to use in matters of national security.
Though Yun was not involved in the study, he reviewed the research for the purposes of this article and believes algorithms like this one will continue to become more accurate with time.
“If you break it down simply, these researchers are showing that it’s not just what you talk about on Twitter from your own history that can predict whether or not you will state a strong position on a certain topic, but also adding how much others that are talking about that topic include you in their conversation about that topic,” Yun said via email. “So really it is somewhat of a way to identify influencers within certain topics on Twitter, and how likely those influencers are going to speak about that topic in the near future. Now that topic could be ISIS, but that topic could also be what happened on a TV show the night before, or Coca-Cola, etc.”
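The two signals Yun describes, a user's own history on a topic and how often others discussing that topic include the user, can be illustrated with a short sketch. This is not the researchers' model: the function names, the keyword matching, and the weights below are all hypothetical and chosen only to make the idea concrete.

```python
import math

def tokens(text):
    """Lowercase a status and split it into words, trimming basic punctuation."""
    return {w.strip(".,!?:;") for w in text.lower().split()}

def topic_features(user, topic_words, own_statuses, others_statuses):
    """Return (own_share, mention_count) for one user and one topic.

    own_share: fraction of the user's past statuses that touch the topic.
    mention_count: number of on-topic statuses by others that mention the user.
    """
    def on_topic(text):
        return bool(tokens(text) & topic_words)

    own_hits = sum(1 for s in own_statuses if on_topic(s))
    own_share = own_hits / len(own_statuses) if own_statuses else 0.0

    handle = "@" + user.lower()
    mention_count = sum(
        1 for s in others_statuses if on_topic(s) and handle in tokens(s)
    )
    return own_share, mention_count

def engagement_score(own_share, mention_count,
                     w_own=2.0, w_mention=0.5, bias=-2.0):
    """Squash a weighted sum into (0, 1). The weights are made up, not fitted."""
    z = bias + w_own * own_share + w_mention * mention_count
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical data for a user called "alice" and the topic word "protest".
own = ["Joining the protest tomorrow", "Coffee was great today"]
others = ["@alice see you at the protest", "@alice nice photo",
          "big protest downtown"]
share, mentions = topic_features("alice", {"protest"}, own, others)
print(share, mentions, engagement_score(share, mentions))
```

In a real system the weights would be learned from labeled data rather than set by hand, and the topic matching would be far richer than a keyword lookup.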
For all the perceived usefulness of such an algorithm in the big data and social media space, it also raises the possibility of privacy issues and other negative implications.
Yun said the sense of relative anonymity in a “crowd” on many social media platforms is often mistaken for privacy.
“There are always the issues of privacy and misinterpretation of big data. Privacy is always a concern in general, but I think specifically in social media there is somewhat of a belief in transparency in numbers,” he said. “It’s akin to being in a large crowd at a concert and believing that no one could possibly be looking directly at you amongst thousands of people.”
But, he added, algorithms like this clearly show advances in doing exactly the thing that people may believe is not possible.
“I wonder if we will truly have a society like something out of the Minority Report movie,” Yun said. “It is both a fascinating thought, but also an incredibly scary thought as well. Once again, I think consumers need to be much more aware of privacy in this day and age.”
Though Yun noted that the accuracy rate of around 70 percent was an improvement over earlier predictive algorithms, he said the research still allows for the possibility of false positives or false negatives. In the national security realm, Yun said, the capabilities of the algorithm could have more concerning implications than they might in the advertising realm.
“The researchers clearly showed that their algorithm is superior to previous efforts, but there is still a good portion of it that will either give false-positives or false-negatives. The scarier aspects are the false-positives. What if you are identified as someone that is an influencer of a particular protest for ISIS when in fact you have nothing to do with that group?” Yun said. “I personally would not feel too comfortable with that. Identifying those who enjoy Harley motorcycles is one thing, but identifying those who are a threat to national security is clearly a pretty huge difference.”
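Yun's concern about false positives can be made concrete with some back-of-the-envelope arithmetic. The sketch below is purely illustrative: the population size and the 1 percent base rate are invented, and treating the roughly 70 percent accuracy figure as both the hit rate on true cases and the correct-rejection rate on everyone else is an assumption, not something reported in the study.

```python
def screening_outcomes(population, base_rate, sensitivity, specificity):
    """Estimate how many flagged users are true vs. false positives.

    base_rate: assumed share of users who really belong to the target group.
    sensitivity: assumed share of real cases the classifier flags.
    specificity: assumed share of non-cases the classifier correctly clears.
    """
    actual_pos = population * base_rate
    actual_neg = population - actual_pos
    true_pos = actual_pos * sensitivity
    false_pos = actual_neg * (1.0 - specificity)
    flagged = true_pos + false_pos
    precision = true_pos / flagged if flagged else 0.0
    return {
        "true_positives": true_pos,
        "false_positives": false_pos,
        "precision": precision,
    }

# Hypothetical inputs: 10,000 users, 1 percent real cases, 70 percent both ways.
result = screening_outcomes(10_000, 0.01, 0.7, 0.7)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

With these made-up inputs, about 70 of the roughly 3,040 flagged users would be genuine cases, a little over 2 percent. The rarer the behavior being screened for, the more the false positives outnumber the real ones, which is why the same error rate matters far more in national security than in advertising.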
To date, the system has only been tested on Twitter data, but Morstatter said it could also work on Facebook because of the ability to mention other users within the platform.