Safiya U. Noble, Ph.D., is an assistant professor at the University of Southern California Annenberg School for Communication and Journalism. Her work over the past few years centers on the impacts of technology on the public interest. But before that, she spent 15 years in marketing and advertising for various Fortune 500 companies. Her latest research, as discussed in her book Algorithms of Oppression: How Search Engines Reinforce Racism, examines the algorithmic bias of commercial search engines. Noble recently talked to Government Technology about her findings and her ideas for a more democratic information infrastructure.
Tell me about your current research. How did you get going in this direction?
I wrote this new book to think about the implications of the kind of information that search engines — large, commercial search engines like Google — provide to the public. And this primarily comes from my research. My area of expertise is information science, and I have watched slowly over time the erosion of support for libraries and archives and other kinds of information institutions that the public is highly reliant upon, like public libraries, for example, or academic libraries for scholars and students. Increasingly people are starting their queries for information on the Internet. Of course, as you’re looking for information, if you don’t know exactly where it lives, you don’t know the precise URL, you’re reliant upon a search engine. It’s in some ways the guide, broker or facilitator to finding things. What I noticed several years ago is that when you are looking for information for various identity-based groups, there were a lot of problems.
This started in about 2009, 2010, when I was looking, for example, at what happens when you do keyword searches on the words “black girls,” “Latina girls” or “Asian girls.” And what you would get is overwhelmingly pornography or hypersexualized content. This is a huge problem because the porn industry has a lot of money and they’ve been able to pay, in essence, to have a premium to those words, to associate those words with their content. For me this was a public interest issue because, first of all, the content was really about women, it was not girls. It wasn’t children or adolescents. The content reflected women in a pornographic way, and women and girls really don’t have the financial resources to compete with the porn industry to put the kinds of content that they might be interested in up against pornography. A commercial search engine is really an advertising platform, and so people who are willing to pay the most are able to kind of control certain keywords and ideas. And that’s really what launched me into this line of research about how information gets biased and what happens to people who have the least amount of resources in our society in terms of how they get represented in these platforms.
I also started thinking about, what is beauty? What is beautiful? I didn’t expect, in Google image search, to get back almost exclusively white women in bikinis or lingerie. You might think of a concept like “beautiful” being nature or something that’s more universally conceptualized as beautiful. If you look for images of professors — I’m a professor — it’s almost exclusively men who get represented. These things become important particularly for young people who are exploring the Internet in unguided ways and maybe they’re imagining themselves in a future occupation. There’s a lot of gender stereotyping in occupational images in Google image search.
On a darker note, we have some more troubling case studies that we’ve seen. One in particular is the case of Dylann Roof, who in his own words in his manifesto says that he was doing Google searches to understand the media circus that was happening around Trayvon Martin and George Zimmerman. He really didn’t understand what the polarizing issue was around that trial, and he does a search on the keywords “black on white crime” and is immediately led to white supremacist sites. This is because the phrase “black on white crime” is a red herring that gets used by white supremacists, white nationalist sites and organizations. Research shows that most people who engage with large search engines think that the information they’re getting is accurate and trustworthy, fair, credible. So Dylann Roof thinks these are legitimate sites, and he says that this helped him develop his racial identity, and then he acts upon that with the murders of nine African-Americans in Charleston, S.C. Those kinds of things are particularly concerning for the public. Because what he didn’t get, for example, were FBI statistics that showed that the majority of violent crimes or homicides happen within [one’s] community. So black crime, violent crime, happens against other African-Americans, but crime against white Americans is perpetrated by other white Americans, largely. That’s what the FBI stats show. You don’t get in that kind of a query any scholarly information that would help you understand the history of a phrase like that and why that’s used to radicalize white men in particular, white youth in the U.S. and in Europe. You don’t really get any context in the kinds of queries that you do, particularly when you’re searching loaded terms, and those are of great interest to the public.
This research seems particularly timely, given recent discoveries of Russian influence on the last presidential election.
I think we’re at a historic moment in the United States where we see that there are many competing interests over controlling the values and the agenda of our national political landscape. And if there were ever a time for us to have a high degree of media literacy, particularly Internet literacy, it’s now. We see the effects that both low levels of literacy and high degrees of control of media companies, in which I include tech companies, have in shaping the outcomes of our elections. So of course, we would not want the tech sector or any other sector to so dramatically and disproportionately affect the outcomes of our political elections, so this is a moment we cannot leave to chance. This is a time where activists, community members, even teachers and educators, should be thinking about the influence of social media and search engines on the outcomes of not just electoral politics, but also how communities are represented. These are the things that are affecting our ability to enact democratic values and live in a society that we want to live in. We have an incredible amount of political and racial division, economic and class division, and we need interventions that can close those gaps. Certainly information plays a huge role in that.
What can state and local governments do, given their stated goals of serving all citizens fairly and equitably?
One of the things that I call for in the book is deeper federal, state and local investment in public-interest technologies that have a different mission, which is curating content and making different forms of knowledge more accessible. For example, we have the Library of Congress. For me, that would be an amazing partner to have at the table with scholars like me, with heads of academia, libraries from all over the United States, public and private, where we could talk about the information needs for a democracy. And how could we be relevant players in that commitment, rather than turning that over to private advertising companies, like Google search. It doesn’t make sense to allow Google or Yahoo or Bing to fill that information void because most librarians, for example, in the academic sphere, are interested in a particular type of curation of information and knowledge: books, articles, things that are published in formal channels. They’re really not interested in indexing the open Web, but they have an incredible, deep knowledge about information organization. These are the experts, so we should be tapping the experts in information organization to do that in a way that serves the public, rather than in a way that benefits advertisers.
Given your findings, where do we go from here?
I’m a media and information cultural critic. I think people would characterize my work in that way, but I also try to conceive of creative solutions and alternatives because I don’t think it’s enough to just be critical. I’ve spoken with a number of university librarians from some of the top institutions around the country, all of them very interested in having a seat at the table and thinking about these things with me. What we don’t have is the kind of money and resources that a company like Alphabet has to implement these ideas. Just like we invest in our infrastructure of highways and roads and bridges, we need to deeply invest in our information infrastructure that also is a benefit to the democracy. We wouldn’t turn over our public infrastructure to General Motors even though they benefit a lot from selling cars that run over that infrastructure. We have a lot of diversity in that industry, and we understand that there’s a place for different companies and what they’re interested in doing, but the system works when it’s a benefit to everyone, and not just those who drive cars but also those who need a train or a bus, or who need alternatives. We don’t really have a lot of good public-interest alternatives in our information infrastructures.