The Data, Responsibly project, based out of New York University, has taken its research on responsible data management and expanded it to improve messaging around what it means to collect and use data ethically.
MetroLab Network has partnered with Government Technology to bring its readers a segment called the MetroLab Innovation of the Month Series, which highlights impactful tech, data and innovation projects underway between cities and universities. To learn more or to reach the project leads, contact MetroLab at email@example.com.
In this month’s installment of the Innovation of the Month series, we explore the work of Julia Stoyanovich, an assistant professor of Computer Science, Engineering, and Data Science at New York University, and Falaah Arif Khan from Data, Responsibly, who are creating comics designed to increase awareness of responsible data science. MetroLab’s Ben Levine spoke with the two about the background and development of their project.
Ben Levine: Can you tell us about the origin of the Data, Responsibly project and who has been involved in it?
Julia Stoyanovich: The name of the project was coined in 2015. Serge Abiteboul and I used it for the first time in our “Data, Responsibly manifesto” that was published as an op-ed piece in Le Monde (in French) and, in an extended version, in the ACM SIGMOD Record blog post. Then, in summer 2016, Serge and I, together with Gerome Miklau and Gerhard Weikum, organized a Dagstuhl seminar by the same name. Dagstuhl is an academic retreat venue in Germany, and having the seminar allowed us to start building a research agenda in responsible data management. In 2016, we joined forces with Bill Howe and HV Jagadish, who have been instrumental in taking the project to a new phase — with concrete research ideas and specific applications, primarily in the urban context, that we were able to develop with generous support from the National Science Foundation.
We have been working with many brilliant students. I would particularly like to underscore the contributions of my Ph.D. student Ke Yang, who has done tremendous work on fairness, diversity and interpretability. She will be graduating in May 2021.
The next stage of this project is the Center for Responsible AI (Artificial Intelligence) that we are launching at NYU. Many of the activities of Data, Responsibly will feed into the work of the new Center.
Levine: How have the objectives of Data, Responsibly changed over time and with the arrival of new technologies?
Stoyanovich: The top-level objective has always been to help make the responsible use of data — and of technologies like AI that rely on data — the only kind that society will tolerate. It became increasingly clear to me over time that technology itself is, in some sense, the easy part of the puzzle. Or, at least, the part that we know how to start addressing. What’s much harder is understanding how to develop a good human/technology interface: how to educate people at different levels about what data and technology can and cannot do, what we should ask and expect of them, and what we should do when technical checks and interventions are inappropriate, or when they fail. As a result, Data, Responsibly has been developing a strong focus on policy and education, in addition to the technical work. And this is why I jumped at the opportunity to work with Falaah to create the comic.
Levine: Can you tell us about what led you to create the Mirror, Mirror comic? How did the comic help you to express ideas better than other formats?
Falaah Arif Khan: The more I create scientific comics, the more I see what a natural fit they are for presenting technical ideas. There are all these nuances in the work that get lost in the trigger-happy discussion on social media or the jargon of technical papers. For the general public, there’s something about opening a technical paper that is extremely intimidating and hence isolating, and so it’s a challenge to engage the public using conventional methods. So we figured that we’d break down these arguments into relatable metaphors and depictions and make it more amenable to the general public by wrapping it in a bunch of pop culture references and funny, silly illustrations!
I think of the process of creating comics as sort of like an artist’s take on Feynman’s principle of learning: If you want to identify how well you understand something, teach it to a child. To me it has now become: if there’s a topic that I’m obsessively researching and thinking about, I have to sit down and turn it into a three- or four-panel comic to really distill my understanding. It has enforced a first-principles kind of thinking about the machine learning landscape, and so even as a creator it’s a very enriching experience to go through the process of making one of these volumes.
Stoyanovich: I have come to realize that a gap in responsible AI education — in academia, in industry, in government, among members of the public — is perhaps the greatest impediment to progress in this field. AI and ethics are just such intimidating topics on their own, but especially when we have to think about them in combination. And the whole conversation about ethics and responsibility in AI has just been so dead serious! To learn about a topic, we have to first feel that it’s within reach. And nothing is more helpful for this than a healthy dose of humor. The comic is helping us bring the necessary lightness into the conversation. It is also helping us say things directly without folks getting offended — after all, it’s a comic!
Levine: In Mirror, Mirror, you talk about the importance of understanding who a project is for: “We’re so caught up in the how, that we forget to ask, ‘For whom?’” For whom was this comic created, and how do you see them using it?
Arif Khan: This was actually the first question we asked ourselves, and it drove our entire creative process. The bottom line was that we were going to make a resource that could cater to as wide an audience as possible, without compromising on the fundamental message in the piece. This meant that there were elements we had to rework repeatedly until they reached a point where the most casual reader could engage with them. Other times we removed entire parts because we found that they had become too catered to one demographic and wouldn’t make sense to other readers.
As we say on the “About” page, it really is meant to be for everyone. We want it to be that random, cool thing that a layperson — who has nothing to do with AI — discovers on the Internet and immediately sends to their friends to read. We want it to be the repository of illustrations that academics turn to when they want to add a breath of fresh air to their conference presentations or lecture slides. We want it to be a supplementary or introductory reading for an undergraduate course on AI.
When we were making it, we kept gushing about the onion-like structure that was forming, where readers of different backgrounds would have these layers to peel off and enjoy. As with any creative project, you hope that your reader finds something new each time they read it and pick up on a tiny detail in the sketch or a nuance in the wording that they hadn’t noticed the last time.
Levine: How have you approached this project with accessibility in mind?
Stoyanovich: We are not experts in accessibility, but we decided right away to make accessibility one of the focal points of the series, and of the first volume. On the implementation side of this, we learned about what it takes to create a comic that is accessible to the blind and made mistakes along the way, despite our best intentions. Amy Hurst, who directs the Ability Project at NYU, gave us lots of helpful tips that allowed us to get started. Still, the first version of the comic did not show the text in the right order when read with a screen reader. Chancey Fleet generously gave us several rounds of feedback and helped us fix accessibility bugs. Our experience shows that there is clearly a gap in the process — those of us with good intentions still lack the training and the tools and devices to embed and test accessibility features in the work we produce.
Levine: Will you continue making other comics? What are some of the next projects on the horizon for Data, Responsibly?
Arif Khan: We absolutely plan to continue making more Data, Responsibly comics. The landscape of responsible AI is so wide and spans so many disciplines beyond computer science, such as law, policy and ethics, and so we have a bunch of follow-ups in the works!
Personally, I see scientific comics as just another deliverable at the end of a research cycle. Right now, we work obsessively on something for months and then write a paper to present our ideas to the technical community. It’s the same thing, but instead of limiting the discussion to the academic community, we also take all the “discarded” ideas, thoughts and learnings that are not amenable to a paper and make a comic out of them for the general public.
I’m excited to see how we can impact the AI landscape and how that will in turn inform the comics. Mirror, Mirror was a primer on the spectrum of ideas and perspectives we hope to delve into in subsequent volumes, and a lot of the thoughts we presented are just ground zero for deeper discussions. My hope is that as the public develops fluency and the mainstream discourse gets more nuanced, we’re also challenged to dig deeper and break down even more “radical” ideas.
Stoyanovich: Indeed, we plan to continue! Our initial work is on a series that targets people who are already familiar with data science and AI, at least to some extent. I’m eager to get started on another series where the audience is the general public. You can find this AI comic and future comics here.