Artificial Intelligence’s Past, Present, and Future: An Interview with Liz O’Sullivan

Liz O’Sullivan is an expert in fair algorithms, surveillance, and artificial intelligence (AI) and the current CEO of AI company Parity. In 2019, she publicly quit her job at Clarifai because of their work enabling lethal autonomous weapons. Since then, she has been a major figure working to make AI fairer and less biased. 

Reader’s guide — This article references key technical concepts that relate to AI. Please note the following definitions:

  • Algorithm: a set of rules used to solve a problem.
  • Model: the result of an algorithm applied in a certain context; for example, a weather forecast model comprises one or more algorithms that use data about prior weather patterns to predict future weather.
  • Training a model: a model learns patterns from a provided “training dataset” and then uses those patterns as the basis for its predictions. Thus, any biases baked into the training data will be perpetuated in real use cases. For example, if a model is trained on data about people’s occupations where all teachers are female, future predictions about a person’s occupation or sex will “stereotype” teachers as female (see the sketch below).
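
To make the last definition concrete, the following is a minimal, hypothetical sketch (ours, not from the interview) of how a skewed training dataset produces a stereotyped prediction. It assumes Python with scikit-learn installed; the features, encoding, and data are invented purely for illustration.

```python
# Hypothetical toy example: a skewed training set leads a model to
# "stereotype" teacher-like profiles as female.
# Assumes scikit-learn is installed; features and data are invented.
from sklearn.linear_model import LogisticRegression

# Features: [years_of_schooling, works_with_children (0/1)]
# Label: sex recorded in the data (0 = male, 1 = female)
# Every teacher-like row in this tiny "training dataset" is female.
X_train = [
    [16, 1],  # teacher, female
    [18, 1],  # teacher, female
    [16, 1],  # teacher, female
    [14, 0],  # electrician, male
    [20, 0],  # engineer, male
    [12, 0],  # driver, male
]
y_train = [1, 1, 1, 0, 0, 0]

model = LogisticRegression().fit(X_train, y_train)

# A new teacher-like profile is predicted "female" purely because of the
# pattern baked into the training data, i.e., a learned stereotype.
print(model.predict([[16, 1]]))  # -> [1]
```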

This interview, conducted by Aishani Aatresh and Shira Hoffer, has been edited for length and clarity.

Harvard Political Review: You got a philosophy degree during your undergrad, and while there are obviously some intrinsic convergences between AI and ethics, what got you interested in AI? How did you translate undergrad philosophy into what you were doing at Clarifai?

Liz O’Sullivan: When I did my philosophy undergrad, I wanted to get involved in something that was at the intersection of a highly technical concept and professional ethics. At the time, I was a science major, so I really thought that would take the form of bioethics — but I did not make it through organic chemistry! When I got into the real world, I immediately looked for the most interesting startups I could find. This was in the early days of AI, so we were not worried about explainability or discriminatory bias because, put simply, it worked equally poorly on everybody. Instead, we were just trying to make it work.

When I received the opportunity to work at Clarifai, there had been several jumps forward in the tech. Number one, computing power played a major role in the viability of technologies like computer vision. But I am also thinking of leaps forward in the [underlying] science and math, so seemingly overnight, it became possible not just to commercialize AI, but for individual users to use it with nothing more than an Amazon instance, an open-source algorithm, and a scrape [of data] off the internet. The reality for a long time was this norm of scraping [data] being okay, [the rationale being] it’s not hurting anybody, right? When it’s your training [data], it just seems normal, but you have to be very careful about what you show [a computer]. Otherwise, you will get all kinds of results. The story I always tell is that we had image annotators all over the world, and we sent off a moderation dataset [a set of data that is labeled with certain tags and then used to train a model looking for what content needs to be flagged], expecting it to come back with labels of movie ratings — this is PG, this is G, for example. However, we found that all of the fully clothed, same-sex couples holding hands or kissing were labeled as X-rated. That was because, at the time in India [where the dataset was labeled], homosexuality was illegal; cultural difference was a blind spot.

It wasn’t until I met Rumman Chowdhury of Parity One and Jiahao Chen of JPMorgan Chase and Capital One, people who have been at the forefront of pushing this change in norms, that I really saw a clear path forward. We [have been] answer[ing] questions around what mitigations against bias will work and what definitions of fairness are correct in each context. Our [Parity’s] technology tries to show you — based on data — what process will get you the fairest and most accurate version of any particular model. I think this is going to move the needle forward, make models fairer, increase financial access to underserved populations, and help companies understand that there are ways to retain accuracy and fairness while being commercially viable.

HPR: If a model is as biased as whatever you feed into it, how does using AI to find bias in AI work when the first AI will be biased in and of itself?

O’Sullivan: There are different kinds of bias in AI, and there is good bias — the statistical bias — that means your model is predictive, and then there is discriminatory bias, which comes in when you are trying to predict the quality or the behavior of a human in the future or make a judgment based on such information. All of these different things rely on stereotyping. We worry the machine will find a stereotype inadvertently, cling to it, and make that connection over and over again.

But there are also plenty of kinds of AI with limited applications that do not involve inferring anything about a particular person. We do want those models to be biased in order to be predictive — we just do not want them to discriminate or stereotype people, so the models that we use are fully vetted for any kind of discriminatory bias. The simple act of models vetting other models limits exposure to discriminatory bias, because there are stereotypes for the model [being vetted] to latch onto. Those are the kinds of biases that we need to be looking out for. A lot of this work comes down to qualitative assessments.

You cannot ever cure a model of bias — nor is bias an “on-off” switch — but you can commit to a practice of continually reducing it. The only way that we can get AI to a “less than harmful” state is if we dedicate ourselves to reducing the disparities every time we see them.

HPR: In deciding what is “less than harmful,” how do you reconcile the fact that many of these decisions are being made by private companies and technical experts, rather than, for example, social scientists, or people whose lives are being impacted?

O’Sullivan: That is precisely what we are working to change. One of our company values is to reject the status quo. We blaze a trail which allows people to understand that taking the input of people who should have a say in this kind of thing is not economically impossible; ignoring it is just lazy. We know that regulation is coming. [In fact, i]t is not just coming, it is here — the AI Act is in deliberation right now. It [taking the input of others] requires risk and impact assessments and is not a very easy thing to do in every case, but it would benefit the business to seek out that kind of input.

Technical people — who have a hard time understanding legalese and risk — cannot continue to work in silos and just chase a particular metric. But there are plenty of people within an organization who are tasked with mitigating harm and minimizing risk, and they should be the ones translating these technical concepts into potential consequences.

Our platform has two sides — the people creating risk and people managing it. We want to facilitate productive conversations between those two teams [to do a better job mitigating harm]. Right now, there is a brick wall between models and production, where most companies are too scared to move forward with anything because they cannot predict — or think they cannot predict — how it is going to fail, but I think we do know enough now to predict many of the ways this technology will be harmful. With access to better resources and research from academia that has been around for years, we can move forward with some comfort and safety. The tricky part is figuring out which tools are actually going to offer that safety and which ones are simply just claiming to.

HPR: Do you think that with this technology, there will come a point where AI is less biased than humans? Why would AI necessarily become any better than humans at making decisions when humans can incorporate context in a way that AI cannot? 

O’Sullivan: Nobody is saying that human decisions are blanketly better or worse, since they are sometimes better and sometimes worse. The worry is when people make claims that AI is already less biased than humans. There are so many examples that counter that claim — for instance, there has been a lot of pushback on the COMPAS recidivism algorithm recently. [This algorithm’s predictions of individuals’ likelihoods to recommit crimes have been used in sentencing.] Some claim it is not that biased after all, or that maybe it is less biased than humans. To me, there is an absolute vacuum of research to prove either claim — people say AI is less biased or more biased, but nobody knows. I would love to see some quantitative data on that, but I think it would be something that’s really difficult to measure.

That being said, when we think about algorithms specifically within the realm of discriminatory bias and stereotyping, then we rely on the inputs to the model, which can have value judgments built into them. For instance, let’s say we have full access to COMPAS — the outcomes, the algorithms, the training data, everything. If we were to audit it and do a disparity analysis, we could probably get to a point where some metric makes it look pretty fair. But if we dig into the input questionnaire, some of these questions criminalize poverty, such as “have you ever had a struggle paying your bills?” or “have you ever shoplifted?” These reflect the fact that there is still this human element, which can have an outsized impact. It takes expertise, care, and rigor to figure out what those questions are. To some degree, I just know them when I see them. But there are also predictable failure modes that we can avoid, given all the research that has been introduced in the last few decades.
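
For readers curious what the “disparity analysis” O’Sullivan mentions might look like in practice, below is a minimal, hypothetical sketch in Python. The records, groups, and predictions are invented (this is not COMPAS data), and the false positive rate gap it computes is only one of many possible fairness metrics.

```python
# Hypothetical disparity-analysis sketch with invented data (not COMPAS):
# compare false positive rates of a risk model across two groups.
# A large gap means one group is flagged "high risk" more often among
# people who did not actually reoffend.
records = [
    # (group, predicted_high_risk, actually_reoffended)
    ("A", True,  False), ("A", False, False), ("A", True,  True),
    ("A", False, False), ("B", True,  False), ("B", True,  False),
    ("B", False, False), ("B", True,  True),
]

def false_positive_rate(group):
    # Among people in `group` who did not reoffend, what share were
    # predicted high risk?
    negatives = [r for r in records if r[0] == group and not r[2]]
    flagged = [r for r in negatives if r[1]]
    return len(flagged) / len(negatives)

for g in ("A", "B"):
    print(g, round(false_positive_rate(g), 2))  # A: 0.33, B: 0.67
```

As the interview notes, a model can look “fair” on one such metric even while the questionnaire feeding it encodes value judgments; the metric is a starting point for an audit, not its conclusion.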

HPR: Some of your writing has touched on fully autonomous AI, especially regarding decision-making about drone strikes or the types of lethal autonomous weapons which were being enabled at Clarifai. What would have to happen for you to be in favor of fully autonomous AI? Who would get to decide what is ethical, and whether or not the human needs to be involved in the operationalizing phase, whether for weapons specifically or in general? 

O’Sullivan: There are already plenty of systems that do not have human oversight; not every application of AI is a Terminator scenario. Programmatic ads are showing you a particular kind of advertisement outside of highly regulated categories, and there are plenty of low-risk use cases where perhaps the harm is fairly limited. For the high-risk use cases, you need at least the bare minimum of checking regularly to make sure that the world has not changed. If your algorithm is staying the same but everything else has evolved, then that [regular checking] is still a sort of human involvement. I believe that we as a species are primed to place too much weight on the importance or the impartiality of machines. So no, I do not believe that, for high-risk scenarios, we will ever get to a point where we can just let the AI run on its own. We need to validate that it is actually representing our values on a regular cadence and incorporate new knowledge of what we have learned and been able to generate through research and science.

If you are asking me about weapons, I do not believe that it is appropriate to have a fully autonomous killing machine that is making decisions about who gets to live and die. That, to me, is a dystopian scenario that we can easily avoid by simply not arming robots. That is common sense, right? Very few people push back against me about that; the vast majority of those who do are defense contractors. My argument is a really hard one to prove quantitatively. It says that fully autonomous, cheap drones will make it easier to go to war, and especially easier for authoritarian nations to oppress their own people and to carry out assassination attempts. If we are trying to measure the number of times that a machine hits a target, that metric maybe can improve over time. But that is not the entirety of the situation [when it comes to warfare]. Ultimately, these systems do not exist in a vacuum. They exist in society, and our society is democratic; that is why we need to consider many different viewpoints before making unilateral decisions about what and who we are.

Source: https://harvardpolitics.com/artificial-intelligences-past-present-and-future-an-interview-with-liz-osullivan/