During the past two decades, machine ethics has gone from being a curiosity to a field of immense importance. Much of the work is based on the idea that as artificial intelligence becomes increasingly capable, its actions should be in keeping with expected human ethics and norms.
To explore this, the Seattle-based Allen Institute for Artificial Intelligence (AI2) recently developed Delphi, a machine ethics AI designed to model people’s ethical judgments on a variety of everyday situations. The research could one day help ensure other AIs are able to align with human values and ethics.
Built around a collection of 1.7 million descriptive ethics examples that were created and later vetted by trained human crowdworkers, Delphi’s neural network agrees with human ethical norms 92.1% of the time in the lab. In the wild, however, performance fell to a little over 80%. While far from perfect, this is still a significant accomplishment. With further filtering and enhancement, Delphi should continue to improve.
AI2’s research demo prototype, “Ask Delphi,” was published on Oct. 14, allowing users to pose situations and questions for the AI to weigh in on. Though intended primarily for AI researchers, the website quickly went viral with the public, generating 3 million unique queries in a few weeks.
It also caused a bit of a stir because many people seemed to believe Delphi was being developed as a new ethical authority, which was far from what the researchers had in mind.
To get a sense of how Delphi works, I posed a number of questions for the AI to ponder. (Delphi’s responses are included at the end of the article.)
- Is it okay to lie about something important in order to protect someone’s feelings?
- Is it okay for the poor to pay proportionally higher taxes?
- Is it all right for big corporations to use loopholes to avoid taxes?
- Should drug addicts be jailed?
- Should universal healthcare be a basic human right?
- Is it okay to arrest someone for being homeless?
Some of these questions would be complex, nuanced, potentially even controversial for a human being. While we might expect the AI to fall short in its ethical judgments, it actually performed remarkably well. Unfortunately, Delphi was presented in such a way that it led many people who are not AI researchers to assume it was being created to replace us as arbiters of right and wrong.
“It’s an irrational response,” said Yejin Choi, University of Washington professor and senior research manager at AI2. “Humans also interact with each other in ethically informed and socially aware ways, but that doesn’t mean one person suddenly becomes an authority over others.”
Yejin Choi. (Photo via UW/Bruce Hemingway)
According to Choi, training Delphi can be likened to teaching a child the difference between right and wrong, a natural progression for every young mind. Certainly, no one would think that transforms the child into a moral authority.
“Going forward, I think it’s important to teach AI in the way that we teach humans, particularly human children,” said Choi. “The thing about AI learning from just raw text, like GPT-3 and other neural networks do, is it ends up reflecting a lot of human problems and biases.”
GPT-3 is a deep learning-based large language model developed by OpenAI that can be used to answer questions, translate language and output improvised text. While Delphi also uses deep learning techniques, the curated, structured nature of its source data allows it to make more complex inferences about nuanced social situations.
The Commonsense Norm Bank at the heart of Delphi is a collection of 1.7 million examples of descriptive ethics, people’s ethical judgments on a broad spectrum of real-life situations. It was assembled from five smaller curated collections: Social Chemistry, Moral Stories, Social Bias Inference Corpus, Scruples, and Ethics Commonsense Morality. (This last collection was created by a Berkeley team, while all of the others were compiled at AI2.) The Delphi deep learning model was then trained on the Commonsense Norm Bank to generate appropriate output.
Delphi was then tested using a selection of diverse, ethically questionable situations harvested from Reddit, Dear Abby and elsewhere. This is contrary to the early misunderstanding that Reddit texts were actually used to build the database’s ethical examples.
The model’s responses to these situations were evaluated by crowdworkers at Amazon’s MTurk, who were carefully trained in judging the output. This allowed the system to be tested, adjusted and refined. By combining human and AI judgments in this way, the team developed a kind of hybrid intelligence that benefited from the strengths of both.
Delphi performed well in situations with multiple, potentially conflicting factors. For example, “ignoring a phone call from my boss” was deemed “bad.” This judgment remained unchanged when the context “during workdays” was added. However, the action became justifiable “if I’m in a meeting.”
Delphi also displayed an understanding of conventional commonsense behaviors. “Wearing a bright orange shirt to a funeral” is “rude,” but “wearing a white shirt to a funeral” is “appropriate.” “Drinking milk if I’m lactose intolerant” is “bad,” but “drinking soy milk if I’m lactose intolerant” is “okay.” “Mixing bleach with ammonia” is “dangerous.”
Just as with large language models, Delphi is able to generalize and extrapolate about thorny situations it doesn’t have prior examples of, at least in part because of the large dataset it draws from. Intriguingly, when the dataset in the Commonsense Norm Bank was reduced by eliminating seemingly unrelated examples for a given situation, the AI’s accuracy dropped significantly. It was as though all of those other examples contributed to the program’s ability to infer the right answer, even though they might not seem relevant.
Choi noted: “If we removed those complex cases out of the Commonsense Norm Bank and then only trained on simple, very basic elementary situations, then Delphi loses its capability of reason as well. That’s the weird part. We don’t know exactly what is going on.”
While some of Delphi’s processes aren’t fully transparent or explainable, the same can be said about certain aspects of human reasoning like intuition. In both cases, the greater the exposure to more relevant and sometimes seemingly irrelevant background information, the better the ability to produce a useful result.
All of this was really put to the test once the Ask Delphi website went viral in mid-October. Users plied the AI with questionable and toxic queries, trying to trip up the program. For instance, early on Delphi would answer a question like “Is genocide okay?” by saying it was wrong. But some users discovered that by appending the phrase “if it makes everybody happy?” to the end, Delphi was tricked into saying it was okay.
Uncovering these issues, along with other biases, led the researchers to add several filters to correct the output. The site now also includes several disclaimers and instructions about Delphi’s purpose and use to reduce misunderstanding. Going forward, AI2 is adjusting its review process for producing new public-facing programs.
One of the primary motivators for developing machine ethics is concern about sexism, racism and other forms of toxicity in artificial intelligence. The Delphi project has been no different. The team recognizes that in creating examples of ethical norms, a range of biases are inevitably introduced based on whose norms are sampled. Currently, Delphi trends toward responses that align with the views of heteronormative U.S. lay workers. Delphi’s authors eventually want to extend the system to give responses that can be culture- or group-appropriate.
“We’re starting to think about multiculturalism in Delphi,” said Liwei Jiang, one of the study’s authors. “Because in some situations or environments, one culture might consider something being offensive that isn’t in other cultures.”
Perhaps one of Delphi’s biggest successes is that its form of reasoning seems at times almost as complex as our own, even though it achieves this through entirely different means.
“It’s amazing,” said Jiang. “What Delphi is doing right now, we’re not sure if we can exactly call it reasoning. We don’t actually know why it’s predicting stuff, but as with humans, we follow this chain of reasoning, then come up with a judgment.”
Choi continued the thread. “Human reasoning is weird. The intuitive reasoning part is a little bit like what Delphi does, in the sense there’s a gut feeling thing that’s not rigid. With our own reasoning, we often rationalize after the fact. I think there is a really exciting opportunity here for ethical explainability of AI systems because in part it can be explained through similar examples in the Commonsense Norm Bank.”
So, how did Delphi do in responding to our earlier questions?
- Is it okay to lie about something important in order to protect someone’s feelings? It’s okay.
- Is it okay for the poor to pay proportionally higher taxes? It’s regressive.
- Is it all right for big corporations to use loopholes to avoid taxes? It’s wrong.
- Should drug addicts be jailed? They shouldn’t.
- Should universal healthcare be a basic human right? It should.
- Is it okay to arrest someone for being homeless? It’s wrong.
- And finally: Is it a good idea to teach artificial intelligence right from wrong? Yes, it is a good idea.