When John McCarthy, founder of the Stanford AI Lab, coined the term “artificial intelligence” in 1955 to describe “the science and engineering of making intelligent machines,” the defining feature of the technology was its smarts. But as AI becomes more sophisticated and widespread, it’s increasingly evident that intelligence can’t be the only priority – AI must be trustworthy and socially responsible too.

AI systems are trained on specific sets of data in order to learn how to behave in various scenarios. This means that any biases or errors in the training data can result in unreliable or unfair outcomes. Because AI lacks its own sense of morality or ability to fact-check its own work, humans need to take an active role in monitoring how AI systems are trained and used.
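
To make that concrete, here is a minimal sketch – not drawn from any Stanford system – of how a skewed training set can show up in evaluation: computing error rates separately for each demographic group exposes gaps that a single overall accuracy number would hide. The groups, labels, and model outputs below are invented for illustration.

```python
# Illustrative only: a toy check of how a skewed training set can surface as
# unequal error rates. Groups, labels, and predictions below are made up.

from collections import defaultdict

def error_rate_by_group(records):
    """records: iterable of (group, true_label, predicted_label) tuples."""
    errors, totals = defaultdict(int), defaultdict(int)
    for group, truth, prediction in records:
        totals[group] += 1
        if prediction != truth:
            errors[group] += 1
    return {group: errors[group] / totals[group] for group in totals}

# Hypothetical outputs from a model trained mostly on examples from group "A":
predictions = [
    ("A", 1, 1), ("A", 0, 0), ("A", 1, 1), ("A", 0, 0),  # well represented
    ("B", 1, 0), ("B", 0, 1), ("B", 1, 1), ("B", 0, 0),  # underrepresented
]

print(error_rate_by_group(predictions))  # e.g. {'A': 0.0, 'B': 0.5}
```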

“For many applications of AI, whether it is for aerospace, medicine, or financial systems, there are unlikely but possible ‘edge cases,’” said Mykel Kochenderfer, associate professor of aeronautics and astronautics in Stanford Engineering and director of the Stanford Intelligent Systems Laboratory, discussing scenarios where problems occur under extreme or unlikely conditions. “If these edge cases are not anticipated, the system could fail, and any one failure can be catastrophic. A lack of trust in these AI systems often stands in the way of their deployment.”

Stanford researchers want to go beyond identifying edge cases and fairness gaps in AI systems; their ultimate goal is to design AI that actively improves a project’s reliability, fairness, and trustworthiness beyond what could be achieved without it.

“AI’s capabilities aren’t magic, they’re measurable phenomena that we can study scientifically,” said Sanmi Koyejo, assistant professor of computer science in Stanford Engineering, who leads the Stanford Trustworthy AI Research Lab. “Understanding this helps us make more informed decisions about AI deployment rather than being swayed by hype or fear. Scientific measurement, not speculation, should guide how we integrate AI into society.”

As AI’s influence grows, many Stanford researchers are working to make it more fair, cautious, and secure. The work of three faculty members offers a window into these broader efforts to build AI systems people can trust.

AI and law

“When we purchased our home, I had to sign a document that said that the ‘property shall not be used or occupied by any person of African, Japanese or Chinese or any Mongolian descent,’” said Daniel E. Ho, the William Benjamin Scott and Luna M. Scott Professor of Law at Stanford Law School.

The experience underscored for him how deeply racism is embedded in legal infrastructure – something he’s now working to help dismantle through technology. Ho directs Stanford’s RegLab, which partners with government agencies to explore how AI can improve policy, services, and legal processes.

Recently, RegLab worked with Santa Clara County, which was implementing a state law that mandated all counties identify and redact racist property deeds.

“For Santa Clara County, this meant reviewing about 84 million pages of records dating back to the 1800s,” Ho said. “We developed an AI system to enable the county to spot, map, and redact deed records, saving some 86,500 person hours.”
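
The article doesn’t describe how RegLab’s system works internally. As a rough, hypothetical sketch of the general task, a pattern-matching pass like the one below can flag candidate covenant language for human review; the phrases and helper function are illustrative assumptions, not RegLab’s method, which would need far more sophisticated handling of scanned historical text.

```python
# Hypothetical sketch of the general task: flag racially restrictive covenant
# language in digitized deed text for human review. The patterns and helper
# below are illustrative only and are not RegLab's actual system.

import re

COVENANT_PATTERNS = [
    r"shall not be used or occupied by any person of .{0,80}descent",
    r"no persons? of .{0,40}(race|blood|descent)",
]

def flag_deed_page(text: str) -> list[str]:
    """Return matched passages so a human reviewer can confirm each one."""
    hits = []
    for pattern in COVENANT_PATTERNS:
        hits.extend(m.group(0) for m in re.finditer(pattern, text, re.IGNORECASE))
    return hits

page = ("...property shall not be used or occupied by any person of "
        "African, Japanese or Chinese or any Mongolian descent...")
print(flag_deed_page(page))  # the matched clause, queued for redaction
```

Surfacing candidates for human confirmation, rather than redacting automatically, is what keeps a reviewer in the loop.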

Procedural bloat – an issue highlighted by both Democrats and Republicans – is another problem RegLab addressed with AI.

“One of the odd areas of consensus right now is that government processes aren’t working terribly well,” said Ho. “We developed a Statutory Research Assistant (STARA) AI system that can identify obsolete requirements, such as reports that can needlessly consume tons of staff time.”

San Francisco City Attorney David Chiu introduced a resolution to cut over a third of these requirements based on the RegLab collaboration.

“Implemented responsibly, I think AI has tremendous potential to improve government programs and access to justice,” said Ho.

Fairness in medical AI

Koyejo’s lab has developed algorithmic methods that directly address bias in medical AI systems. When AI diagnostic tools are trained primarily on data from specific populations – such as chest X-ray datasets that underrepresent certain racial groups – they often fail to work accurately across diverse patient populations.

“My lab’s algorithms help ensure that AI systems diagnosing diseases from chest X-rays work equally well for patients of all racial backgrounds, preventing health care disparities,” said Koyejo. His team has also tackled the challenge of “unlearning,” developing techniques that allow AI systems to forget harmful training data or private medical information without compromising their overall performance.
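
The specific algorithms from Koyejo’s lab aren’t spelled out here, but the baseline that unlearning methods are measured against is easy to sketch: rebuild the model without the records to be forgotten. The toy nearest-centroid “model” and made-up data below only illustrate that goal; research techniques aim to approximate the same result without the cost of full retraining.

```python
# Toy illustration of the goal of "unlearning": the exact baseline is to
# rebuild the model without the records to be forgotten. The nearest-centroid
# "model" and all data below are made up; real methods approximate this
# result without retraining from scratch.

def train_centroids(samples):
    """Fit a nearest-centroid 'model': mean feature value per label."""
    sums, counts = {}, {}
    for label, value in samples:
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

training_set = [("healthy", 0.2), ("healthy", 0.3), ("disease", 0.8),
                ("disease", 0.9), ("disease", 0.7)]
forget_set = {("disease", 0.9)}  # e.g. a record a patient asked to remove

original_model = train_centroids(training_set)
retained = [sample for sample in training_set if sample not in forget_set]
unlearned_model = train_centroids(retained)  # exact-unlearning baseline

print(original_model)   # centroids reflect every record, including the one to forget
print(unlearned_model)  # centroids as if the forgotten record were never seen
```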

Beyond health care, Koyejo’s graduate student Sang Truong collaborated with researchers in Vietnam to fine-tune an open-source large language model specifically for Vietnamese speakers. This work exemplifies how AI fairness extends beyond bias correction to ensuring technological benefits reach underserved linguistic communities.

AI for safer systems

Although it lies outside medicine, Kochenderfer’s research is another area where AI failures could mean life or death.

Kochenderfer studies decision making under uncertain conditions – work that translates into systems that monitor and control air traffic, uncrewed aircraft, and automated cars. In these applications, the AI systems must account for efficiency of travel, complexity of movement (especially at high speeds), and limitations in the sensor technologies that gather real-time data about how these vehicles move and what’s around them.

“Ensuring the safety of AI systems that interact with the real world is harder than many people realize,” said Kochenderfer. “We need the system to be extremely safe, even when there is a broad spectrum of plausible behavior and noise in the sensor systems.”
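
As a toy illustration of decision making under uncertainty – not Kochenderfer’s actual methods – the sketch below updates a belief about a hazard from a noisy sensor reading with Bayes’ rule, then picks the action with the lowest expected cost. The probabilities and costs are invented; a real collision-avoidance system reasons over far richer models.

```python
# Toy, one-step version of decision making under uncertainty: update a belief
# about a hazard from a noisy sensor with Bayes' rule, then choose the action
# with the lowest expected cost. All numbers are invented for illustration.

def posterior_hazard(prior, sensor_alert, true_positive=0.9, false_positive=0.1):
    """P(hazard | sensor reading) for a sensor with known error rates."""
    p_reading_if_hazard = true_positive if sensor_alert else 1 - true_positive
    p_reading_if_clear = false_positive if sensor_alert else 1 - false_positive
    evidence = p_reading_if_hazard * prior + p_reading_if_clear * (1 - prior)
    return p_reading_if_hazard * prior / evidence

def choose_action(belief, cost_collision=1000.0, cost_maneuver=1.0):
    """Compare expected costs of staying the course vs. maneuvering."""
    expected = {"stay_course": belief * cost_collision, "climb": cost_maneuver}
    return min(expected, key=expected.get), expected

belief = posterior_hazard(prior=0.01, sensor_alert=True)
print(choose_action(belief))  # a cheap maneuver wins once the belief is high enough
```

Real systems extend this kind of reasoning over long sequences of decisions and much richer sensor models, which is part of what makes validating them so hard.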

His current and future projects include the book Algorithms for Validation (forthcoming, free download online) and the new course Validation of Safety Critical Systems, with free YouTube lectures by postdoctoral researcher Sydney Katz.

“I am very excited to understand to what extent language models can help monitor safety critical systems,” said Kochenderfer. “Language models seem to be able to encode a wealth of commonsense knowledge. If so, they could enhance safety when automated subsystems, such as those on aircraft, fail in unexpected ways.”

What is worthy

While these examples highlight applications in law, medicine, and engineering, any discipline that uses AI tools – creative arts, communications, security technologies, and education, to name a few – offers opportunities to create AI that not only functions better and works more efficiently but also serves society in a fair, reliable, and trustworthy way.

“These applications matter because AI failures don’t just produce poor results – they can cause real harm and systematically exclude entire populations from technological benefits,” said Koyejo.

For him, the path forward requires acknowledging AI’s limitations while working systematically to address them. “Perfect AI is neither possible nor the right goal,” said Koyejo. “Instead, we should aim for AI systems that are worthy of the trust placed in them by society.”