Scientific R&D organizations often have a hard time wringing out the full use of their data. Some data spill out in instrument readouts, legacy platforms, electronic laboratory notebooks, or even on paper. “In fact, we had one company we worked with where we digitized information for them, and they found, I think it was, 50–75% of the content had been duplicated,” says Jennifer Sexton, director of custom services for CAS, a division of the American Chemical Society specializing in scientific knowledge management. “So they had actually redone the experiments because they didn’t know they had done them.” Other companies had “storage sheds full of these lab notebooks,” she says.

And the data aren’t always neatly linked. As a curator of chemical knowledge, CAS recognized this need for cohesion and the consequence of going without it: a lack of artificial intelligence readiness. Fragmented data ingested by machine learning tools yield inaccurate output and wasted investment. To provide chemical and pharmaceutical organizations with a platform for data harmonization, CAS built the CAS Intelligence Hub, which launched in January.

The cloud-based Hub prepares companies’ proprietary data for AI ingestion. It can also combine company data with CAS reference data to enhance model accuracy. Essentially, CAS applies some of its own information-structuring tools to customer data, “combining our decades of curation expertise with secure infrastructure that accelerates discovery and enables confident AI adoption,” CAS president Manuel Guzman said in a press release.

“Content management and data management are something we’ve been doing for a very long time. You could argue it’s been 117 years from the very beginning of doing this,” says Bryan Harkleroad, director of solution development at CAS.

Humans are still involved in the curation process, Sexton says. “We have always found that it is the scientists, like . . . readers of [Chemical & Engineering] News, we are all scientists, we understand our science the best,” she says. “So as our models digitize information, there’s always that human scientist in the loop.”

While the Hub is intended to speed innovation, it may have additional effects. One use case is training AI models to do chemistry, Harkleroad says, and that may transform human jobs. The platform is also intended to reduce AI hallucinations, but scientists would still need to confirm the accuracy of any output. And even though AI tools can reduce inefficiencies in some chemical processes, the technology makes hefty demands on energy and water resources.

When asked what excites him about the new Hub, Harkleroad recalled the frustrations of being unable to find an experiment he knew had been run when he worked as a chemist for a pharmaceutical company. “I think being able to standardize companies’ data, so they get the value that they have put into it. The amount of money that’s spent in this area, just the idea of them not being able to utilize it is somewhat criminal,” he says.

Sexton appreciates that CAS’s offerings can empower scientists to invest themselves in more high-value work. “I hope that we’re taking some of the paperwork, some of the busywork off of them so that our scientists are able to go and solve more-complex problems in their brains. And I think that’s what people love about being a scientist is that ability to create and to solve and to think. And that’s what, ultimately, I hope we’re able to enable,” she says.

Harkleroad tells C&EN that future plans for the Hub might involve building it out as a platform that customers can use along with other CAS tools to perform the digitization themselves. “So, as we go on this journey, our customers are going to come with us,” he says.

As for the future of CAS-data-enriched AI models, Sexton imagines “the power of what you could do if you connected the world’s scientific literature with your own internal collections. What kind of connections could you uncover there? How could you accelerate that science? It becomes very exciting.”

“Once you have a base of harmonized data, it doesn’t matter how fast things change,” says Harkleroad. Those data can be reused or applied in new ways for new technologies that come along.

Sydney Smith

Chemical & Engineering News

ISSN 0009-2347

Copyright © 2026 American Chemical Society