Anthropic CEO Dario Amodei gestures while addressing the audience during a session on AI at the World Economic Forum (WEF) annual meeting in Davos on January 23, 2025. FABRICE COFFRINI / AFP

U.S. judge backs using copyrighted books to train AI

Tremendous amounts of data are needed to train large language models powering generative AI.

Agence France-Presse

SAN FRANCISCO, United States (AFP) — A US federal judge has sided with Anthropic over its training of artificial intelligence (AI) models on copyrighted books without authors’ permission, a decision that could set a major legal precedent in AI deployment.

District Court Judge William Alsup ruled on Monday that the company’s training of its Claude AI models on books, whether bought or pirated, was allowed under the “fair use” doctrine in the US Copyright Act.

“Use of the books at issue to train Claude and its precursors was exceedingly transformative and was a fair use,” Alsup wrote in his decision.

“The technology at issue was among the most transformative many of us will see in our lifetimes,” Alsup added in his 32-page decision, comparing AI training to how humans learn by reading books.

Musicians, book authors, visual artists and news publications have sued various AI companies that used their data without permission or payment.

AI companies generally defend their practices by claiming fair use, arguing that training AI on large datasets fundamentally transforms the original content and is necessary for innovation.

“We are pleased that the court recognized that using ‘works to train LLMs was transformative,’” an Anthropic spokesperson said in response to an AFP query.

The judge’s decision is “consistent with copyright’s purpose in enabling creativity and fostering scientific progress,” the spokesperson added.