
According to two significant back-to-back rulings from the Northern District of California, using copyrighted books to train large language models (LLMs) qualifies as fair use under the Copyright Act. The opinions differ, however, on how much it matters to the fair use analysis whether the copyrighted works were legally obtained or pirated.
In Bartz v. Anthropic PBC, AI software developer Anthropic used millions of copyrighted books to train Claude, its core LLM, which can recognize and generate text. Anthropic digitized some of the books in its library by legally purchasing print copies, tearing off the bindings, scanning each book in its entirety, and saving the scans in a text-searchable digital format. Millions of other books, however, Anthropic downloaded for free in digital form from piracy websites. The plaintiffs, authors of some of the books, sued Anthropic for copyright infringement. Anthropic asserted a fair use defense under Section 107 of the Copyright Act and moved for summary judgment.
The court granted Anthropic’s motion as to the legally purchased and scanned books but denied it as to the books downloaded from piracy sites. Judge Alsup found that “the purpose and character of using copyrighted works to train LLMs to generate new text was quintessentially transformative.” Analogizing the AI training to a “reader aspiring to be a writer,” the court found that “Anthropic’s LLMs trained upon works not to race ahead and replicate or supplant them — but to turn a hard corner and create something different.”
For the legally purchased copyrighted books, the court found that the digitization of the books for use in training the LLM was also a fair use “because all Anthropic did was replace the print copies it had purchased for its central library with more convenient space-saving and searchable digital copies for its central library — without adding new copies, creating new works, or redistributing existing copies.”
“[B]ut these do not excuse the pirated library copies.” The court drew the line at using pirated copies of the books mass downloaded by Anthropic and used to train the LLM, explaining that “Anthropic had no entitlement to use pirated copies for its central library. Creating a permanent, general-purpose library was not itself a fair use excusing Anthropic’s piracy.” Rejecting Anthropic’s argument “that because some of the works it copied were sometimes used in training LLMs, Anthropic was entitled to take for free all the works in the world and keep them forever with no further accounting,” Judge Alsup made clear, “[t]here is no carveout, however, from the Copyright Act for AI companies.” Unlike the transformative use for the books legally purchased, digitized, and used to train the LLMs, the court ruled that “[p]irating copies to build a research library without paying for it, and to retain copies should they prove useful for one thing or another, was its own use — and not a transformative one.”
The case will now head toward trial on the claims arising from the pirated copies of the books and the resulting damages (including potentially for willfulness). The court also left open the door for claims and damages arising from “any other copies [of the books] flowing from library copies for uses other than for training LLMs.”
In a similar case from the same federal district, Kadrey v. Meta Platforms, Inc., U.S. District Court Judge Vince Chhabria denied the plaintiffs’ motion for partial summary judgment and granted the cross-motion for partial summary judgment filed by Meta Platforms, Inc. (“Meta”). The decision came only two days after the Bartz opinion and addressed the same issue: the use of copyrighted books to train LLMs. In Kadrey, thirteen authors sued Meta for downloading their books from online “shadow libraries” (essentially pirate sites) and using them to train its Llama LLMs without permission. Judge Chhabria ruled for Meta, finding fair use under the Copyright Act.
Like Bartz, the Kadrey court found the use of copyrighted books to train LLMs to be highly transformative. In particular, Judge Chhabria emphasized that Llama generates new text and performs diverse functions, including drafting emails and translating languages, which are distinct from the underlying books’ original purpose of being read for entertainment or education. The two courts parted ways on piracy. In Bartz, Judge Alsup distinguished between legally obtained and pirated books, denying fair use for the latter when used to create a general-purpose library to train AI. Judge Chhabria, by contrast, viewed Meta’s downloading from shadow libraries as part of the overall transformative purpose of training Llama, and not automatically dispositive of a fair use claim. Still, he characterized the piracy claims as relevant, finding piracy to be potentially indicative of bad faith and of willful, criminal, and/or contributory infringement, depending on the facts. Ultimately, Judge Chhabria reasoned that because the motive for the use of the books (training AI) was transformative, the intermediate act of downloading was also fair use, despite its illicit source.
Additionally, the two courts diverged on market dilution from AI-generated works. Bartz dismissed the concern, comparing AI training to teaching children to write. The Kadrey court recognized that such dilution could significantly harm the market for copyrighted works but found that the plaintiffs failed to provide sufficient evidence of this effect in their case. Judge Chhabria specifically criticized Judge Alsup’s analogy likening AI training to “training schoolchildren to write well,” arguing that the comparison downplayed the significant market harm AI could cause by enabling a single individual to generate countless competing works with minimal time and creativity. Judge Chhabria emphasized that market harm is “the most important factor in the fair use analysis,” suggesting that Bartz undervalued this factor in favor of transformativeness. Thus, both rulings affirm that training AI with copyrighted books can be transformative, but they diverge on piracy and on the weight given to market impact.
These decisions are among the first to apply the fair use doctrine to LLM training libraries at summary judgment. Judge Alsup’s detailed reasoning and legal analysis of the fair use factors offer much-needed guidance on the fair use doctrine’s application to LLM training for AI platforms, and his ruling provides clear judicial endorsement of the legality of training AI on legally obtained copyrighted material. Judge Chhabria’s decision likewise affirms the transformative nature of LLM training with copyrighted works, but it emphasizes that market harm from AI-generated works is a critical factor that courts must not undervalue in the fair use analysis, and it finds piracy not dispositive of fair use. The Bartz and Kadrey decisions illustrate that while the transformative nature of LLM training is commonly accepted, courts may differ on how much weight transformativeness carries in determining fair use, on the significance of how copyrighted materials were obtained, and on the assessment of market impact.
The expression “work smarter, not harder” has become a cliché, but it may apply here: these rulings give LLM developers some license to build and use large libraries to train their AI tools, so long as developers are careful not to conflate “smarter” with cutting corners and infringing copyrights. For Judge Alsup, training smarter includes obtaining training materials lawfully; for Judge Chhabria, the source of the materials appears to be relevant but not dispositive. Combined, these cases provide critical insight for AI developers and their counsel in navigating copyright law.
AALRR has a dedicated group of attorneys on its Intellectual Property Team with the experience and expertise to protect and vigorously enforce your copyrights, defend you against claims of copyright infringement, and help you navigate the legal minefield at the intersection of AI and IP. If you have questions about copyright law or the intersection of AI and intellectual property, contact the authors or another member of the AALRR Intellectual Property Team.
This AALRR post is intended for informational purposes only and should not be relied upon in reaching a conclusion in a particular area of law. Applicability of the legal principles discussed may differ substantially in individual situations. Receipt of this or any other AALRR publication does not create an attorney-client relationship. The Firm is not responsible for inadvertent errors that may occur in the publishing process.
© 2025 Atkinson, Andelson, Loya, Ruud & Romo
- Partner
Brian Wheeler is a member of the firm’s Executive Committee and Chair of the firm’s Commercial and Complex Litigation Practice Group. He also leads the firm’s Intellectual Property and Data Privacy practices within the ...
- Associate
Jon Ustundag is a member of the firm’s Commercial and Complex Litigation Practice Group and Intellectual Property Team. Mr. Ustundag serves clients in commercial disputes and all facets of intellectual property law, including ...