AI Training Is Not Fair (According to One Court)
The concept of transformative use has emerged as a pivotal issue in fair-use analysis, particularly in cases that involve training data for artificial intelligence (AI). At its core, the transformative-use inquiry asks whether the new work repurposes original material to serve a markedly different function or market than that of the copyrighted work.
In this regard, the revised summary-judgment ruling handed down today in Thomson Reuters v. Ross Intelligence provides a significant touchstone: the U.S. District Court for the District of Delaware held that Ross Intelligence’s use of Westlaw headnotes to train its AI legal-research tool failed the transformation test. The court found that Ross had repackaged the headnotes in a manner that directly replicated Westlaw’s legal-research service, so that the AI tool operated in the same market as Westlaw, thereby undermining any claim to transformative use.
This perspective resonates with the 2nd U.S. Circuit Court of Appeals’ earlier precedent in American Geophysical Union v. Texaco, where the court rejected the argument that photocopying entire journal articles for research purposes constituted a transformative use. The Texaco decision made clear that mere duplication, even if it facilitates broader research or innovation, does not transform the original work if the material is consumed in its original, unaltered form. Indeed, that case may provide an appropriate analogy to AI training, in which systems are trained on protected works precisely for their expressive content. That is, AIs “learn” from the copies they make in much the same way that humans “learn” about expressive works by directly perceiving them.
As I will discuss below, the Ross case pursues a similar line of reasoning, though it relies on a different set of cases. Nonetheless, even with this potentially adverse precedent, a distinction may be drawn between Ross’s purpose-specific legal-research tool and general-purpose large language models (LLMs).