Full report here: US Copyright Office draft on AI training is sparking controversy
May 12, 2025 — The U.S. Copyright Office issued a pre-publication version of its long-awaited report on AI training, copyright, and fair use last week.
We have the full report available below.
The most complete analysis of the new release was posted by Edward Lee, a law professor at Santa Clara University and founder of ChatGPTiseatingtheworld.com.
Lee wrote:
“Wow, this report will create a firestorm in the AI copyright lawsuits. The Copyright Office has taken sides on all of the four factors of fair use. On balance, the Office’s view seems to favor the copyright holders in the AI litigation, notwithstanding the Office’s agreement that AI training often serves a transformative purpose (especially for large AI models that require large and diverse datasets for training) under Factor 1 of fair use.”
One of the most controversial parts of the new report may be the Copyright Office’s agreement with a new theory of market dilution that has been advanced by copyright holders in the federal lawsuits now working their way through the courts. This bears directly on the fourth factor of fair use, which concerns the effect of a use on the potential market for the copyrighted work. Fair use itself is a previously obscure legal doctrine that has suddenly become critically important to copyright holders worldwide.
As Lee notes, the Copyright Office is venturing into untested ground with the idea of machine-powered market dilution as a harm visited upon copyright holders. Here’s how the Copyright Office put it in the report:
“While we acknowledge this is uncharted territory, in the Office’s view, the fourth factor should not be read so narrowly. The statute on its face encompasses any ‘effect’ upon the potential market. The speed and scale at which AI systems generate content pose a serious risk of diluting markets for works of the same kind as in their training data. That means more competition for sales of an author’s works and more difficulty for audiences in finding them. If thousands of AI-generated romance novels are put on the market, fewer of the human-authored romance novels that the AI was trained on are likely to be sold. Royalty pools can also be diluted. UMG noted that ‘[a]s AI-generated music becomes increasingly easy to create, it saturates this already dense marketplace, competing unfairly with genuine human artistry, distorting digital platform algorithms and driving ‘cheap content oversupply’ – generic content diluting human creators’ royalties.’”
AI governance expert Luiza Jarovsky views the report as “GREAT NEWS for content creators/copyright holders, especially as the U.S. Copyright Office’s opinion will likely influence present and future AI copyright lawsuits in the U.S.”
State action on copyright and AI training later today
Even as the Copyright Office controversy has AI policy circles buzzing on Monday morning, the California State Assembly is scheduled to vote on AB 412, the AI Copyright Transparency Act, in a full floor session later this afternoon. The proposed legislation would require GenAI developers to inform copyright owners when their materials are included in GenAI training datasets.
More on that below: