TCAI Bill Guide: Washington HB 2503, AI training data transparency

Washington’s HB 2503 would require generative AI developers to post high-level documentation regarding the data used to train the AI system. (Illustration: Getty Images for Unsplash+.)

During the 2026 legislative session, TCAI will offer clear, plain-language guides to some of the most important AI-related bills introduced in state legislatures across the country.

Jan. 22, 2026 — Washington’s HB 2503 is an AI training data transparency bill that builds on the standard set by California’s Training Data Transparency Act enacted in 2024.

Washington legislators held their first full discussion of HB 2503 at the House Technology, Economic Development, and Veterans Committee hearing on Jan. 21, 2026. The video clip below is taken from that hearing.

The original full text of HB 2503 is here, and revised versions are available here.

Brief summary

HB 2503 requires the developer of a generative artificial intelligence system to post documentation regarding the data used to train the system.

Bill sponsors: Representatives Shavers, Duerr, Parshley, Kloba, Hill, Pollet, and Ramel.

HB 2503 overview

Before a generative AI system, or a substantial modification to such a system, is made publicly available to Washingtonians for use, the developer of the system must post documentation regarding the data used to train the system.

Developers must meet the documentation requirements in a manner consistent with the generally acknowledged state of the art, without compromising intellectual property rights or trade secrets.

High-level summary information: The required documentation must include a high-level summary of the datasets used in the development of the generative AI system (an illustrative sketch of such a summary follows this list), including:

  • the sources of the datasets;

  • a general description of how the datasets further the intended purpose of the system;

  • the number of data points included in the datasets, which may be in general ranges;

  • a high-level description of the types of data points within the datasets;

  • whether the datasets include any data protected by copyright, trademark, or patent, or whether the datasets are in the public domain;

  • whether the datasets were purchased or licensed by the developer;

  • whether the datasets include personal information or aggregate consumer information;

  • whether there were any modifications to the datasets by the developer;

  • the dates the datasets were first used in training, or the date of the last significant update to the datasets; and

  • whether the system used or continuously uses synthetic data generation in its development.
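
The bill does not prescribe a particular format for this documentation. Purely as an illustration of the elements listed above, here is a minimal sketch, in Python, of how a developer might publish such a high-level summary as structured data; every field name and value is a hypothetical assumption, not language drawn from the bill.

import json

# Hypothetical high-level training data summary of the kind HB 2503 describes.
# Field names and values are illustrative assumptions only; the bill does not
# define a schema or require machine-readable output.
training_data_summary = {
    "dataset_sources": ["licensed news archive", "public web crawl"],
    "purpose_description": "General-purpose text corpora for language modeling.",
    "approximate_data_points": "1-10 billion documents",  # general ranges are permitted
    "data_point_types": ["web pages", "news articles", "forum posts"],
    "includes_copyrighted_material": True,
    "includes_public_domain_material": True,
    "purchased_or_licensed": True,
    "includes_personal_information": False,
    "includes_aggregate_consumer_information": False,
    "modified_by_developer": True,  # e.g., deduplication and filtering
    "dates_first_used_in_training": "2025-03-01",
    "last_significant_dataset_update": "2025-11-15",
    "uses_synthetic_data_generation": False,
}

# Print the summary as JSON so it could be posted alongside the system's documentation.
print(json.dumps(training_data_summary, indent=2))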

Exclusions/exemptions: A developer is not required to post documentation for a generative AI system that:

  • has the sole purpose of helping to ensure security and integrity;

  • has the sole purpose of operating aircraft in the national airspace;

  • is developed for national security, military, or defense purposes and that is made available only to a federal entity; or

  • is subject to the federal Food, Drug, and Cosmetic Act.

Enforcement: A violation of the bill's requirements is deemed to constitute an unfair or deceptive act in trade or commerce for purposes of the Consumer Protection Act.

Appropriation: None.

Effective date: The bill takes effect 90 days after adjournment of the session in which it is passed. Developers must comply by January 1, 2027, for generative AI systems, or substantial modifications of such systems, released on or after January 1, 2022.

HB 2503 Sponsor testimony: Rep. Clyde Shavers

Rep. Clyde Shavers (D-Whidbey Island) sponsored the bill. He testified on behalf of the measure before the House Technology, Economic Development, and Veterans Committee on Jan. 21; the clip is below.

An excerpt from Rep. Shavers’ remarks:

“At its core, House Bill 2503 is about building trust with a practical transparency standard for training data.

So in other words, using the most common sense approach, we're talking about an ingredients label, what goes in, what goes out.

The core idea is if a developer makes a generative AI system publicly available to Washingtonians, people should know things like where the data comes from, what kind of data is included, how large is the dataset, does it include personal information.

This is really basic information that we see in pretty much anything that changes our lives, or tries to change what we perceive, what we hear, what we see.

This basic high level core information helps researchers understand bias and performance. It helps consumers understand risk. And it helps creators and businesses navigate these legitimate questions about licensed versus public domain content.”

Learn more: AI training data transparency
