Navigating Ambiguity: Companies Remain Silent on Compliance with California’s AI Training Transparency Regulations

California's AI Training Transparency Law

California’s New AI Law: AB-2013 Signed into Law

Details: On Sunday, California Governor Gavin Newsom signed AB-2013, which requires companies developing generative AI systems to publish high-level summaries of the data used to train them. These summaries must cover who owns the data, how it was procured, and whether it includes copyrighted or personal information.

AI Companies Remain Silent on Compliance

Details: Many AI companies are reluctant to say whether they will comply with the new law. TechCrunch reached out to major players in the industry, but fewer than half responded, and Microsoft explicitly declined to comment.

Limited Responses: Compliance Confirmed by Few

Details: Only Stability AI, Runway, and OpenAI indicated that they intend to comply with AB-2013. OpenAI said it complies with the law in the jurisdictions where it operates, while Stability said it supports regulation that protects the public without stifling innovation.

Gradual Implementation: Disclosure Requirements and Timelines

Details: The disclosure requirements of AB-2013 do not take effect immediately. They apply to systems released in or after January 2022, but companies have until January 2026 to publish their summaries. The law also applies only to systems made available to California residents, which gives vendors some room to maneuver.

Challenges in Training Data Disclosure

Details: Companies’ reluctance to discuss their training data likely stems from how most generative AI systems are built: on vast amounts of content scraped from the web. AI developers once routinely listed their training sources, but competitive pressure now discourages the practice.

Legal Risks: Copyright and Privacy Concerns

Details: Disclosing training data can expose developers to legal challenges, particularly because datasets often contain copyrighted material and personal information. Lawsuits over the alleged misuse of training data are becoming more common.

Emerging Lawsuits: The Legal Landscape for AI Companies

Details: Numerous lawsuits have been filed against major AI companies over alleged training data misuse. Claims include the unauthorized use of copyrighted books and music, as well as accusations of improper data-scraping practices.

Implications of AB-2013: Navigating Legal Challenges

Details: AB-2013 could complicate matters for vendors hoping to stay out of court, since it mandates public disclosure of sensitive training data details, the very information that could invite litigation. This may deter some companies from releasing that information at all.

Broad Scope: Requirements for AI System Modifications

Details: The law applies broadly, mandating that any entity that “substantially modifies” an AI system must disclose relevant training data. Exemptions exist mainly for systems used in cybersecurity and defense.

Fair Use Defense: A Legal Strategy for AI Companies

Details: Many companies are relying on the fair use doctrine as a legal defense. Some, like Meta and Google, have also updated their terms of service to let them train on more of their users’ data, seeking to reduce their legal exposure.

The Risk of Overstepping: Companies in Legal Grey Areas

Details: Under competitive pressure, some companies have aggressively trained on IP-protected data despite the legal risk. Reports suggest that Meta, Runway, and OpenAI have all used copyrighted materials without authorization.

Potential Outcomes: The Future of Generative AI Compliance

Details: Two outcomes seem possible: either courts side with fair use proponents, or vendors withhold certain models from California altogether rather than risk the legal exposure that public disclosure would create.

Looking Ahead: Compliance Deadline Approaches

Details: Assuming AB-2013 is not successfully challenged, the January 2026 compliance deadline will bring more clarity about the law’s impact on generative AI companies and the broader legal landscape.