Anthropic, the Amazon-backed artificial intelligence startup, on Monday was hit with a class-action lawsuit in California federal court over alleged copyright infringement. Three authors said in the filing that Anthropic “built a multibillion-dollar business by stealing hundreds of thousands of copyrighted books,” including their own.
Anthropic, which was founded by ex-OpenAI research executives, has backers including Google and Salesforce.
Authors Andrea Bartz, Charles Graeber and Kirk Wallace Johnson alleged in the lawsuit that “an essential component of Anthropic’s business model — and its flagship ‘Claude’ family of large language models (or ‘LLMs’)— is the largescale theft of copyrighted works,” later alleging that “Anthropic downloaded known pirated versions of Plaintiffs’ works, made copies of them, and fed these pirated copies into its models.”
The lawsuit follows Anthropic’s June debut of its most powerful AI model yet, Claude 3.5 Sonnet. Claude is one of the chatbots that, like OpenAI’s ChatGPT and Google‘s Gemini, has exploded in popularity in the past year.
“Copyright law prohibits what Anthropic has done here: downloading and copying hundreds of thousands of copyrighted books taken from pirated and illegal websites,” the lawsuit states.
Anthropic did not immediately respond to a request for comment.
This week’s case also follows another lawsuit brought against Anthropic last October, in which Universal Music sued the startup over “systematic and widespread infringement of their copyrighted song lyrics,” per a filing in a Tennessee federal court. Other music publishers, such as Concord and ABKCO, were also named as plaintiffs.
One example from Universal Music’s lawsuit: When a user asked Anthropic’s AI chatbot Claude about the lyrics to the song “Roar” by Katy Perry, it generated an “almost identical copy of those lyrics,” violating the rights of Concord, the copyright owner, per the filing. The lawsuit also named Gloria Gaynor’s “I Will Survive” as an example of Anthropic’s alleged copyright infringement, as Universal owns the rights to its lyrics.
“In the process of building and operating AI models, Anthropic unlawfully copies and disseminates vast amounts of copyrighted works,” the lawsuit stated, later adding, “Just like the developers of other technologies that have come before, from the printing press to the copy machine to the web-crawler, AI companies must follow the law.”
With the news industry broadly struggling to maintain sufficient advertising and subscription revenue to pay for its costly newsgathering operations, many news publications and media outlets are aggressively trying to protect their businesses as AI-generated content becomes more prevalent.
The Center for Investigative Reporting, the country’s oldest nonprofit newsroom, sued OpenAI and lead backer Microsoft in federal court in June for alleged copyright infringement, following similar suits from publications including The New York Times, The Chicago Tribune and The New York Daily News.
In December, The New York Times filed a suit against Microsoft and OpenAI, alleging intellectual property violations related to its journalistic content appearing in ChatGPT training data. The Times said it seeks to hold Microsoft and OpenAI accountable for ”billions of dollars in statutory and actual damages” related to the “unlawful copying and use of the Times’s uniquely valuable works,” according to a filing in the U.S. District Court for the Southern District of New York. OpenAI disagreed with the Times’ characterization of events.
The Chicago Tribune, along with seven other newspapers, followed with a suit in April.
Outside of news, a group of prominent U.S. authors, including Jonathan Franzen, John Grisham, George R.R. Martin and Jodi Picoult, sued OpenAI last year, alleging copyright infringement in using their work to train ChatGPT.
But not all news organizations are gearing up for a fight, and some are instead joining forces with AI startups.
On Tuesday, OpenAI announced a partnership with Condé Nast, in which ChatGPT and SearchGPT will display content from Vogue, The New Yorker, Condé Nast Traveler, GQ, Architectural Digest, Vanity Fair, Wired, Bon Appétit and other outlets.
In July, Perplexity AI debuted a revenue-sharing model for publishers following more than a month of plagiarism accusations. Media outlets and content platforms including Fortune, Time, Entrepreneur, The Texas Tribune, Der Spiegel and WordPress.com were the first to join the company’s “Publishers Program.”
OpenAI and Time magazine announced a “multi-year content deal” in June that will allow OpenAI to access current and archived articles from more than 100 years of Time’s history. OpenAI will be able to display Time’s content within its ChatGPT chatbot in response to user questions, according to a press release, and to use Time’s content “to enhance its products,” or, likely, to train its AI models.
OpenAI announced a similar partnership in May with News Corp., allowing OpenAI to access current and archived articles from The Wall Street Journal, MarketWatch, Barron’s, the New York Post and other publications. Reddit also announced in May that it will partner with OpenAI, allowing the company to train its AI models on Reddit content.