A recent decision from a federal district court in New York provides important guidance on the liability of AI platforms (and of their users, including you) for third-party copyrighted content that appears in the outputs these platforms generate. This decision should inform every business’s use of AI, particularly generative AI, especially in creating marketing materials and other content.

This post summarizes the decision and concludes with important takeaways and action items for every business using or planning to use AI to create content.

The New York District Court Decision

In The New York Times v. Microsoft Corporation, a New York federal district court found that OpenAI (in which Microsoft is a major investor) could be liable for “contributory” infringement as a result of third-party generation of “outputs” from OpenAI’s platform that allegedly infringed copyrighted content owned by The New York Times (the “Times”). OpenAI’s platform is built on a “large language model” (“LLM”) which, as the court noted in its decision, “can receive text prompts as inputs by users and generate natural language responses as outputs, which result from the LLM’s prediction of the most likely string of text to follow the inputted string of text based on its training on billions of written works.”

The Daily News, The Center for Investigative Reporting, and the Times (“Plaintiffs”) argued that when OpenAI users input prompts into the platform, they generate text that is substantially similar to (and therefore infringes) the Plaintiffs’ copyrighted material. That would make those users potentially liable for “direct” infringement. Plaintiffs also claimed that OpenAI could be held liable for “contributory” infringement because it allegedly “materially contributed to and directly assisted with the direct infringement by [its] end users” by building its AI model and training it on copyrighted content owned by the Plaintiffs; deciding, through specific training techniques, what content its models output; and developing AI models capable of distributing the copyrighted content to end users without the permission of the Plaintiffs who owned the copyrights.

The defendants, who comprise Microsoft and multiple OpenAI entities, claimed that:

  1. there was no direct infringement by users (a predicate to contributory infringement); and
  2. defendants did not contributorily infringe because they did not know of third-party infringement (by OpenAI users).

Acknowledging a split among the circuit courts, the Court said actual knowledge was not necessary to find OpenAI contributorily liable for its users’ copyright infringement. Instead, it determined that in the Second Circuit, where it sits, the standard is whether the defendant investigated or would have had reason to investigate the infringement. It then found that defendants might be found to have knowledge based on “widely publicized” instances of copyright infringement following the release of LLM-based products including ChatGPT, Browse with Bing, and Bing Chat. Additionally, Plaintiffs provided multiple examples of infringing outputs in their Complaint. The Court therefore found that the fact-finding portion of the case could reveal additional instances of third-party infringement.

Accordingly, the Court concluded that Plaintiffs had sufficiently alleged third-party infringement. The Court next found defendants could be found to have had “constructive, if not actual, knowledge” of this end-user infringement. In addition to the widely publicized infringements, the Court looked to statements made by OpenAI representatives about internal company disagreements regarding copyright issues. The Times also informed defendants that “their [defendants’] tools infringed its copyrighted works.” Accordingly, defendants “at a minimum had reason to investigate and uncover end-user infringement.” Finally, the Court found that the defendants’ LLMs could be found to have facilitated the third-party infringement, and that the LLMs’ capacity for substantial non-infringing uses did not relieve defendants of liability.

Important Takeaways and Action Items

In contrast to contributory infringement, where actual or constructive knowledge is necessary, direct infringement requires no knowledge: a user who generates infringing AI outputs need not be aware of the copyright status of third-party content or even that an AI output has copied copyrighted content. Because businesses risk inadvertently committing copyright infringement when they generate AI outputs to advertise or promote their goods and services, we recommend engaging counsel to take, at a minimum, the following actions:

Lawsuits such as The New York Times v. Microsoft Corporation bring to the forefront the need to deploy AI thoughtfully. Before producing any materials with generative AI, it is more important than ever to consult an intellectual property attorney fluent in the legal implications of AI to help ensure that your content does not inadvertently infringe.

¹The Court did not reach the question of whether the training of the AI platform was an unlawful reproduction under the copyright law.
