GitHub’s Copilot program strips the license information from code written by thousands of software developers, perhaps millions, resulting in what is effectively software piracy on a massive scale. That allegation is at the center of an important class action lawsuit filed by the Joseph Saveri Law Firm and programmer Matthew Butterick in federal court in San Francisco. Along with GitHub, the plaintiffs are also suing its parent company, Microsoft, and its AI partner, OpenAI.
Copilot Displays Copyrighted Code with No Attribution
Copilot, which was trained on GitHub’s public repositories, generates lines of code from a simple prompt. It is popular among coders, who can use it to avoid writing endless boilerplate code. However, open-source enthusiasts have raised concerns since its launch in June 2021. The product appears to auto-complete with large chunks of code reproduced as written, which may violate what are known as copyleft licenses. These licenses let software be modified and distributed, but only if those rights are preserved.
In 2021, GitHub’s CEO at the time, Nat Friedman, maintained that copyright law considers it “fair use” to train machine learning (ML) systems on public data and that the code output is the property of the operator. He anticipated that there would be an “interesting policy discussion” around the use of intellectual property (IP) and artificial intelligence (AI).
The complaint alleges that the software strips the attribution, license terms and copyright notices from the code, in violation of the licenses and the authors’ rights. Copilot then presents the now-anonymous code to its users as if the application had created it.
The reproduced code can be traced back to open-source licenses or repositories. However, Copilot never attributes the code to its original authors.
The plaintiffs are requesting a jury trial and seek damages and injunctive relief for what they describe as the companies’ unlawful practices.
Copilot Lawsuit Raises Legal Questions
This Copilot class action suit raises several legal questions about the companies’ actions. They include alleged DMCA violations, breach of contract, unfair competition and privacy violations. One consideration is the EU’s Copyright Directive of 2019, which permits text and data mining, even for commercial use. It requires rights owners to opt out if they do not want their copyrighted works scraped for data mining.
Some lawyers believe that reading and copying code within an AI process is permitted by open-source licenses and that AI-generated output is not covered by copyright law.