Meta wants the courts to declare that piracy is legal, as long as the pirates are training AI.
In a legal filing that’s sent shockwaves through the creative community, Meta (Facebook’s parent company) has argued that downloading copyrighted books via BitTorrent qualifies as fair use under U.S. copyright law. Their reasoning? The books were used to train AI language models, which constitutes “transformative” use.
If Meta wins this argument, it could reshape not just AI development, but the entire foundation of digital copyright, and your rights as a creator.
What Meta Is Actually Arguing
In ongoing copyright litigation, Meta faces claims that it illegally used copyrighted books to train its AI models. Rather than denying they used the content, Meta’s legal team is taking a bold stance:
Yes, we used pirated books. Yes, we obtained them via BitTorrent. And yes, that’s perfectly legal.
Their argument hinges on the four-factor fair use test:
- Purpose and character of use: Meta claims AI training is “transformative”: the books aren’t reproduced as books, but digested into statistical patterns
- Nature of the copyrighted work: Creative works normally get more protection, but Meta argues the use is so different it doesn’t matter
- Amount used: They used entire books, but claim this was necessary for AI training
- Market effect: Meta argues AI models don’t compete with the books themselves
This is an audacious legal strategy. They’re not disputing the piracy; they’re arguing the piracy doesn’t matter.
The “Transformative Use” Stretch
Fair use law allows certain uses of copyrighted material without permission. The most protected uses are “transformative”: they don’t just copy the work, they create something new.
Classic examples:
- Parody (a comedian mocking a song)
- Commentary (a reviewer quoting a book)
- Scholarship (an academic analyzing a text)
Meta is stretching this concept beyond recognition. Their argument essentially says:
“We didn’t copy the books to read them. We copied them to feed them into a statistical model that learned patterns. The model doesn’t contain the books; it contains patterns derived from the books. Therefore, our use is transformative.”
If this logic holds, any computational processing of copyrighted material becomes fair use. Download a movie to analyze its color palette? Fair use. Pirate music to train a recommendation algorithm? Fair use. Scrape an entire news website to build a summarization tool? Fair use.
Why BitTorrent Makes This Worse
Meta didn’t just use copyrighted books. They obtained them via BitTorrent, a peer-to-peer file-sharing protocol widely used for piracy. This adds layers of legal and ethical problems:
BitTorrent Requires Re-Distribution
When you download via BitTorrent, your client by default also uploads pieces of the file to other users. Unless that behavior was switched off, Meta wasn’t just downloading; they were also distributing pirated content to other users.
The Source Was Obviously Illegal
Shadow libraries like LibGen and Z-Library, whose collections circulate via BitTorrent, are notorious piracy repositories. Meta can’t claim innocent mistake. They went to the digital equivalent of a bootleg DVD market and loaded up trucks.
It Demonstrates Intent
Going to BitTorrent for training data, rather than licensing content legally, shows a deliberate choice to use pirated material when legal alternatives existed.
What This Means for Authors and Creators
If Meta’s argument succeeds, the implications are grim:
Your Work Can Be Pirated for AI Training
Any book, article, image, song, or video can be copied without permission as long as it’s being fed into an AI model. The “transformative” magic wand makes piracy legal.
No Compensation, No Consent
Unlike licensing agreements, fair use doesn’t require payment or permission. Authors whose books train AI models get nothing: not a cent, not a credit, not even notification.
Impossible to Opt Out
How do you prevent your work from being used if piracy is legal? You can’t. Once your book exists in digital form, it can be scraped, copied, and processed without recourse.
The Value of Creation Diminishes
Why pay for content when you can train AI on pirated copies? The economic incentive for creating original work erodes when AI companies can legally use everything for free.
The Privacy Angle: Your Data, Their Model
This case has profound privacy implications beyond copyright:
Data Provenance Is Impossible
If AI companies can use pirated content, tracking what’s in a model becomes impossible. Was your personal blog post in the training data? Your emails? Your medical records? You’ll never know.
Consent Becomes Meaningless
Privacy laws increasingly emphasize consent: you should control how your data is used. But if AI training is fair use, your consent is irrelevant. They can use it anyway.
AI Models as Derivative of You
When an AI model learns from your writing, it captures something of your style, your knowledge, your voice. That model is, in a sense, derived from you. Meta’s argument says they own that derivative without owing you anything.
What Other AI Companies Are Doing
Meta isn’t alone in using questionable training data, but approaches vary:
OpenAI has faced similar lawsuits (including from the New York Times). It has signed licensing deals with some publishers while also defending its training practices as fair use in court.
Anthropic (maker of Claude) has emphasized constitutional AI approaches that try to minimize harm, though it too has been sued by authors over its training data.
Google has extensive licensing agreements for some content categories but has also faced criticism for data scraping practices.
Stability AI has been sued over image generation models trained on copyrighted artwork, with similar fair use defenses.
Meta’s argument is the most aggressive, essentially asking courts to legalize piracy for AI training across the board.
What Happens If Meta Wins
A Meta victory would:
- Legitimize AI data laundering: Pirate content, process it through AI, claim the output is “transformative”
- Kill licensing markets: Why pay for training data when piracy is free and legal?
- Accelerate the AI copyright crisis: Every creator becomes an unwilling, uncompensated AI trainer
- Set international precedent: Other countries may follow U.S. fair use interpretations
- Undermine trust in AI: Knowing models are built on piracy changes how we view them
What You Can Do
Support Author Organizations
Groups like the Authors Guild are fighting these cases. They need resources and visibility.
Contact Your Representatives
Copyright law is ultimately made by Congress. Let them know creators’ rights matter.
Be Selective About Platforms
Some AI companies are more ethical about training data than others. Your choices matter.
Understand Your Rights
Even if fair use is abused, other rights may apply. Privacy laws, terms of service, and contract law offer alternative protections.
Document Your Work
Proving your content was used in training data is hard. Maintain records of your publications for potential future claims.
The Bottom Line
Meta’s argument, that BitTorrent piracy is fair use for AI training, represents either a bold legal innovation or a brazen attempt to legalize theft at scale. The outcome will shape the future of AI development, creator rights, and digital privacy for decades.
The courts will decide whether “we pirated it for AI” is a valid defense. But regardless of the legal outcome, the ethical reality is clear: taking creators’ work without permission or compensation isn’t innovation. It’s exploitation.
And dressing it up in transformative use language doesn’t change what it is.
Protecting your digital rights starts with understanding them. Follow My Privacy Blog for news and analysis on privacy, data rights, and AI ethics.



