A Balanced Response to the Data Hunger of Artificial Intelligence
In the age of artificial intelligence (AI), creative content has become the new currency. From books to blog posts, poems to news articles, AI models rely heavily on human-generated content to learn language, context, and meaning. But this raises a fundamental question: What rights do authors have when their work is used to train these technologies?
While tech companies have often defaulted to a system that requires authors to “opt out” if they don’t want their content used, a new alternative is emerging — collective licensing. This approach is gaining momentum as a fairer, more practical way to ensure that creators are acknowledged and compensated for their contributions to AI development.
What is Collective Licensing?
Collective licensing is a rights management system that allows organizations to grant licenses for the use of creative works on behalf of a group of rights holders, such as authors, artists, or publishers. The license allows companies (like those developing AI) to legally use large volumes of content, while ensuring the creators receive a portion of the revenue or royalties generated.
Instead of each author having to negotiate individually or navigate complex opt-out procedures, a collective license streamlines the process by representing authors as a group. Revenue collected from licensing is distributed to the original creators according to pre-defined rules.
Why This Matters Now
The rise of generative AI tools such as ChatGPT and Midjourney has brought this issue to the forefront. These models are trained on massive datasets, often scraped from the web, that include books, journalistic content, scripts, and more, frequently without the knowledge or consent of the original creators.
Without a transparent and fair framework, this practice poses ethical, legal, and economic risks:
- Ethically, it raises questions about digital exploitation.
- Legally, it sits in a grey area that may not protect authors adequately.
- Economically, it deprives writers and publishers of income from the use of their intellectual property.
Collective licensing offers a way out of this dilemma, without halting the progress of innovation.
How Collective Licensing Works
- Licensing Organizations
Rights societies or collecting agencies (such as those managing music or book rights) act on behalf of authors and publishers to grant licenses to AI companies.
- Usage-Based Revenue
Tech firms pay licensing fees based on the volume and type of content used in AI training.
- Revenue Distribution
Funds are then distributed to content creators through established royalty systems, often based on metrics like usage frequency, type of content, or registered ownership.
- Opt-Out Flexibility
Importantly, authors can still choose to opt out of the system, but the default is participation, meaning fewer barriers for inclusion and more fairness in compensation.
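To make the flow above concrete, here is a minimal sketch in Python of a usage-based fee and a pro-rata payout. It does not describe any real licensing scheme: the `Work` record, the token-count usage metric, the per-1,000-token rate, and the rounding rule are all assumptions chosen for illustration. Actual schemes may weight content type, registered ownership, or other factors differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Work:
    """Hypothetical record of one work in a training dataset."""
    author: str
    tokens_used: int          # assumed usage metric; real schemes may differ
    opted_out: bool = False   # participation is the default


def licensing_fee_cents(works, rate_cents_per_1k_tokens):
    """Fee owed for all licensed (not opted-out) works, in integer cents."""
    licensed_tokens = sum(w.tokens_used for w in works if not w.opted_out)
    return licensed_tokens * rate_cents_per_1k_tokens // 1000


def distribute(works, pool_cents):
    """Split the pool pro rata by usage among participating authors.

    Shares are floored to whole cents; leftover cents go to the authors
    with the largest remainders (ties broken by name), so the payouts
    always sum to the pool exactly.
    """
    usage = {}
    for w in works:
        if not w.opted_out:
            usage[w.author] = usage.get(w.author, 0) + w.tokens_used
    total = sum(usage.values())
    if total == 0:
        return {}
    payouts = {a: pool_cents * t // total for a, t in usage.items()}
    leftover = pool_cents - sum(payouts.values())
    by_remainder = sorted(usage, key=lambda a: (-(pool_cents * usage[a] % total), a))
    for author in by_remainder[:leftover]:
        payouts[author] += 1
    return payouts


# Illustrative data only; names and numbers are made up.
works = [
    Work("Author A", 600_000),
    Work("Author B", 300_000),
    Work("Author A", 100_000),
    Work("Author C", 500_000, opted_out=True),  # excluded from fee and payout
]
fee = licensing_fee_cents(works, rate_cents_per_1k_tokens=50)
payouts = distribute(works, fee)
```

In this sketch an opted-out author's work is neither billed nor paid, which mirrors the idea that opting out removes the work from the licensed pool. Working in integer cents avoids floating-point rounding drift when a pool is split many ways.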
A Shift from the Opt-Out Model
Until now, the default model proposed by many companies and governments has been opt-out:
- Creators must explicitly request to have their works excluded from datasets.
- In practice, this is difficult, time-consuming, and sometimes impossible, especially for older works or deceased authors.
Collective licensing inverts the logic:
- Participation is automatic unless otherwise stated.
- Creators are compensated without needing to navigate legal or technical hurdles.
- The system respects both intellectual property and the realities of data-driven innovation.
Industry Support and Implementation Timeline
The first collective licensing schemes for AI training are expected to go live in summer 2025. Early backers include:
- Major publishers and authors’ unions
- Copyright and licensing agencies in the UK and Europe
- Prominent writers and cultural organizations
The system is being designed with transparency, inclusivity, and enforceability in mind — creating a working model that can scale across languages, genres, and media types.
Challenges Ahead
No system is perfect, and collective licensing comes with its own hurdles:
- Enforcement: Will AI companies comply voluntarily, or will regulation be needed?
- Global Coordination: Copyright rules differ between jurisdictions, which complicates cross-border licensing.
- Fair Distribution: How do we track usage accurately and pay authors fairly?
Despite these complexities, a consensus is growing: doing nothing is no longer acceptable. The creative economy deserves its place in the AI ecosystem, not as invisible labor but as a recognized and rewarded contributor.
As artificial intelligence continues to evolve, so must the systems that support and protect the people whose work makes it possible. Collective licensing offers a balanced, ethical, and scalable solution to one of the most urgent cultural questions of our time.
It ensures that authors — from novelists to journalists, poets to playwrights — are not left behind in the digital rush, but are fairly paid and properly acknowledged for the stories, ideas, and knowledge they bring into the world.