
Black Library Writers Respond to Meta Scraping Their Work

Spend any time in the corners of social media where writers tend to congregate, and there’s a pretty good chance that last Thursday you saw the conversation around AI, intellectual property, and Meta (the company behind Facebook) explode into view.

As far as tech issues go, the broad strokes of this one are refreshingly straightforward.

Meta has been looking to develop its own AI software, Llama3, something that could compete with ChatGPT, Grok, and the like. Of course, in order to refine the product and make sure it was able to communicate clearly and with a high degree of quality, they needed to feed it things where our best-quality writing tends to reside: books, and in massive quantities.

Building these libraries of training data is a common problem for AI companies. A recent MIT study found that over the past year new data restrictions have dramatically reduced the number of high-quality data sources these companies can access. Acquiring this data legally (and ethically) would require purchasing licenses, a prospect which Meta’s own recently released internal memos noted would be expensive and time-consuming. Additionally, licensing even one such work means licensing every work they’d want to use, removing any “fair use” defense they may want to invoke later.

Alternatively, you could save all that time and money and simply pirate them instead. No points for guessing which path the fine folk at Meta opted for. They torrented a library collection called LibGen, itself a pirated repository of 7.5 million books and more than 8 million research papers. ¡Buen provecho, Llama!

This is hardly the first time that Meta has been in the news for playing fast and loose with other people’s data. Last December, Ireland fined Meta $263 million for data security failures which violated Europe’s General Data Protection Regulation (GDPR). Earlier that same year, Meta was accused of “massive and illegal” data collection practices. The European Consumer Organization alleges that Meta illegally collects excessive, unnecessary data on its product users without their consent and in breach of, again, the GDPR. And lest anyone think that perhaps this is just a regional regulatory issue, South Korea this past November gave Meta $15.67 million reasons to stop passing sensitive user data to its advertising partners in the absence of a legal basis.

These are just a few very recent examples, but suffice it to say that Meta has a long history of being cavalier with other people’s information, and the latest revelation about training its AI on pirated creative works is perfectly consistent with the contempt Meta seems to hold for consent and fair compensation when it comes to the data and information of others.

In addition, the concern about the affordability of licensing these creative works strains credulity when you recall that this is the seventh-largest company in the world with a $1.5 trillion market cap. It quite happily set $46 billion on fire chasing the vision of the “Metaverse,” while throwing even more at AR and VR development.

It’s not that they can’t pay to license copyrighted works, or even that they won’t. It’s behavior consistent with an entity that feels it simply doesn’t have to.

For our Black Library authors, writing for established IPs means they’re in something of a limbo, watching their works get devoured to feed the AI beast while having to hope that the copyright-holders pursue action to enforce their rights. By the same token, most of them also write completely original works that they could seek to enforce copyright on, but that itself remains an expensive proposition with questionable hope of practical effect.

Enter Alex Reisner, a programmer and contributing writer for The Atlantic with a focus on artificial intelligence. When court papers were released laying out what had happened, he obtained a snapshot of LibGen’s metadata and created a searchable database of the books and papers the pirate library contained. The story, along with the searchable database, broke last Thursday, and it spread across the literary world faster than ink can dry.

BlueSky — my home and the home for a surprising number of Black Library creatives — held no shortage of reaction.

Jonathan D. Beer, author of Dominion Genesis and King of the Spoil.

Image credit: Jonathan D. Beer

Juliet McKenna, whose short story “Fear Itself” was part of the Warhammer 40K Fear the Alien Anthology.

Image credit: Julia E McKenna

Robert Rath, author of The Infinite and the Divine, The Fall of Cadia, and Assassinorum: Kingmaker.

Image credit: Robert Rath

Nathan Long, author of five Gotrek & Felix novels, the Blackhearts trilogy, and the Vampire Ulrika trilogy.

Image credit: Nathan Long

Danie Ware, author of The Rose in Darkness, The Triumph of St. Katherine, and a score of other short stories and novellas.

Image credit: Danie Ware

And that was just for starters!

We spoke with six past and/or current Black Library writers to get their thoughts, and it doesn’t make for easy reading. It’s not supposed to, because in many ways it reflects a deep-rooted anxiety that goes beyond art and into the craft itself, the kind of feeling we once associated with factory workers losing their jobs to robots.

Amanda Bridgeman: A Scribe Award winner, Amanda has written for Marvel and Pandemic, and her sci-fi mystery series Salvation is currently in TV development by Aquarius Films and Anonymous Content (True Detective, Mr. Robot, The Alienist). For the Black Library, her story Reconsecration featured in 2022’s Inferno Presents: The Emperor’s Finest.

Mike Brooks: Mike is the author of nearly a dozen novels or novellas for the Black Library, including notable works Alpharius: Head of the Hydra, The Lion: Son of the Forest, Da Big Dakka, and Lelith Hesperax. He also plays guitar and sings in the punk band Interplanetary Trash Talk.

Nicholas Kaufman: A Bram Stoker Award finalist (for General Slocum’s Gold), Nicholas has written Chasing the Dragon, The Hungry Earth, and 100 Fathoms Below (with Steven L. Kent). A writer of the suspenseful and macabre, his story The Child Foretold appeared in the Warhammer Horror anthology The Accursed in 2019.

Juliet McKenna: As noted above, Juliet contributed the short story Fear Itself for the Fear the Alien anthology. She is also a BSFA Award-winner for 2023’s The Green Man’s Quarry, and contributed the story Civil War to the 2024 BSFA Award finalist anthology Fight Like a Girl, Vol. 2.

Josh Reynolds: Author of more than a score of novels and novellas for the Black Library (not to mention many more short stories), Josh has written just about everything: 40K, Age of Sigmar, Chronicles, Horus Heresy, even Necromunda. He is the pen behind books like the Fabius Bile Trilogy, Fulgrim: The Palatine Phoenix, and Plague Garden.

Adrian Tchaikovsky: Acclaimed for his Shadows of the Apt series and Hugo Award-winning Children of Time series, Tchaikovsky’s work for the Black Library includes Day of Ascension, On the Shoulders of Giants, and the story The Long and Hungry Road.

 

On how they felt learning their works were part of the pirated LibGen library used to train Meta’s Llama3…

Nicholas Kaufman

“Even before this, I was infuriated with pirate book sites on the Internet. Those sites are career-killers, especially for us midlist, lesser-known authors who really need every sale we can get so we have a track record to show when we’re trying to find publishers for our next books. It makes me wonder just how many sales I’ve lost out on because of pirating, how much money that should have come to me and my family was essentially stolen, and how much my career may have suffered for it.

“But this situation with Meta scraping books from pirate site LibGen takes things to a whole other level.”

Mike Brooks

“[I’m] very angry! It’s being framed in a bunch of ways by those responsible, but the primary one is basically, ‘we needed this in order to make our product work and we couldn’t afford to pay everyone,’ which is bullshit. Any government that’s prepared to accept that logic had better also accept anyone on a low income just wandering into a shop and helping themselves to what they need, even if they can’t afford to pay for it.

“Otherwise it’s communism for the corporations and capitalism for the individual.”

Juliet McKenna

“Quick answer? Unprintable. But I am not surprised. Since digital piracy first appeared, the idea that individual authors must be responsible for identifying and preventing illegal copying of their work has been unrealistic and unworkable. I have sent numerous DMCA takedown notices which have been ignored, and I don’t have hundreds of thousands of pounds to spare to take anyone to court.

“The thing is though, this isn’t a case of digital pirates making my work available for free with the same tired old excuses. Meta are using my work, and that of thousands of others, to create something to make themselves rich. They seem to think they can use the economic model of a school cake sale. Other people provide the essential ingredients and do the work, bearing those costs, while the people who sell the end product keep the cash. No. I am not their mum, they are not raising cash for new gym equipment, and I have not agreed to take part!

“This is not how ethical businesses operate. There’s a copyright notice in the front of all my publications. All rights reserved. That means if anyone wants to use my work for a commercial purpose, they have to ask for permission. I decide if I’m going to let them, and how much I expect to be paid. Meta knows this and they have chosen to ignore the law. There have to be consequences, by which I mean fines and compensation substantial enough to make these mega-corporations pay attention.”

Adrian Tchaikovsky

“I think we all knew this had been going on, and the AI companies have recently just started saying outright that they can’t make money unless they can have our stuff for free to train their – what? Spicy autocorrect? Plagiarism engines? I really don’t want to call this LLM business ‘AI’ because there’s no ‘I’ involved.”

Josh Reynolds

“I’d be happy to share my thoughts, but they’re mostly unprintable.”

 

On the potential impacts to the industry, and what they’d have preferred to have seen…

Adrian Tchaikovsky

“One of the most vexing things about this business is that it’s so oversold, and that it’s likely putting real AI research back decades because of the stink it’ll leave on it. It is enormously frustrating that all of my hard work and that of so many others has basically been shoved into a grinder to train these rather sad recombination devices. It’s an attempt to expunge the human from art, so that there’s just product.

“It’s not democratisation and it’s not to help the less abled, it’s pillaging for profit and nothing more noble or interesting than that. They’ve just found a way to dress up the theft so it looks like computer magic to investors dumb enough to fall for it.”

Mike Brooks

“Personally, I’d prefer to see zero engagement with generative AI. AI use for things like medicine, weather forecasting, et cetera, where it’s high-volume pattern recognition, is something quite different (but also needs human oversight and final sign-off; I have no use for a world where AI decides whether you qualify for medical insurance, and so on).

“However, generative AI is simply predictive text with pretensions, and also massively environmentally damaging. So I’d like to see it disappear on that front.”

Amanda Bridgeman

“I’d prefer zero engagement with the training of AI for creative work.

“Now that it’s taken place without my consent, I realise it’s something that cannot be undone because it’s already out there, therefore I would now seek compensation. And I’m not talking a nominal amount, either. I’m talking consequences for theft, and potential loss of future earnings.

“I’m not against AI. I think it could benefit society in a lot of ways, particularly with regard to science and medicine. I’m all for AI diagnosing cancer or dementia much faster, or using it to predict earthquakes, tornadoes or tsunamis earlier to save lives, etc. But using it to create fictional works is just absolutely terrible and completely unnecessary.”

Josh Reynolds

“If I was compensated for this sort of thing, I’d have fewer issues with my work being fed into the sausage grinder. As it is, it’s simply theft. Pay me, motherfuckers, or don’t use my stuff for your slop-bot.”

Nicholas Kaufman

“This is a multibillion-dollar corporation deciding to break copyright law for their own selfish purposes (to create an AI model that they can then use to attract additional investors and possibly sell, all to make even more money), and now they’re arguing in court that they should be allowed to break copyright law because it was faster and easier for them that way.

“They have enough money that they could have bought one copy of the greatest works of literature from around the globe to train their AI on, but no. Why buy anything when you can just steal it? Infuriating doesn’t begin to cover it.”

Adrian Tchaikovsky

“I’d prefer zero engagement because these engines are not merely eating our work for free, they’re designed to replace the human artist with a fire hose of regurgitated slop, incidentally destroying the environment as well. A negotiated license deal, with appropriate recompense, should be the least the law should allow, but they’ve already admitted that, if they had to pay any kind of fair compensation, they wouldn’t make all that money they’re after.

“In the same way that, if I could legally take your car and sell it on, then I’d have the world’s most profitable car dealership.”

Juliet McKenna

“Personally, I have no wish to engage with Generative-AI. What Meta and others are doing is about as far from creating expert systems – the useful sort of AI – as it’s possible to get. Those expert systems, for medical and technical uses, apply machine learning to carefully selected and specific datasets. Their output is reviewed and refined by human experts. That human involvement is essential.

“Human input is essential for true creativity. A writer understands the characters and themes they’re working with. They can analyse the ways similar narratives have been told before. They find the best ways to explore their ideas in novels, novellas or short stories. They look for new angles and reflections on what’s being written across different genres. This is where original stories come from. Musicians and artists work in similar ways.

“Generative-AI can do none of these things. It cannot think. It cannot learn. It cannot comprehend any of the thought processes that went into creating what’s been fed into it. All it can do is look for predictable patterns like some glorified auto-correct, and spew out unoriginal slop.

“If writers do want to license their work? That’s their call. They have that right under existing law, and that law says they should be paid.”

Mike Brooks

“So long as we’re in a society where big money can do what it wants and governments bow to the will of billionaires and corporations, then it probably is inevitable: again, until people clock this con like they did NFTs and (increasingly) crypto before it.

“The issue is not that I think generative AI can replace the work of a human, because it can’t. The issue is that it can probably hit a balance of being just good enough to convince the people who hold the purse strings for big companies that it’s way more cost-effective than actually hiring people. Either that, or unrealistic work expectations will be placed on humans which can only be fulfilled, volume-wise, by using AI.

“And that’s not just authors, that’s script writers, editors, artists, composers. There have already been multiple instances of publishers large and small using AI-generated cover art instead of commissioning a human artist, and at least one author I know is convinced that an edit pass on their novel was done by AI, not to mention many publishing outlets reporting a massive increase in obviously AI-authored work being submitted.

“So what we’ll see is an ongoing enshittification of mainstream media as more and more work gets farmed out to produce bland retreads of stuff that’s come before, leaving actual humans producing actual work struggling to get noticed even more than they are now, and companies that take a stand having to be more and more certain that the money they’re forking out [is] for real human work.”

 

Thoughts for the future? What can be done about it?

Amanda Bridgeman

“AI is definitely here to stay, so in some respects it’s inevitable. But, governments around the world need to urgently prioritise the regulation of AI, and punish the Forces of Chaos who use it for dastardly means (ripping off creative works, deep fakes, et cetera). May Karma bring about their swift downfall!”

Mike Brooks

“Sadly, [the British] government appears to be leaning very much towards ‘the creative has to specifically say that they don’t want their work used’, without any indication of what consequences AI companies will face if they ignore that statement, or what will be done to make sure the work will be extracted again (which I believe is basically impossible). So the AI companies will just go ‘Oops, our bad’ and do as they please.”

Adrian Tchaikovsky

“Whilst this recent exposé sets it out very plainly, plenty of writers had already identified their work as being fed into the teeth of the machine. There are absolutely legal steps that can be taken, to undo some of the harm and redress the balance. In the UK, at least, the current love affair with this nonsense means they probably won’t be taken.”

Mike Brooks

“The only thing that can be done to mitigate it is, if a reader, consumer, viewer, et cetera wants to ensure they still get good quality stuff in the future, to boycott AI. Don’t buy anything with AI art, with AI writing, with an AI soundtrack. Don’t use ChatGPT, or MetaAI, or Deepseek, or anything like that. Even as a novelty.

“If we, as customers, don’t use it, it stops being financially viable, and the language of currency is the only one the developers understand. Movie companies have been trying to make 3D a thing for decades, but they keep backing off when it becomes apparent that no matter what they do, just not enough people are interested.”

Juliet McKenna

“The comparison with Tyranids stripping planets of biomass to create mindless, identical creatures that do nothing but destroy is scarily apt. If decision makers at big entertainment companies are suckered into believing Gen-AI can deliver a commercially viable novel/TV script/screenplay that will make them money without having to pay any writers? Writers will stop writing and find other work. It’s as simple as that. We have to pay our bills.

“Fortunately, there are things everyone can do to push back. Readers, viewers, gamers, everyone who values original, well-crafted stories in whatever format, tell those big entertainment companies that you do not want Gen-AI-slop. Refuse to engage with it. Refuse to pay for it. Seek out and support those publishers, game developers and studios which offer authentic human creativity. Let your friends and fellow fans know what you’re doing and why. You can also tell your elected representatives that global corporations don’t get to ignore laws just because they decide that’s inconvenient. If they get away with this, what laws will they ignore next?”

 

