The Shadow Side of AI Part 1: Creatives vs Big Tech
Despite the popular support behind a spate of high profile lawsuits, legal scholars temper expectations and paint an uncertain path ahead for creatives in their fight against AI and Big Tech.
In a world where free access to information arms the public with quick answers, you may have friends who, after a few clicks on Google Search, quickly become experts on a variety of topics. As a technologist and writer by trade, I prefer to stay in my lane and defer to domain experts when my knowledge fails. Over the last two weeks I have listened to talks by, and read papers from, various legal scholars and creatives to understand the legal battles creatives are currently waging against Big Tech. The unprecedented adoption of Generative AI has already started impacting the work and livelihoods of many in the creative world. The disruption of markets is happening right now, not in some hypothetical future. I was surprised by some of the details I discovered, details that I’m sure will also be of interest to you.
The piece below summarizes the important aspects of my findings and provides an aerial perspective on the topic, bringing together various important sources to provide you with the big picture. There is a lot of information to digest, and I have included links to source materials if you wish to dive in deeper. This is both a complex and emotive topic, one that at times poses more questions than answers. Writing this piece has been a learning experience, at times shifting and widening my stance on how this technology is impacting the lives and hopes of real people.
What is Generative AI?
“Generative artificial intelligence (also called Generative AI or GenAI) is a genre of artificial intelligence capable of generating text, images, or other media, using generative models. Generative AI models learn the patterns and structure of their input training data and then generate new data that has similar characteristics.”1
Generative AI tools are being aggressively marketed as the next big thing in tech. The hype has reached giddy levels. AI has even been feted by CEOs like Sam Altman of OpenAI as a panacea for some of the world’s most intractable problems, such as climate change, or even as a tool that could find a cure for cancer.2 Like his former colleague Elon Musk, he isn’t shy to say outlandish things to keep investors interested. Altman started OpenAI with Musk as a non-profit venture. Keen to promote an altruistic image, he claimed in an interview that he wished to “develop a human positive AI…freely owned by the world.”3 Unsurprisingly, it didn’t take him long to be tempted back to his old god Plutus and return to the profit motive.
Not since the dot-com boom of the 1990s have I witnessed such wild claims flying around a new technology. Bolstered by this manufactured excitement, AI companies have attracted huge capital investment, and the tech is receiving unprecedented interest and attention from both the public and politicians around the world. Personally, I am not convinced by most of this hyperbole. One thing that is real, though, is the potential harm that this set of products marketed as “artificial intelligence” can inflict on humanity. Harms that CEOs like Sam Altman often attest to. Harms that, he is not shy to admit, often keep his executives awake at night. Today I will be unpacking one real set of harms that is unfolding in real time: the damage this technology is doing to the working lives of creatives across the world.
What tools are we talking about?
To give you an idea of the scope of this technology, here’s a list showing the types of Generative AI currently available with a small sample of current products implementing it:
Text Generation (OpenAI’s ChatGPT, Microsoft’s Bing Chat, Google’s Bard, Meta’s LLaMA)
Image Generation (Midjourney, OpenAI’s DALL-E, Stable Diffusion, Adobe Firefly)
Voice Generation (Eleven Labs, Play.ht, Speechify)
Video Generation (InVideo, Synthesia, RunwayML)
Music Generation (OpenAI’s Jukebox, Mubert)
Computer Code Generation (GitHub Copilot, OpenAI’s ChatGPT, Google’s PaLM 2)
In addition to the examples shown, you can find a myriad of custom applications that are starting to embed Generative AI functionality. Examples include market-leading software such as Microsoft Office products like Word, social media apps like Facebook and Instagram, and image editing software like Photoshop. These efforts are making this tech ubiquitous as it saturates the market and reaches an ever-increasing audience. With new applications appearing daily, it is difficult to keep up with developments.
Why are creatives up in arms?
Brief answer:
The works and intellectual properties of human creatives have been taken without permission to provide the raw materials needed to create the models that Generative AI systems are based on. Generative AI systems then use these models to produce derivative works that often directly compete with the original artists. It’s a double whammy.
Long answer:
Generative AI systems are only as good as the data fed into them. The deep learning models used in these systems are created by algorithmically analyzing large amounts of original human works. Digital copies of these creative works are collected in large datasets to provide the raw material needed. In a process euphemistically called ‘training’ or ‘machine learning’, algorithms generate complex multidimensional mathematical data structures that map patterns, correlations, and other qualities found in the digitized source works. Model generation is an expensive and resource-hungry task that relies on the brute force of vast arrays of supercomputers and workstations running in parallel via cloud-based services. It bears no resemblance to the human activities of learning or training we all have first-hand experience of.
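To make the idea of mapping patterns concrete, here is a toy sketch in Python. It uses a simple bigram model rather than a neural network (real systems involve billions of learned parameters), but it illustrates the same principle: extract statistical patterns from source works, then sample new output that mimics them. The corpus and function names here are invented for illustration.

```python
import random
from collections import defaultdict

def train(corpus_words):
    """'Training' at toy scale: record which words follow which
    in the source text, i.e. extract its statistical patterns."""
    model = defaultdict(list)
    for prev, nxt in zip(corpus_words, corpus_words[1:]):
        model[prev].append(nxt)
    return model

def generate(model, start, length=8, seed=0):
    """'Generation': sample a new word sequence whose local patterns
    mimic those found in the training data."""
    random.seed(seed)
    out = [start]
    for _ in range(length - 1):
        followers = model.get(out[-1])
        if not followers:
            break  # dead end: the last word never had a successor
        out.append(random.choice(followers))
    return " ".join(out)

# A tiny 'dataset' standing in for the scraped works.
corpus = "the cat sat on the mat and the cat slept on the bed".split()
model = train(corpus)
print(generate(model, "the"))  # new text echoing the corpus's patterns
```

Notice that without the source corpus, the model can produce nothing at all; scale that dependency up by a few billion documents and you have the crux of the dispute.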
The source datasets used to create these models are shrouded in secrecy. Tech companies have been reluctant to give any clear details. Many datasets have been built by freely scraping content, including copyrighted works, from online sources without the consent, compensation, or credit of the artists that originally created them. In many cases, creatives first discovered that their works had been used after querying the Generative AI systems themselves and finding clear evidence from the output produced.
One thing is certain: without the input of quality human works, these Generative AI systems would be worthless. The derivative artworks produced by Generative AI often take on the characteristics of the works they were trained on. As synthetic derivative works start to dilute consumer markets, they cheapen existing human works and often directly compete with the human creatives whose works made these systems possible in the first place. By devaluing the works of living artists, these Generative AI systems are potentially damaging not only to the livelihoods of creatives but also to their ability to continue creating unique human works.
What types of creatives are impacted?
More than you might imagine. In my research I have seen feedback from authors, copywriters, screenwriters, graphic designers, concept artists, illustrators, fine artists, photographers, fashion models, makeup artists and hair stylists, computer programmers, publishers, musicians, voice actors, and singers. As this tech develops further, there will certainly be more sectors to add to this list.
Fashion models?
Yes, that one surprised me. Like many creatives, fashion models often work on a contractual basis and give up the rights to the images taken of them. Many models are now being required to undertake full 3D body scans as part of their work, sometimes even in the nude.4 These scans, along with the images taken, can then be used to create future works using the likeness of the model without the model’s participation or compensation. As more models are generated virtually using AI, supporting professionals such as makeup artists and hair stylists become likewise dispensable. To give one example, Levi’s has started experimenting with AI-generated models to signal diversity in its marketing efforts.5
Can you give more examples of how the livelihoods of creators are being affected?
The quality of works currently produced with Generative AI tools is not always great, but the speed and volume at which they can be created is enough to disrupt marketplaces. Here are a few examples to consider:
Artists, illustrators, graphic designers, and photographers have found their work devalued as the use of AI image generators has saturated the market with synthetic images. To understand the scale of this problem, in one year, 15 billion images have been created using Generative AI, a feat that took photographers 150 years to achieve.6
As reported by the Authors Guild: “Numerous AI-generated books that have been posted on Amazon that attempt to pass themselves off as human-generated and seek to profit off a human author’s hard-earned reputation.”7
A famous example includes attempts to write the concluding books of George R.R. Martin’s A Song of Ice and Fire series using his characters, plot-lines, settings, and voice.8
Celebrated author Jane Friedman found ‘a cache of garbage books’ written using Generative AI under her name for sale on Amazon.9
A recent survey by the Authors Guild found that 69% of authors feel Generative AI systems threaten their careers, and 90% feel they should be compensated if their works are used in training.10
To slow the deluge of AI generated books, Amazon has recently reduced the number of books that can be published on its publishing site to three a day.11
Old-school publishing companies such as Clarkesworld Magazine are becoming overwhelmed with AI-generated submissions.12
Voice actors report having their voices cloned and used without permission. The rise of Voice AI products provides cheap alternatives for their services.13
Deepfakes using the likeness of celebrities and artists for commercial purposes are starting to proliferate online. A recent advertisement on TikTok used the likeness of top YouTube creative MrBeast without his permission.14
These examples provide a sample of the issues faced by creatives. With the technology still emerging, new ways to profit from these tools are being discovered daily. As hinted at by the current Chair of the Federal Trade Commission, Lina Khan, Generative AI systems provide huge potential for bad actors to “turbocharge fraud”.15
What are creatives asking for?
In brief, creatives want:
Consent
To gain control over if and how their works are used by AI systems. Many are demanding that their works are by default opted out of ingestion by Generative AI systems and used only when they explicitly opt in.
Compensation
To receive fair payment when their works are used by AI systems.
Transparency
To gain clear visibility of the works used in the AI training process.
Attribution
To receive credit for the human contributions to these systems.
Humans first
To favor human works and ensure no copyright status is granted to synthetic content.
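The opt-out mechanisms available today are crude. The most common is the robots.txt convention: a plain-text file at a site’s root that asks crawlers to stay away. A minimal sketch is shown below; the user-agent tokens are ones the crawler operators have published (GPTBot for OpenAI, Google-Extended for Google’s AI training, CCBot for Common Crawl), and compliance is entirely voluntary.

```txt
# robots.txt placed at the site root, e.g. https://example.com/robots.txt
# Each block asks one AI training crawler to avoid all pages.

# OpenAI's web crawler
User-agent: GPTBot
Disallow: /

# Opt-out token for Google's AI training
User-agent: Google-Extended
Disallow: /

# Common Crawl, a dataset widely used for AI training
User-agent: CCBot
Disallow: /
```

This is the same mechanism behind the AI-training blocks that platforms like Substack and Medium now offer their writers; a rogue bot can simply ignore the file.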
What are the legal weapons available to creatives?
Copyright – many lawsuits claim illegal use of original works breaches copyright.
Privacy – AI outputs can reveal private details found in source documents.
Defamation – Synthetic content can use both the likeness of creatives and their works in unauthorized ways that could harm the reputation of the human creative.
Breach of contract – websites often carry terms of use and licenses, such as Creative Commons attribution, that AI systems ignore.
New legislation and regulation – if there are gaps in existing legislation, creatives could lobby for new laws and regulations to fill them.
What cases are currently in motion?
Below I have listed the prominent cases filed in the US. Most cases focus on breach of copyright.
In November 2022, a class action lawsuit was filed against the creators of GitHub Copilot, a Generative AI tool that produces computer code. The case was filed by software developers against GitHub, its owner Microsoft, and OpenAI, which provides key components of the tool.16
In January 2023, a class action on behalf of visual artists Sarah Andersen, Kelly McKernan, and Karla Ortiz was filed against Stability AI, Midjourney, and DeviantArt, all corporations that market Generative AI tools producing images.17
In February 2023, Getty Images filed a lawsuit against Stability AI, claiming that it had ingested 12 million of its copyrighted images. Getty Images, in its complaint is asking for a “full and complete accounting to Getty Images for Stability AI’s profits, gains, advantages, and the value of the business opportunities received from its infringing acts.”18
In June 2023, a class action lawsuit on behalf of authors Paul Tremblay and Mona Awad was filed against OpenAI, the creator of ChatGPT, a Generative AI tool that produces text.19
In July 2023, two class actions were filed against Meta and OpenAI respectively, by writers Richard Kadrey, Sarah Silverman, and Christopher Golden.20
In July 2023 a class action was filed against Google claiming its AI systems had violated privacy and copyright laws by scraping websites used to train products such as Bard.21
In September 2023, the Authors Guild filed a class action lawsuit against OpenAI, representing authors including David Baldacci, Mary Bly, John Grisham, George R.R. Martin, and Jodi Picoult. The suit claims, amongst other charges, that OpenAI’s ChatGPT violated the copyright of said authors’ works.22
What do we need to understand about copyright claims?
Here are a few important features of copyright law in the US:
Under US law, “original works of authorship” are provided copyright protection. Copyright was created to protect the livelihood of innovators and creatives and to give them the incentive to continue creating and promoting their work.
In the US copyright falls under federal law.
Federal cases are expensive: expect at least six figures to execute legal challenges in the federal arena.
The legal process is slow. With the current cases, don’t expect any judgements before 2024 or 2025.
Judgements are determined on a case-by-case basis, and multiple cases could produce conflicting outcomes that cloud the situation.
Copyright laws are territorial; there is no global consensus. Copyright judgements will not provide a universal solution, as different laws apply in different countries.
What do the tech companies have to say on this topic?
AI companies such as OpenAI are claiming that their use of training data scraped from the internet constitutes fair use. Fair use is a mechanism, under US law, that allows use of copyrighted material, within certain limits, without the need for permission from the copyright holder.
The Associated Press reported that Stability AI, the creators of the Stable Diffusion model (an AI image generator), had this to say in response to a lawsuit:
“anyone that believes that this isn't fair use does not understand the technology and misunderstands the law."23
The same article quoted David Holz, CEO of Midjourney as commenting:
“Can a person look at somebody else’s picture and learn from it and make a similar picture?” Holz said. “Obviously, it’s allowed for people and if it wasn’t, then it would destroy the whole professional art industry, probably the nonprofessional industry too. To the extent that AIs are learning like people, it’s sort of the same thing and if the images come out differently then it seems like it’s fine.” 24
OpenAI, the creators of ChatGPT and DALL-E, have consistently claimed their use of copyrighted works constitutes fair use. For a more detailed analysis of why they believe this to be the case, you can refer to a document they submitted to the United States Patent and Trademark Office that contains their main arguments.25
How is Fair use determined?
Fair use weighs four factors:
Factor 1: The Purpose and Character of the Use
Factor 2: The Nature of the Copyrighted Work
Factor 3: The Amount or Substantiality of the Portion Used
Factor 4: The Effect of the Use on the Potential Market for or Value of the Work
Harvard Law School lecturer Jessica Fjeld commented on these factors:
“It's not a counting exercise where if you win three of them and lose one, you automatically win. You could have just one strong factor, but have it be so strong that you still win.”26
What helps the case for creatives?
Without quality training datasets the Generative AI systems are useless.
Petitioners claim that copyrighted works have been scraped from ‘shadow libraries’ online, including pirate websites like Library Genesis (aka LibGen) and Z-Library (aka B-OK).
There have also been claims that copyrighted materials have been ‘laundered’ through non-profit organizations, under the pretext of performing educational research. In what might be interpreted as maneuvers to circumvent copyright, such ‘research’ data is later used to fuel AI companies’ for-profit models.27 This is reminiscent of the strategies many corporations pursue when shuffling money through offshore shell companies to minimize tax liability.
Evidence of loss of earnings may support creatives’ arguments that the technology devalues their work (factor 4 above).
What helps the case for Big Tech?
Unlimited financial resources give them a strong advantage
Legal precedents seem to support some of their arguments (see below)
Close partnerships with, and the ability to directly lobby, both state agencies and state actors give many Big Tech companies a clear advantage when shaping legislation and regulations.
Relevant past copyright cases:
The following precedents are relevant to the current court cases. It is worth noting the timescales shown, to gain realistic expectations of how the cases currently in motion may proceed.
Perfect 10 v. Google (2006-2007)
In 2007 Google won a lawsuit brought by Perfect 10, an adult entertainment publisher. Perfect 10 argued that Google’s creation of thumbnail copies of its images within search results constituted a breach of copyright. The court found in favor of Google and ruled that Google’s transformative use of the original images constituted fair use.28
Authors Guild v. Google (2005-2016)
In 2013, Google won a landmark case that ruled that the corporation’s digitization of millions of books, and the creation of search functionality that allowed snippets of them to be freely available online, constituted fair use. Judge Denny Chin wrote in his ruling:
“In my view, Google Books provides significant public benefits. It advances the progress of the arts and sciences, while maintaining respectful consideration for the rights of authors and other creative individuals, and without adversely impacting the rights of copyright holders.”
Subsequent attempts to overturn this judgment failed, up to the final dismissal by the Supreme Court on 18 April 2016.29
Authors Guild v. HathiTrust (2012-2014)
In 2012, a Federal Court found that HathiTrust Digital Library’s use of copyrighted content constituted fair use in a similar case to the one above. The HathiTrust Digital Library was a spin-off of the Google Books project. The initial judgment was upheld with minor amendments in a Second Circuit appeal in 2014.30
Andy Warhol Foundation v. Lynn Goldsmith (2019-2023)
A recent copyright case resolved in the Supreme Court found that the “Prince Series” created by Andy Warhol was not transformative enough to constitute fair use, as it directly competed with the reference image by Lynn Goldsmith. The case went through a number of appeals until it was finally resolved by the Supreme Court. Justice Sonia Sotomayor concluded:
“The use of a copyrighted work may nevertheless be fair if, among other things, the use has a purpose and character that is sufficiently distinct from the original. In this case, however, Goldsmith’s original photograph of Prince, and AWF’s copying use of that photograph in an image licensed to a special edition magazine devoted to Prince, share substantially the same purpose, and the use is of a commercial nature. AWF has offered no other persuasive justification for its unauthorized use of the photograph.”31
What do the legal experts say?
In a recent paper, ‘Art and the science of generative AI’, published in the journal Science, a group of academics including legal scholars had this to say:
“Much of copyright law relies on judicial interpretations, so it is not yet clear if collecting third-party data for training or mimicking an artist's style would violate copyright. Legal and technical issues are entwined. Do models directly copy elements from the training data or produce entirely new works? Even when models do directly copy from existing works, it is not clear whether and how artists' individual styles should be protected.”32
As Patrick Goold, reader in law at City, University of London, commenting on the Getty Images lawsuit, told BBC News:
"For hundreds of years, human artists learned by copying the art of their predecessors. Furthermore, at no point in history has the law sanctioned artists for copying merely an artistic style."33
Commenting on the lawsuit presented by the Authors Guild, he later told BBC News that while he could sympathize with the authors behind the lawsuit, he believed it was unlikely to succeed, as they would first need to prove ChatGPT had copied and duplicated their work:
“When we're talking about AI automation and replacing human labour...it's just not something that copyright should fix…What we need to be doing is going to Parliament and Congress and talking about how AI is going to displace the creative arts and what we need to do about that in the future."34
In a recent YouTube stream from DigitalFUTURES, Harvard Law School lecturer Jessica Fjeld commented:
“I've been interested in the copyright questions around art and AI for a long time. But when we're thinking about it really in terms of the livelihoods of creators, how artists are able to get paid for the work that they do, it is hard for me to see copyright as a decent solution…if we value having artists, we probably need to make direct investments in them, rather than expect them to bring and follow through on and recover through copyright schemes…I don't think, in spite of the many very confident talking heads out there, that anyone knows how these cases are likely to come out, fair use is littered with surprising decisions, some of which were reversed, and others of which have stood the test of time. And I think that that sort of underlines why copyright is such a risky mechanism, when we want to think about how to take our artists and creatives through a time that is going to be challenging for them and for their industries and their livelihoods.”35
And finally, FTC Commissioner Rebecca Slaughter commented at a recent roundtable:
“Copyright is not and cannot be the only tool to address the deeply personal concerns creators hold about how their works are used.”36
Where does this leave creatives?
The powers of Generative AI are greatly overstated and surrounded by hype. The harms resulting from their use, however, are quite real and are starting to impact the lives of many working people. With several court cases in motion, debating the rights and wrongs of each side of the argument won’t get you very far. Patience is required to allow the lawyers to do their work and let these cases take their course. Even if judgements rule in favor of creatives, in real terms this might not translate to any great benefit for individuals. We need only look at the plight of musicians and the minuscule royalties they receive from streaming giants such as Spotify to imagine something similar happening in the case of Generative AI. With systems that process billions of images and huge amounts of text, an equitable distribution of royalties is unlikely to provide anything even resembling a good livelihood for most people. Additionally, if Big Tech loses its legal arguments, there is nothing preventing these companies from simply shifting to jurisdictions with more favorable laws or regulations that permit them to continue their operations unabated.
Stop the bots
In the meantime, creatives might want to shield their works from AI systems as much as they can. Deleting works from online spaces is not an option for most, as visibility is required to garner any kind of support. The ability to block content from AI training bots, now implemented by sites like Substack and Medium, is a welcome move. I do think this protection should be offered by default, and not require the free labor of creatives to implement, however easy that might be to do (CEO of Substack, any thoughts here?). As the number of AI systems grows, the idea that creatives will need to manually opt out of each of them seems, quite frankly, ridiculous. Unfortunately, even with these protections in place, there will always be rogue AI bots that simply ignore such preferences and scrape the content anyway.
Big tech responses
Recent moves by some of the big players in AI suggest a shift in sentiment from the side of the tech bros. With the release of OpenAI’s image generator DALL-E 3, a change in policy is evident: OpenAI has stopped allowing end-users to use the names of living artists within their prompts, a decision that perhaps reflects the influence of ethical if not legal arguments made by creatives. Whilst that concession by OpenAI may seem like a small win for creatives, if we look to some of their peers in the AI space we uncover a different story. There are signs that many AI companies feel confident they can win any legal challenges thrown at them. Microsoft, Adobe, and Google have all recently announced that users of their Generative AI products will receive protection from any copyright claims arising from the use of those products.37
Developments across the pond
Compared to the US, where the regulatory environment is more laissez-faire, the European Union has new laws to regulate AI close to implementation. As reported by The Guardian, in an important development, a new EU AI Act will place new requirements on tech companies:
“AI companies will have to submit lists of data sources to the European Commission as part of a regular reporting requirement, which [steering committee member] Tudorache hoped would act as a deterrent to the use of data and creative content without recompense.”38
The law may come into force as early as next year, hopefully compelling companies to be more transparent about the data used to train their systems. We shall have to wait and see how AI companies, who have been screaming at governments to regulate them, respond to actual laws that impact their operations.
How we value art
If AI corporations win their court battles, perhaps hiding human creative content behind secure paywalls is the only way forward for creatives. As suggested by legal scholar Jessica Fjeld above, for that option to work, we may need to reassess as a society how we value human art and creation. Those who have resources to spare may need to start giving more direct financial support and patronage to both new and established creatives. In some countries support is offered by the State. Platforms like Substack and Patreon, imperfect as they are, may play a role in such efforts.
Final thoughts
FTC Commissioner Rebecca Slaughter, alumna of Yale Law School, commented at a recent roundtable of creatives:
“Art is fundamentally human…we cannot lose sight of the fundamental truth that technology is a tool to be used by humans; humans are not and should not be used by technology.”39
At that same roundtable, the director of policy and advocacy at the Authors Guild, Umair Kazi, said:
“Do we really want a world where our books, literature, and art are algorithmically synthesized mimicries of the richness of human experience?”40
I’ll leave the last comment to Authors Guild CEO Mary Rasenberger:
“Regurgitated culture is no replacement for human art.”41
The System Reboot relies on the support and encouragement of readers like you. Substack's algorithms favor posts that are engaged with by readers. Small acts of kindness make a big difference to new authors and small publications like this one. Like, share, restack, recommend, or comment if you find value in my writing. To automatically receive updates and new posts, consider becoming a subscriber. All words are hand crafted without the assistance of AI.
See FTC roundtable for testimony https://www.ftc.gov/media/creative-economy-generative-ai-discussion-october-4-2023
For testimony see FTC roundtable at https://www.ftc.gov/media/creative-economy-generative-ai-discussion-october-4-2023
See talk presented by DigitalFUTURES and Prof. Neil Leach on https://www.youtube.com/watch?v=1WhaAZoSc0k
See talk presented by DigitalFUTURES and Prof. Neil Leach at https://www.youtube.com/watch?v=1WhaAZoSc0k
See https://arstechnica.com/information-technology/2023/09/microsoft-offers-legal-protection-for-ai-copyright-infringement-challenges/ , https://www.fastcompany.com/90906560/adobe-feels-so-confident-its-firefly-generative-ai-wont-breach-copyright-itll-cover-your-legal-bills , and https://arstechnica.com/information-technology/2023/10/google-will-shield-ai-users-from-copyright-challenges-within-limits/