The AI bots are coming for your Substack
Do you want AI bots crawling through your Substack posts to eat up all your precious writing? Are you happy to let them repurpose your words as they please? Well, it has already started.
The thought of anyone copying your work and remixing it for their own benefit may leave you feeling slightly uncomfortable. But when this is done on an industrial scale by millions of automated crawlers around the web, our discomfort may quickly turn to rage. Read on to find out what I have discovered about your writing on Substack. If you have an opinion on this topic, please take the time to express it in the poll at the end of this article.
If this subject is important to you, please share it widely with writers who may also care.
The Age of AI
The AI craze is well and truly upon us. Many are so dazzled by what this technology can do that they don't have time to think much about how these systems are doing it, or even if they should be doing it at all. New tech like this can be scary for many people. These generative AI systems have created an unprecedented amount of both hype and fear. Fear from the engineers creating the systems. Fear from normal workers trying to hold on to their jobs, as they struggle to keep up with it. There is more fear being built up about this technology than there was around the year 2000 issue that some believed would bring on a computer-induced apocalypse. Remember that?
Walls of secrecy surround the sources of data that Generative AI systems are trained on. I think we can all understand the degree of outrage voiced by working artists as images replicating their unique styles started to proliferate freely online. Is it fair that their unique works were scraped to train these systems? Without permission. Without a share of royalties. In fact, without any license being negotiated at all? Or is this simply dog-eat-dog capitalism that we should all passively accept and defer obsequiously to? Should we simply submit to these tech billionaires as they fly around in their rockets or relax on their super yachts?
Substack enters my world
In my first 12 days here on Substack I have been analyzing how it all works. Finding my feet, you could say. I like to dig as deep as I can. With a systems design background and a dollop of curiosity, I enjoy taking things apart to see how they tick. I quickly realized that as a new Substack writer I needed to engage with the community to gain any traction at all. So, I reached out to a few people whose writing touched me and started asking questions. I think I have discovered something that may interest you all.
My first question was: how are new and unknown writers treated in this space? I was curious how a writer's visibility changes online as they gain an audience. How does Substack assist in this process, if at all? I started with zero subscribers and found a writer with just over a hundred subscribers and another Substacker who had just over a thousand. They would be my test cases. Both are gifted writers who deserve a much bigger audience than they have. Real people expressing real-life experience.
Unless you know these writers and search for them by name, it is difficult to find their work on the Substack app. Its search functionality seems to favor the top writers on the platform. I started looking further afield and checked if I could find them on the mainstream search engines, Google and Bing. I know a thing or two about SEO and understood that with zero subscribers my own Substack would not be listed. But what about Vanda's and Sally's?
Can I find their Substacks on Search?
I tried Google
I started with Google and saw that Vanda's articles could be found but not Sally's. All I could find for Sally was a link to her main Substack page. She told me that she had reached 100 subscribers on the 4th of July 2023. Independence Day. But her posts were still not being indexed by Google over two months later.[1]
How about Bing?
Next I ran a check for them on Bing. I found more success there. Bing had indexed both Sally’s and Vanda’s posts. Hooray! My eyes were busy scanning through the results being shown for Vanda's work. I was just about to browse away. And then it happened.
Microsoft, the company behind Bing, has been a major investor in OpenAI, the company behind ChatGPT. We are not talking speculative investment here. Microsoft is betting billions of dollars that this tech will pay off. ChatGPT is being lauded as one of the most successful computer applications ever, experiencing a meteoric rise and uptake.
Bing has been unsuccessfully competing with Google for quite some time in the online search space. Despite being a better search engine in some respects, it has failed to make any dents in Google's dominance. Critics claim that Google has squeezed out competitors using monopolistic powers. Anti-trust cases are in progress in several countries arguing this very point.
Microsoft recently integrated OpenAI's chatbot into Bing in a move that caught Google off balance and left it rushing to catch up. With this back story in place I will now return to my narrative.
The story continues
As I was saying, I was looking through Vanda’s search results on Bing. And then a little box started appearing on the side of the search results. I was quite surprised and my curiosity stopped me from clicking away. The box looked like it was thinking about something and then produced this text:
Surprise is a word that could sum up my feelings as I started to read this text generated from Bing’s newly integrated AI system.
Notice the thumbs up and down icons there at the top. Bing are trying to improve this AI by collecting user feedback. Large language models (LLMs) like this one are known to make mistakes when they produce text. No one knows exactly how this happens. The algorithm sometimes conjures stuff out of thin air in a phenomenon commonly known as an AI hallucination. Or as we say down my rabbit hole, they make shit up. In this case it looks like Bing’s AI has not simply swallowed a micro-dose, Silicon Valley style. Perhaps it has opted for the full Bay Area trip. Let me explain.
It is nice for Bing to big up Vanda’s Substack. But the truth is she doesn’t have thousands of subscribers. It’s just over a thousand. And she doesn’t proclaim in her Substack that she is a comic and a drunkard. She states that she was a comic and a drunkard. There is a difference there. But the AI doesn’t understand that or even care. It never will because it is not conscious. As another Substacker put it quite well, the chat bots are just autocomplete on steroids.
Of course Microsoft know this. In fact they put the following disclaimer on their search page:
I love the language. Surprises and mistakes are possible. Served up for your enjoyment. I strongly recommend you check out Vanda’s often hilarious Substack to find the truth behind Bing’s AI’s messaging.
These hallucinations are one thing. The fact that our Substacks are starting to be read by AI systems is another thing altogether. Is that something we signed up for?
Can this be stopped?
The short answer is yes and no.
Both OpenAI (the company behind ChatGPT) and Google allow websites to block their AI bots using a technical setting found in the robots.txt file associated with a website. On Substack, this file is controlled by the Substack team, not by the writers.
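For the technically curious, here is a rough sketch of what such a block could look like and how you might check it yourself. GPTBot and Google-Extended are the names OpenAI and Google have published for their AI crawlers. The sample robots.txt contents, the example.substack.com address and the function name are my own illustration, not what Substack actually serves.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents: block the two AI crawlers,
# let every other bot (including normal search indexing) through.
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""


def check_agents(robots_txt: str, url: str, agents: list[str]) -> dict[str, bool]:
    """Return, for each crawler name, whether robots.txt permits it to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


# Illustrative post URL, not a real publication.
result = check_agents(
    SAMPLE_ROBOTS_TXT,
    "https://example.substack.com/p/some-post",
    ["GPTBot", "Google-Extended", "Googlebot", "bingbot"],
)
for agent, allowed in result.items():
    print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

With rules like these, ordinary search indexing carries on as before while the two AI crawlers are asked to stay away.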
I have not been able to find any comments on this topic from the Substack team, but am interested to know what they think about it. The folks over at Medium seem to be saying ‘No’ to the AI bots.
Rogue AI can still be a problem
You must understand that this request to block the AI crawlers is a voluntary protocol. The web is an open system and there is nothing stopping rogue AI bots from ignoring these directives and scraping your content anyway.
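To make the "voluntary" part concrete, here is a minimal sketch of the decision a crawler makes before it fetches a page. The bot names, the URL and the respects_robots flag are all hypothetical. The point is only that the check lives in the crawler's own code, so a bot that skips it meets no resistance from robots.txt itself.

```python
from urllib.robotparser import RobotFileParser

# A robots.txt that asks every crawler to stay away from everything.
BLOCK_EVERYONE = "User-agent: *\nDisallow: /\n"
POST_URL = "https://example.substack.com/p/some-post"  # illustrative only


def will_crawl(robots_txt: str, agent: str, url: str, respects_robots: bool) -> bool:
    """Return True if this (hypothetical) bot would go ahead and fetch url."""
    if not respects_robots:
        # A rogue bot never consults robots.txt, so nothing stops it.
        return True
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


polite = will_crawl(BLOCK_EVERYONE, "PoliteBot", POST_URL, respects_robots=True)
rogue = will_crawl(BLOCK_EVERYONE, "RogueBot", POST_URL, respects_robots=False)
print("PoliteBot crawls:", polite)
print("RogueBot crawls:", rogue)
```

Actually keeping a rogue bot out takes measures on the server side, which only the platform can put in place.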
What the Substack team think about this is one thing, but don’t we the writers have a say in the matter? What do you think? Let everyone know by completing the poll below. Substack may even take note and do something about it.
If you have read this far, please consider subscribing below. I’ll keep you informed with more interesting thoughts about how technology impacts our lives. It’s free!
This is changing. Substack has given Holly a sitemap and her pages are slowly starting to get indexed on Google.
There is now a setting to block the AI bots from your Substack Pages. Go to your Dashboard, choose Settings and scroll down to publication details to find the option.
Thanks for the tag. I appreciate your explanation here. AI is a good reason to go paid, which will probably be the only way to find quality material on the web moving forward. So much for a democratized internet. I do wish there were more regulations on AI. Its destructive potential goes further than its convenience. I'm on a panel about this topic in a few days, and I'll properly cite this article.