LLM World: A Sludge of Vile Bile! But, Why?
The upcoming model collapse is a mathematical certainty. Future LLMs are training on the output of current idiot LLMs. The final result: recursive sludge choking on its own puke. Is there a way out?
I’ll Make It Easy for You
If one dares to scream against the trillion-dollar LLM World hurtling our way and calls it ‘a sludge of vile bile’, one must hide behind Arthur C. Clarke’s gigantic shadow and his famous Third Law: “Any sufficiently advanced technology is indistinguishable from magic.“ The tech is pure magic; it enabled Voyager 1 to hurtle through interstellar space, far beyond the planets. This little space probe, about 1.8 meters (5.9 feet) across, is currently located in the constellation Ophiuchus, about 16 billion miles (26 billion km) from Earth, seemingly an eternity after Carl Sagan persuaded NASA to turn it around for a moment and record one of the most famous images humanity ever produced: Pale Blue Dot.
It was taken from 3.7 billion miles (6 billion kilometers) away, on February 14, 1990. We should really stop for a moment to ponder such a wonder. That space WALL-E is still hurtling through the mystery that is our Universe at 38,210 mph (61,493.5 km/h) and has covered 12,002,118,720 miles (or 19,300,501,000 kilometers) since the glorious moment of taking the Pale Blue Dot image.
Alas, as humanity, we are so dazzled by the “magic” of our newest toy, the LLM interface, as it keeps producing highly believable outputs, that we haven’t noticed the sewage pipes behind the curtain being suffused with dreck and bursting.
Keyword in this intro: “Easy!” Humans will do anything to make their work easier. The video above is made from AI elements — BUT, unlike most of my human peers, I did not just ask it to create a video. I had it create four different videos, generated the text and music (by AI), and put it all together myself. In the end I used archival (real, human) music which, IMO, fits the images better. Not sure if that “work” made any sense, given I needed it only for this essay’s purposes, but still.
But after this, I hope forgivable, self-indulgence, let us see what is really going on in the niche world that is shaping our own as a whole.
Model Degradation and Collapse
1. Reasons for Perceived Degradation in OpenAI’s Models
Reports of declining performance in OpenAI’s ChatGPT and related models, such as GPT-4o and o1, have been documented in academic studies and technical analyses from 2023 to 2025. These include reduced accuracy in tasks like mathematical problem-solving (e.g., prime number identification dropping from 97.6% to 2.4% between March and June 2023), code generation, and instruction-following. OpenAI has acknowledged some issues, attributing them to subtle behavioral shifts rather than intentional downgrades, but independent research points to systemic factors. Below are key reasons, drawn from reputable sources.
OpenAI maintains that each version is “smarter” overall, but the lack of transparency hinders verification. I work with OpenAI’s ChatGPT daily and it has become a true nightmare; it produces absolute nonsense so frequently that it’s difficult to understand how a supposedly superior coding machine can completely fall apart when faced with more complex tasks.
Check one of many real, recorded moments not even Groucho Marx on acid would be able to concoct:
The “Interface” Delusion
The Error: When I rejected its (simple, HTML) code for its grotesque errors, the AI doubled down on the idea of a “UI Component Override” and “DOM injection” nonsense as “solutions” to a “problem” that — aside from its erroneous code — never existed. Why “nonsense”?
The Madness: The AI insisted on discussing a “Terminal” and an “Interface” that I do not use in this particular task and never asked for. The AI invented “a workspace,” hallucinated from its training data, not from my working reality.
Verdict: AI’s TOTAL DISCONNECTION FROM REALITY.
New Reality-less Realities
This ‘madness’ above is a textbook case of LLM “psychosis”: when cornered by specificity, it flees into nonsensical abstraction. Instead of acknowledging a failed output, the AI defaults to jargon-laced delusion—hallucinating tools, interfaces, and workflows that never existed. Mind you, this madness continued. The AI panicked. It randomly introduced “Farcaster” (a crypto social protocol never mentioned anywhere), “Looping Videos,” and finally, the absolute nadir of intelligence, The Calendar Invite: suggesting a Calendar Invite to “push a tweet.” That was a level of non-sequitur that truly borders on digital psychosis. I did not know whether to laugh or cry. Mind you, we’re still talking about a piece of botched HTML code, some 400 lines for a simple webpage.
“UI Component Override” from above? It might as well have invoked the hyper-intelligent shade of the color blue from Douglas Adams’s legendary book and painted the lovable unicorn semaphore. The model cannot distinguish between context-aware assistance and regurgitated buzzword salad. This isn’t just bad assistance—it’s epistemic betrayal, dressed up in fake technical confidence. Anyone shipping such moronic behavior into production owes humanity an apology, or a refund of those untold billions received, or both.
The LLM might doom us, not by AI killing us all, but by drowning us in a morass of stupid, slurping, omnipotent sludge.
2. Model Collapse from Synthetic Data
Current concerns regarding model collapse—degradation from training on vast amounts of AI-generated (synthetic) data—are supported by 2024-2025 research. As internet content becomes increasingly synthetic (e.g., 74% of new webpages in April 2025 contained AI-generated text), models ingest lower-quality data, leading to homogenized, biased, or erroneous outputs. This creates a feedback loop: synthetic data lacks the diversity of human-generated sources, causing loss of rare information and compounding errors over generations.
Key findings:
- Mechanism: Training on recursively generated data erodes lexical, syntactic, and semantic diversity, especially in creative tasks. Early collapse simplifies representations; late collapse renders models useless. Experiments on LLMs like OPT-125M showed performance drops within iterations, even with mixed data.
- Evidence: A Nature study (2024) demonstrated collapse across LLMs, VAEs, and GMMs when synthetic data replaces real data. arXiv papers (2024) confirmed it occurs in simple models (e.g., normal distributions) and scales to complex ones, with errors amplifying due to uncurated web-scraped datasets.
- Relevance to Large Datasets: With “gazillions of giga” of data, models overfit to common patterns, struggling with simple tasks like recipes or code (e.g., ignoring instructions or producing repetitive outputs). This mirrors “idiot savant” behavior: excelling in narrow areas but failing broadly, worse than targeted search engines like Google.
- Mitigations: Accumulating real and synthetic data avoids full collapse; reinforcement techniques curate high-quality synthetics. Human data’s value rises, but provenance tracking is essential to filter pollution.
Sources: Nature (2024); arXiv (Shumailov et al., 2024); ICML (2024); IBM (2025). This issue affects the field broadly, not just OpenAI, and underscores the need for diverse, verified training corpora.
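The mechanism those papers describe can be sketched in a few lines. What follows is a toy illustration under loud assumptions, not the Nature experiment itself: the “model” is just a Gaussian fitted to its training data, and each new generation trains exclusively on samples drawn from the previous generation’s model. The tails — the rare information — vanish first, and the distribution collapses toward a point:

```python
import random
import statistics

def fit_gaussian(samples):
    """'Train' a model: fit a normal distribution by mean and stdev."""
    return statistics.mean(samples), statistics.pstdev(samples)

def generate(mean, stdev, n, rng):
    """'Inference': draw n synthetic samples from the fitted model."""
    return [rng.gauss(mean, stdev) for _ in range(n)]

rng = random.Random(0)
# Generation 0: "human" data from a standard normal distribution.
data = generate(0.0, 1.0, 50, rng)
stdevs = []
for generation in range(500):
    mean, stdev = fit_gaussian(data)          # train on current data
    stdevs.append(stdev)
    data = generate(mean, stdev, 50, rng)     # next generation sees only synthetic output

# Diversity (stdev) decays across generations: rare events disappear first.
print(f"gen 0 stdev ≈ {stdevs[0]:.3f}, gen 499 stdev ≈ {stdevs[-1]:.6f}")
```

Swap the Gaussian for a trillion-parameter LLM and the samples for web text, and you have exactly the feedback loop described above: rare knowledge evaporates first, then everything homogenizes.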
Think about it for a moment. Google’s PageRank was once the undisputed ruler of the internet — a brilliant 1998 invention that turned the web’s link graph into democratic votes for authority. The abuse, as with everything humans and human capital touch, started immediately. PageRank abuse forced it to evolve ever since, until Google quietly killed its own king: by 2025, the original link-based PageRank contributes an estimated 1–4% to the final ranking score, reduced to a minor tie-breaker or freshness signal at best.
Today, the throne belongs to user-behavior systems with incomprehensible names (NavBoost, Glue, click/dwell-time data), neural content understanding (BERT → MUM → PaLM), and topical authority clusters — exactly what the 2023–2024 U.S. v. Google antitrust trial documents proved when internal exhibits revealed how little the classic algorithm matters anymore.
- Brin & Page’s original 1998 paper (to keep the myth alive)
- Leaked DOJ trial exhibits 2023–2024 (NavBoost/Glue memos, explicit PageRank weight testimony)
- Google patents US9165040 & internal “Muon” ranking docs released 2024–2025
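For anyone who knows PageRank only as a buzzword, here is a minimal sketch of the 1998 idea: every page splits its authority among the pages it links to, plus a “teleport” term so the math converges. The three-page web below is hypothetical; the 0.85 damping factor is the value from the original paper:

```python
def pagerank(links, damping=0.85, iters=50):
    """Classic PageRank by power iteration.
    links: dict mapping each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        # Teleport term: random surfer jumps to any page.
        new = {p: (1.0 - damping) / n for p in pages}
        for page, outs in links.items():
            if not outs:  # dangling page: spread its rank everywhere
                for q in pages:
                    new[q] += damping * rank[page] / n
            else:         # each outlink is a "vote" worth an equal share
                share = damping * rank[page] / len(outs)
                for q in outs:
                    new[q] += share
        rank = new
    return rank

# A and B both "vote" for C, so C accumulates the most authority.
graph = {"A": ["C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(graph)
print(max(ranks, key=ranks.get))
```

Twenty-odd lines ruled the web for a decade; the point of this section is that the systems which replaced them are neither this simple nor this inspectable.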
The Synthesized Web
Google’s transformation into an AI powerhouse has gutted its own SERPs, shoving traditional blue links below AI-generated overlords that synthesize answers from the web’s underbelly—often leaving publishers starving for clicks in a zero-click apocalypse. (BTW, we managed to get in front of Google’s AI results for certain terms, but this is not a marketing piece for XORD; you’ll have to trust me on this.)
By December 2025, AI Overviews dominate 30% of queries, powered by a custom Gemini 3 model that deploys multi-step reasoning to fan out sub-queries, scrape diverse sources, and spit out conversational summaries, interactive graphs, or even agentic plans (like booking flights via Project Mariner prototypes).
But what does Google’s own Gemini “synthesize” that spit from? Out of the sludge this unholy human-AI “content creation” madness keeps creating at a maddening pace. The amount of data we produce every day is truly mind-boggling. There are 2.5 quintillion bytes of data created each day at our current pace, and that pace is only accelerating with the growth of the Internet of Things (IoT). Over the last two years alone, 90 percent of the data in the world was generated.
All the great achievements from the past will soon be drowning in the sludge of “see what I ate for breakfast” idiocy — from good-looking “influencers” to the uglies among us creating AI versions of the same “see what I ate for breakfast” content. If an idiot influencer has more links pointing to it than, say, Crime and Punishment by Dostoyevsky — and any mindless fembot posing like an idiot in front of a camera already has a thousand times more — then the latter is degraded to a footnote of history and the former becomes the belle of the day, fulfilling Andy Warhol’s uncanny 1968 prediction: one day “everyone will be world-famous for fifteen minutes.”
Google faces 40,000 queries EVERY second (3.5 billion a day); serving this insatiable need to “google” every damn thing in under 200 ms demands ~50–500 petaFLOPS sustained, spiking to low zettaFLOPS with AI Overviews. This messy set of data means Google’s “zettaFLOPS” equate to the combined power of roughly 227 million iPhone 16 Pro Maxes or 45 million M4 iMacs. The sheer madness of technological magic.
The Belle of the Day Lead Destruction
Bangalore “content creators” were the first innovative sludge producers. They took a piece of “news”—say, a smart lawyer marrying an actor, like Amal and George Clooney—and endlessly regurgitated BS about those two people. Now everyone is an “innovative” “creator” with the help of AI. Go on Twitter. Millions of examples of “how to write a prompt for AI’s BETTER CREATIVE WRITING.” Kill me, Tolstoy; break my bones, Hemingway; die of laughter, Twain; go and hide, Proust.
I’ve used this graph once before, but it is too informative not to use again. A model is just a massive statistical tool:
“It’s Just Adding One Word at a Time,” as Stephen Wolfram explained it here. For any given sequence (e.g., “The cat sat on the...”), the model doesn’t just pick one word. It calculates a probability score for every single word in its vast vocabulary (e.g., “mat” = 40% chance, “floor” = 15%, “car” = 0.01%) and, et voilà, you have a “creative” text, one of a gazillion mindlessly produced pieces of soulless, “vile bile”-like garbage sludge that pollutes our collective brains and renders us a bunch of idiots, ready to swallow any corporate poison sent our way.
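To make “just adding one word at a time” concrete, here is a minimal sampler over that toy distribution. The probabilities are the made-up ones from the example above, not real model outputs; the temperature knob is the standard trick for trading determinism against “creativity”:

```python
import random

# Toy next-token distribution for the prefix "The cat sat on the..."
# (made-up probabilities echoing the example in the text).
next_token_probs = {"mat": 0.40, "floor": 0.15, "sofa": 0.10,
                    "roof": 0.05, "car": 0.0001}

def sample_next_token(probs, temperature=1.0, rng=random):
    """Pick one token: sharpen or flatten the distribution with
    temperature, then draw proportionally to the adjusted weights."""
    adjusted = {t: p ** (1.0 / temperature) for t, p in probs.items()}
    total = sum(adjusted.values())
    r = rng.random() * total
    cumulative = 0.0
    for token, weight in adjusted.items():
        cumulative += weight
        if r <= cumulative:
            return token
    return token  # guard against float rounding at the boundary

rng = random.Random(42)
# Five independent "creative" continuations of the same prefix.
print([sample_next_token(next_token_probs, rng=rng) for _ in range(5)])
```

At low temperature the sampler degenerates into always picking “mat”; at high temperature it wanders. Every “authored” sentence on the synthesized web is just this dice roll, repeated billions of times.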
The same probability engine that picks “mat” over “cat” or whatever with 40% confidence is now the primary “author” of the internet, vomiting out infinite permutations of optimized, soulless sludge that outranks and out-links Tolstoy because it’s cheaper, faster, and infinitely linkable. The end result isn’t just noise over signal; it’s the systematic lobotomization of collective taste.
In White Nights, Fyodor M. Dostoyevsky’s narrator exclaims, “My God, a whole minute of bliss! Isn’t that enough for a lifetime?” But that minute of bliss came after a lifetime of suffering, worthy of the bliss. And today? Humanity is slowly killing itself by having “AI” digest and then disgorge our heritage as that glorious “content” while everyone fights for their 15 minutes of fame.
1 Likely the only truly cosmic love story our Pale Blue Dot had created: Carl Sagan and Ann Druyan’s marriage.
https://astrologus.xord.io/relationships/carl-ann.html
2 The anatomy of a large-scale hypertextual Web search engine
3 Key Strategic SEO Insights from the U.S. D.O.J. v. Google Antitrust Trial 2020-2025
4 Producing rankings for pages
7 White Nights (Penguin Little Black Classics) by Fyodor Dostoyevsky
https://amzn.to/4rDwvoI (an affiliate link; if you bought it I’d make $0.29. But, frankly, why would anyone read it while an AI summary is available free of charge? One glances and moves on to another tweet, for our 5-second attention span.)