If you replace a junior with #LLM and make the senior review output, the reviewer is now scanning for rare but catastrophic errors scattered across a much larger output surface due to LLM "productivity."

Shane Celis

@pseudonym TIRED: 10x developer

HIRED: 10x junior intern

ALSO TIRED: Senior developer reviewing junior's copious output.

Tristan Clément

@pseudonym Recent Microsoft update releases seem to be a great case study for that

moink

@pseudonym That and LLM code often looks very nice on the surface so it takes a lot of vigilance and thinking to find the subtle errors. Code from juniors tends to have more immediate signs of errors or wrong mental models.

midnightnettle

@xrisk @mehluv might be able to provide more insight on this, but at least when I was writing content and AI was getting integrated into our work, the expectation was for our editors to review a high volume of written content much faster. And we made many fuck-ups because of that, because it is overwhelming. I assume this might also be the case here, but I might be fully wrong. It is not just that the volume of code written is high, but also that the expected pace of reviewing is accelerated. Because what is the point of automating stuff if the reviewing process neutralizes the gains?

Malstrøm :damnified:🧉

@xrisk @pseudonym Volume is a key factor here. But even if the volume was the same, LLMs are doomed to stagnate as devs—whose code was scraped for training data—are displaced.

ada

@pseudonym That is why they don't replace juniors in aviation, nuclear, and radiology - only in non-critical industries.

If the cost of a potential failure times the estimated failure rate is smaller than the total labour cost of screening, interviewing, and training juniors, plus firing cultural misfits - then the business replaces them.

Not only does it save HR operating costs and internal training costs - they can also pin a mistake on a senior reviewer.

And the review model has a positive productivity trajectory, as models have a stable improvement curve, unlike humans.
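
The replacement rule described above (expected failure cost versus the labour cost of juniors) can be sketched as a small comparison. This is only an illustration of the stated inequality: the class name, field names, and all figures are hypothetical and do not come from the thread or from any real business.

```python
from dataclasses import dataclass


@dataclass
class ReplacementCase:
    """Hypothetical inputs for the rule of thumb; all values are made up."""
    failure_cost: float        # cost of one potential failure
    failure_rate: float        # estimated probability of that failure
    junior_labour_cost: float  # screening + interviewing + training + firing misfits

    def expected_failure_cost(self) -> float:
        # "cost of potential failure times the estimated failure rate"
        return self.failure_cost * self.failure_rate

    def business_replaces(self) -> bool:
        # The business replaces juniors only when the expected failure cost
        # is smaller than the total labour cost of hiring and training them.
        return self.expected_failure_cost() < self.junior_labour_cost


# Illustrative, invented numbers: a non-critical product vs. a safety-critical one.
web_shop = ReplacementCase(failure_cost=50_000, failure_rate=0.1,
                           junior_labour_cost=80_000)
reactor = ReplacementCase(failure_cost=1_000_000_000, failure_rate=0.001,
                          junior_labour_cost=80_000)

print(web_shop.business_replaces())
print(reactor.business_replaces())
```

Under these invented numbers the rule favours replacement for the non-critical case and not for the safety-critical one, which is the asymmetry the post points at. The sketch leaves out the review cost and the cost of false positives raised elsewhere in the thread.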

Rishav

@malstrom @pseudonym that’s an interesting claim. I don’t know enough about LLM research to make a judgement. I do know that LLMs trained on synthetic (other LLM-generated) data tend to perform worse, but have we reached the limits of what LLMs are capable of? In my limited understanding, if an LLM can “learn” fundamental programming “concepts” (the same way they can “learn” concepts across human languages — I could be wrong in my understanding here), they should (might?) be able to transfer/apply those concepts to not-before-seen domains (maybe with a bit of “reasoning” prodded in).

Moutmout

@pseudonym This.

I do a lot of "computer science labs", where students learn to write code, and they wave me down when they have questions. When their code doesn't do what they expect, it's often easy to figure out what went wrong because you can spot a bit of code that looks funky. And usually, the problem is in those few lines.

LLM code is meant to look like good code, so you don't get these little shortcuts.

toldtheworld

@pseudonym I have posed this conundrum before and the answer I received is that there is also an opportunity cost to not moving faster and the risk of a catastrophic bug may not outweigh the risk of being overtaken by competitors, especially since that was already happening before LLMs anyway.

Also, it *seems* models are improving at detecting these bugs, so they are being used to review changes, which, for the reasons you point out, they might be better at than people.

Krzysztof Sakrejda

@xrisk @malstrom @pseudonym just for clarity, LLMs don't learn concepts

Krzysztof Sakrejda

@moink @pseudonym one of the benefits of people *having* a mental model

nora 🐭 (she/her)

@hopeless @pseudonym you are suggesting that you can just layer more shit onto the shit and after enough layers of shit it becomes not shit.

Dibs

@pseudonym also, when the senior retires, who replaces them?

Max

@pseudonym This, 100%. The Glass Cage by Nicholas Carr dives into this in depth with examples from aviation, and how full automation of flight makes it harder for pilots to recover from a disaster situation.

Deborah Preuss, pcc 🇨🇦

@pseudonym @mayintoronto … and: there will be no juniors to grow into seniors.

Sir Dr Rusty o the Isle 🖤💛❤️

@ELS @avuko @pseudonym Exactly this. The #AI_Slop is growing exponentially, which in turn increases the depth and size of the slop bucket, which in turn has already degraded the quality and validity of search engine results. Some estimates have put accuracy at 20-35% *worse*. So the exponential growth of #AI_Slop is in turn DEcreasing the accuracy and value of *search* exponentially as well. Doing all of that on *bigger and faster* machines and #LLMs will only hasten the processes in play and dramatically increase the probability of truly catastrophic outcomes and consequences.

And that is the case already in play, without bringing in all the issues raised in Bender and Hanna's recent book (mandatory reading)

The AI Con: How To Fight Big Tech's Hype and Create the Future We Want, by Emily M. Bender and Alex Hanna

My first encounter with so-called "artificial intelligence" was in 1964-5 as an undergrad psychology student in a (snail mail) exchange with one of the pioneer researchers at Stanford. I've been involved in parts of it and tracked it ever since. It is critical to understand that it has taken OVER 60 YEARS to get to the mediocre state we are now in. It didn't happen "yesterday" or even in "the last 2 years" as some snake oil #AI_Salesmen would have everyone believe.
Time to #BeCarefulWhatYouWishFor

And it's now 2026...

The Psychotic Network Ferret

@pseudonym We are using AI in exactly the worst ways possible.

Caveat: I am a never AI-er, due to the ethical issues surrounding how training data is gathered, the severe ecological and economic impacts, and the fact that deepfakes are objectively making the world a shittier place.

But pretend for a second that none of those are a problem anymore. We are still using AI wrong. You don't have it produce a mountain of code and have a human review it. You still use humans to produce the code, and have AI help other humans to review it. AI isn't terribly good at writing code, but it has been shown to be effective at finding a few classes of bugs humans are typically very bad at finding.

But that won't allow you to fire people and replace them with monkeys on typewriters, so it'll never happen.

⁂iwein⁂

@robinadams yes

I'm not sure if this is a but or an and...

The recent @squads blogpost by @EmmaDelescolle and @Tiziano notes that LLMs are good at reviews.

In an LLM-friendly context, seniors will delegate shit work to an LLM of course. So now we have the horrid situation where young coders don't learn coding, and senior teaching skills atrophy. I'm sure retrospectives on this are delegated to an LLM as we speak somewhere 🤪

Isn't this just the absolutely perfect shitstorm?

@pseudonym

JWcph, Radicalized By Decency

@pseudonym - and by costs of false positives.

⁂iwein⁂

@nor4 @hopeless @pseudonym if hidden well enough, it's ok to step in it, right 🤪

Wandering Adventure Party
