More

number6 · 2026-04-12T08:38:45 1775983125

At least it is the plot of a lot of movies of this era: Dr. stragelove, wargames, colossus.

number6 · 2026-04-10T13:39:19 1775828359

Reminds me of "the front fell off"

https://www.youtube.com/watch?v=3m5qxZm_JqM

wat10000 · 2026-04-10T19:43:22 1775850202

Maybe the data disappeared because BunnyCDN towed it out of the environment.

number6 · 2026-04-12T15:27:52 1776007672

Obviously it it no longer in the environment

number6 · 2026-04-08T06:33:59 1775630039

Ten more years and our tricks become the ancient practices and rituals of the olden hack0rs; stuff like apt update and apt upgrade on a fresh server. Some of the most ancient even use apt-get, while the really old ones will scold you for using Ubuntu

mech422 · 2026-04-08T06:56:19 1775631379

Hey! I resemble that comment :-P (I still tend to use apt-get out of habit...)

number6 · 2026-04-06T06:59:20 1775458760

This is also my first impulse. The second was, if this happened to me, I would not be able to recover it. All the custom c tool talk... If you ask Claude Code it will code something up.

Well that he recovered the disks is amazing in itself. I would have given up and just pulled a backup.

However, I would like to see a Dev saying: why didn't you use the --<flag> which we created for this Usecase

number6 · 2026-03-29T15:59:46 1774799986

Es gibt den auch als Film. Der ist auch nicht für schwache Gemüter

number6 · 2026-03-24T06:44:04 1774334644

This would have greatly helped me. I always was at a loss which trick I had to apply to solve this exam problem, while knowing the mathematics behind it. Just at some point you had to add a zero that was actually a part of a binomial that then collapsed the whole fromula

number6 · 2026-03-24T06:40:45 1774334445

But can it count the R's in strawberry?

Paradigma11 · 2026-03-24T07:06:27 1774335987

That question is equivalent to asking a human to add the wavelengths of those two colors and divide it by 3.

snovv_crash · 2026-03-24T07:28:33 1774337313

Unless you're aware of hyperspectral image adapters for LLMs they aren't capable of that either.

szszrk · 2026-03-24T07:40:22 1774338022

Unfair - human beats AI in this comparison, as human will instantly answer "I don't know" instead of yelling a random number.

Or at best "I don't know, but maybe I can find out" and proceed to finding out/ But he is unlikely to shout "6" because he heard this number once when someone talked about light.

koliber · 2026-03-24T07:51:24 1774338684

> human will instantly answer "I don't know" instead of yelling a random number.

Seems that you never worked with Accenture consultants?

szszrk · 2026-03-24T11:45:47 1774352747

Fair.

Yet this can be filtered with fixed rules, like "output produced by corporate structures is untrusted random data".

thegabriele · 2026-03-24T09:15:06 1774343706

Why is that?

Paradigma11 · 2026-03-24T10:33:36 1774348416

Because LLMs dont have a textual representation of any text they consume. Its just vectors to them. Which is why they are so good at ignoring typos, the vector distance is so small it makes no difference to them.

Aditya_Garg · 2026-03-24T06:47:29 1774334849

yes its ridiculously good at stuff like that now. I dare you to try and trick it.

frizlab · 2026-03-24T07:01:03 1774335663

https://news.ycombinator.com/item?id=47495568

thedatamonger · 2026-03-24T07:14:53 1774336493

what bothers me is not that this issue will certainly disappear now that it has been identified, but that that we have yet to identify the category of these "stupid" bugs ...

sigmoid10 · 2026-03-24T07:32:34 1774337554

We already know exactly what causes these bugs. They are not a fundamental problem of LLMs, they are a problem of tokenizers. The actual model simply doesn't get to see the same text that you see. It can only infer this stuff from related info it was trained on. It's as if someone asked you how many 1s there are in the binary representation of this text. You'd also need to convert it first to think it through, or use some external tool, even though your computer never saw anything else.

Measter · 2026-03-24T13:38:33 1774359513

> It's as if someone asked you how many 1s there are in the binary representation of this text.

I'm actually kinda pleased with how close I guessed! I estimated 4 set bits per character, which with 491 characters in your post (including spaces) comes to 1964.

Then I ran your message through a program to get the actual number, and turns out it has 1800 exactly.

sigmoid10 · 2026-03-26T15:57:33 1774540653

>I estimated 4 set bits per character, which with 491 characters in your post (including spaces) comes to 1964

And that's exactly the kind of reasoning an LLM does when you ask it about characters in a word. It doesn't come from the word, it comes from other heuristics it picked up during training.

datsci_est_2015 · 2026-03-24T09:05:52 1774343152

Okay but, genuinely not an expert on the latest with LLMs, but isn’t tokenization an inherent part of LLM construction? Kind of like support vectors in SVMs, or nodes in neural networks? Once we remove tokenization from the equation, aren’t we no longer talking about LLMs?

fenomas · 2026-03-24T10:07:20 1774346840

It's not a side effect of tokenization per se, but of the tokenizers people use in actual practice. If somebody really wanted an LLM that can flawlessly count letters in words, they could train one with a naive tokenizer (like just ascii characters). But the resulting model would be very bad (for its size) at language or reasoning tasks.

Basically it's an engineering tradeoff. There is more demand for LLMs that can solve open math problems, but can't count the Rs in strawberry, than there is for models that can count letters but are bad at everything else.

number6 · 2026-03-24T02:38:05 1774319885

Programming will become like knitting. You buy most of your cloth of the shelf, but there is a quality to a hand made pullover, well you wouldn't want to wear it, but you love to make it.

number6 · 2026-03-24T02:34:21 1774319661

Last year I got two coworkers. My first in terms of coding. First I looked at everyone code request, but it soone overwhelmed me. We got a third and there was no way I could oversee everything and since I got a team of three management gave me other responsibilities on top.

I have no idea what they code and how they code it. I only go over the specs with them. Everything got quicker but the quality went down. I had to step in and we now have e2e-Test for everything. Maybe it's too much, but bugs got squashed and catched before we shipped.

So that's a win. Before I could test everything by hand. I worked more on things like creating a working release cycle and what tools we should use.

With or without AI the situation would have been similar.

I became a manager. We move the needle. I don't really get to code anymore and I don't see much of the code. It's strange.

number6 · 2026-03-08T19:40:15 1772998815

AI can't hold copyright (as for now) how will this affect licencing?

votepaunchy · 2026-03-14T14:00:40 1773496840

The MIT License exists because in the US you cannot make a work public domain. AI works being public domain is effectively equivalent.