More

mungoman2 · 2026-03-25T08:12:05 1774426325

What they're saying is that the error for a vector increases with r, which is true.

Trivially, with r=0, the error is 0, regardless of how heavily the direction is quantized. Larger r means larger absolute error in the reconstructed vector.

amitport · 2026-03-25T08:26:14 1774427174

Yes, the important part is that the normalized error does not increase with the dimension of the vector (which does happen when using biased quantizers)

It is expected that bigger vectors have proportionally bigger error, nothing can be done by the quantizer about that.

moktonar · 2026-03-26T07:43:09 1774510989

Except maybe storing another smaller vector for the difference with the original data an also quantize that maybe recursively

mungoman2 · 2026-03-23T06:48:15 1774248495

This is cool. It makes storage of the KV cache much smaller, making it possible to keep more of it in fast memory.

Bandwidth-wise it is worse (more bytes accessed) to generate and do random recall on than the vanilla approach, and significantly worse than a quantized approach. That’s because the reference needs to be accessed.

I guess implied is that since the KV cache is smaller, the probability is higher that the parts it that are needed are in fast memory, and that bandwidth requirements of slow links is reduced, and performance goes up.

Would be interesting with a discussion about benefits/drawbacks of the approach. Ideally backed by data.

mungoman2 · 2026-03-19T06:36:00 1773902160

Well, the spec can of course define constraints of how the function is implemented.

mungoman2 · 2026-03-17T05:26:28 1773725188

Really good. I’ve struggled with the same thing.

> Instead of expecting it to understand my requests, I almost always build tooling first to give us a shared language to discuss the project.

This is probably the key. I’ve found this to be true in general. Building simple tools that the model can use help frame the problem in a very useful way.

mungoman2 · 2026-03-13T14:14:42 1773411282

Tbh shrinking the image is probably the cheapest operation you can do that still lets every pixel influence the result. It’s just the average of all pixels, after suitable color conversion.

LoganDark · 2026-03-13T15:12:38 1773414758

The author of the article seems to assume there is no color conversion (e.g., the resizing of the image is done with sRGB-encoded values rather than converting them to linear first). Which is a stupid way to do it but I'd believe most handwritten routines are just that.

bombcar · 2026-03-13T14:22:19 1773411739

It might work decently well, but I wonder if it makes it "visually" match - sometimes the perfect average is not what our eyes see as the color.

mungoman2 · 2026-03-06T20:09:24 1772827764

This is a very fun idea. Would also be very interesting to see if one could have a system where talking to an NPC could alter the world.

One maybe obvious way would be that asking for rumors will actually creates the scenario that the NPC describes.

mungoman2 · 2026-03-04T18:49:03 1772650143

Not sure what the uptime is meant to signal. People have quite low uptime as well…

jug · 2026-03-04T19:24:17 1772652257

Huh? Servers aren't people and thus have completely different expectations, or what am I missing here

greenchair · 2026-03-04T20:58:51 1772657931

uptime signals reliability

mungoman2 · 2026-03-03T06:44:22 1772520262

I think you’re implying that it would be useful to have the LLM predict the end of the speaker’s speech, and continue with its reply based on that.

If, when the speaker actually stops speaking, there is a match vs predicted, the response can be played without any latency.

Seems like an awesome approach! One could imagine doing this prediction for the K most likely threads simultaneously, subject by computer power available, and prune/branch as some threads become inaccurate.

mungoman2 · 2026-02-09T14:51:40 1770648700

But renewable is already cheaper than fossil fuels. Why don't we see this already?

coryrc · 2026-02-09T16:07:43 1770653263

Renewable PV is the cheapest way generate electricity during daytime at appropriate latitudes.

Notice several caveats: electricity, not heat; daytime, not nighttime; only for some places on the globe.

Most energy use doesn't use electricity. It's one thing to replace an average-16%-efficient internal combustion engine with electricity and another to replace a 96%-efficient condensing boiler.

gamblor956 · 2026-02-09T21:09:11 1770671351

Solar heating has been a thing for centuries.

coryrc · 2026-02-12T21:02:52 1770930172

https://en.wikipedia.org/wiki/Drake_Landing_Solar_Community

We could take all suburban United States off of fossil fuel heating with solar heating. But that would require planning up front and cost some powerful people money, so we can't.

api · 2026-02-14T15:52:54 1771084374

The problem is industry and dense urban centers. Light residential is not the thing that’s challenging to power.

api · 2026-02-10T13:19:01 1770729541

By heat I think the parent mostly means industrial process heat, which is mostly supplied by natural gas now. Coal is still used in metallurgy.

Electric heat is rare since it’s inefficient (thermodynamics) and thus expensive but it’s used in applications where you need precision temperature control.

Of course if solar and batteries got cheap enough you could just say F it and use electric resistance heat everywhere. Time your peak production to coincide with mid day when solar is at peak.

mungoman2 · 2026-02-08T15:42:03 1770565323

IMHO, this is not really about AI, it's about setting boundaries and not overwork yourself.