> I would also expect to see it taking exponentially longer to process a prompt. I don't believe LLMs work like that.
Try this out using a local LLM. You'll see that as the conversation grows, your prompts take longer to execute. It's not exponential but it's significant. This is in fact how all autoregressive LLMs work.
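A rough way to see why: during prefill, every token attends over all the tokens before it, so the attention work grows roughly quadratically with conversation length. A minimal sketch of that scaling (the layer count and hidden size are illustrative assumptions, not any particular model's config):

```python
# Rough prefill-cost estimate for an autoregressive transformer.
# n_layers and d_model are illustrative, not a real model's numbers.

def attention_flops(context_len: int, n_layers: int = 32, d_model: int = 4096) -> int:
    """Approximate attention FLOPs to prefill a prompt of context_len tokens.

    Each of the context_len query tokens attends over up to context_len keys,
    so the attention term scales with context_len ** 2.
    """
    # ~2*n*n*d for the QK^T scores plus ~2*n*n*d for scores @ V, per layer
    # (linear projections omitted for clarity -- they only scale linearly).
    return n_layers * 4 * context_len * context_len * d_model

short = attention_flops(1_000)
long = attention_flops(10_000)
print(long / short)  # 100.0 -> 10x the context, ~100x the attention work
```

That quadratic term is why a long-running local conversation visibly slows down even when generation speed per token stays similar.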
Yesterday I was playing around with Gemma4 26B A4B with a 3 bit quant and sizing it for my 16GB 9070XT:
Total VRAM: 16GB
Model: ~12GB
128k context size: ~3.9GB
At least I'm pretty sure I landed on 128k... might have been 64k. Regardless, you can see the massive weight (ha) of the meager context size (at least compared to frontier models).
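That context-size line item is essentially the KV cache, which grows linearly with context length. A back-of-the-envelope sketch, using a hypothetical architecture (the layer/head counts and the 8-bit cache are assumptions for illustration, not Gemma's actual config):

```python
def kv_cache_bytes(context_len: int,
                   n_layers: int = 32,
                   n_kv_heads: int = 4,   # assumes grouped-query attention
                   head_dim: int = 128,
                   bytes_per_elem: int = 1) -> int:  # 8-bit quantized cache
    """Approximate KV-cache size: keys + values for every layer and token."""
    # Factor of 2 covers both the key and the value tensors.
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem * context_len

gib = kv_cache_bytes(131_072) / 2**30  # 128k tokens
print(f"{gib:.1f} GiB")  # 4.0 GiB -- same ballpark as the ~3.9GB above
```

Doubling any of context length, layer count, or KV heads doubles the cache, which is why frontier models with huge contexts need so much memory just for the cache.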
> As a user, I _expect_ the cost of resuming X hours/days later to be no different to resuming seconds or minutes later.
As an informed user who understands his tools, I of course expect large uncached conversations to massively eat into my token budget, since that's how all of the big LLM providers work. I also understand these providers are businesses trying to make money and they aren't going to hold every conversation in their caches indefinitely.
I'd hazard a guess that there's a large gulf between the proportion of users who know as much as you and the total number using these tools. The fact that a message can perform wildly differently (in either cost, or behaviour if using one of the mitigations) based on whether I send it at t vs t+1 seems like a major UX issue, especially given t is very likely not exposed in the UI.
Haven't had a chance to test 4.7 much, but one of my pet peeves with 4.6 is how eager it is to jump into implementation. Though maybe 4.7 is smarter about this now.
The system prompt is always loaded in its entirety IIUC. It's technically possible to modify it during a conversation but that would invalidate the prefill cache for the big model providers.
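One common way to implement prefix caching is to key cache entries on a running hash of the token stream from the very start. Since the system prompt is the first segment, editing it changes every downstream key and forces a full re-prefill. A minimal sketch of that keying idea (an illustration of the scheme, not any provider's actual implementation):

```python
import hashlib

def prefix_keys(messages):
    """Yield one cache key per message, each covering the conversation so far."""
    h = hashlib.sha256()
    for msg in messages:
        h.update(msg.encode())
        yield h.hexdigest()  # key for the prefix ending at this message

convo_a = ["SYSTEM: be helpful", "USER: hi", "ASSISTANT: hello"]
convo_b = ["SYSTEM: be terse",   "USER: hi", "ASSISTANT: hello"]

keys_a = list(prefix_keys(convo_a))
keys_b = list(prefix_keys(convo_b))

# Changing only the system prompt changes every prefix key,
# so no cached KV state can be reused:
print(all(a != b for a, b in zip(keys_a, keys_b)))  # True
```

The same property explains why appending to a conversation is cheap (all earlier prefixes still hit) while editing anything near the top is expensive.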
Nope, the original tariffs were under IEEPA, then the Supreme Court ruled they didn't have authority to use IEEPA, so they had to drop those tariffs and start working on refunds. It'd only have been illegal if they kept the tariffs after the ruling.
A lot of propaganda and emotion around this straightforward chain of events.
Under this reasoning, it's not illegal to just take things from stores (stores hate this one simple trick). If you're caught and your specific actions are then adjudicated to be illegal, at that point you can just start making a plan to bring the items back (even if some are used/damaged/etc) and everything is fine.
In reality of course, the actions were illegal the whole time. The big festering problem is that there is no actual punishment for government agents who break the law.
The existence of case law / precedent does not affect whether something is "already illegal", but rather only how strongly one can predict if something is illegal. The original tariffs were illegal from day 1.
The point of the analogy was exactly to point at something with a lot of case law where this dynamic is crystal clear (although if Trump starts petty shoplifting after he's done looting our government, it's even odds whether this corrupt "court" will find some way to excuse it. Anything for the cause, of course)
Wikipedia on IEEPA: "An Act with respect to the powers of the President in time of war or national emergency"?
I mean, that's very wishy-washy. So are we both aligned that it looks like misuse? Because if it's only about a word definition (no, what he did isn't illegal, but it is clear misconduct), then it feels like word play.
Eh, YMMV. I was using ROCm for minor AI things as far back as 2023 on an "unsupported" 6750 XT [0]. Even trained some LoRAs. Mostly the issues were that so many libs were CUDA-only.