More

drewnick · 2026-05-31T01:21:40 1780190500

Wow, this is refreshing DX compared to iterating all messages like we did back in '24.

ghrl · 2026-05-31T14:04:35 1780236275

I would disagree. Having all the messages locally and sending them with the request means you can switch inference providers or even models mid-conversation. It also means that the provider doesn't store the entire context, which often contains massive parts of proprietary codebases, secrets and PII and instead the agent harness manages all that. While a simple `continue thread` API field might seem more convenient, the cost is still determined by the input token count and cache rate, so it just abstracts this crucial implementation detail away.

drewnick · 2026-05-28T18:53:50 1779994430

While all these models are nondeterministic a feature bump is still necessary as the same input can have wildly different output on a new model. For API users being able to pin a model is a necessity.

drewnick · 2026-04-19T19:54:08 1776628448

Great point on the playing with Linux growing up, it's second nature to me now.

I am always feeling like I'm doing something wrong running bare metal based on modern advice, but it's low latency, simple, and reliable.

Probably because I've been using linux since Slackware in the 90s so it's second nature. And now with the CLI-based coding tools, I have a co-sysadmin to help me keep things tidy and secure. It's great and I highly recommend more people try it.

drewnick · 2026-04-18T21:13:21 1776546801

The problem is a lot of this glue is proprietary by design at the various cloud services. I realize there are open source and alternative abstractions for a lot of of the same services, but there’s still quite a bit of glue if you’re on AWS, for example, and looking to move to bare metal.

But maybe I’m just thinking of the current capabilities of agents, and if we fast forward a couple years, even removing these abstractions or migrating will be very low friction.

reillyse · 2026-04-18T21:27:37 1776547657

But you can run most of the glue on your own dedicated instances.

I run k8s on a bunch of dedicated servers that are super cheap and I have all bells and whistles - just tell your coding agent to do it. You can literally design the thing you would never do yourself and it works brilliantly.

Postgres running on dedicated hardware replicated and with wal backups - easy just tell codebuff (my harness of choice) to do it. Then any number of firewalls, load balancers, bastion servers, etc. if you can imagine it , codebuff will implement it.

drewnick · 2026-04-18T21:10:31 1776546631

I too, am bravely using Claude for more DevOps. I run all of my virtual machines on proxmox atop bare metal servers I own and I’m just blown away at how quickly Claude can optimize and set up entire new networks across all of these machines. Truly feels like a coworker or well paid sysadmin.

pbgcp2026 · 2026-04-19T02:26:33 1776565593

"servers I own" - that's a temporary glitch and AI will fix it for ... someone else.

drewnick · 2026-04-17T03:22:21 1776396141

Agree. If you've ever spent serious time in the country with farmers, the level of ingenuity is impressive among many, and they benefit from it greatly. As the grandson of depression farmers, I noticed intelligence mattered a lot, even if just for survival.

drewnick · 2026-04-16T15:40:54 1776354054

Hasn't Opus 4.5 been famously consistent while 4.6 was floating all over the place?

JohnMakin · 2026-04-16T20:19:51 1776370791

I'm still on 4.5. My coworkers are describing a lot of problems I just don't have. I suspect it was some combination of the larger context window, the model itself, and various bugs like the cache miss thing reported a little while ago.

YZF · 2026-04-17T04:34:49 1776400489

For me 4.6 has been a noticeable leap in performance from 4.5. I'm not missing 4.5 at all.

drewnick · 2026-04-16T15:32:35 1776353555

I think this is more about which model you steer your coding harness to. You can also self-host a UI in front of multiple models, then you own the chat history.

drewnick · 2026-04-06T03:40:51 1775446851

Same, mostly. However today in particular Claude can't do front-end development with any competence. Never seen this before. I think the rumor is they are rolling out a new model and have to divide their infra across the new model vs the current model.

drewnick · 2026-03-31T19:48:32 1774986512

You can use Claude Code with API mode (not a sub)

simianwords · 2026-03-31T20:03:05 1774987385

fair but I'm guessing access would be limited to 20x max users or something like that. not gated by API.