Yes I suppose this is the bare minimum, but isn’t that just a nasty way to go about things? What about responsibility and decency, do we just not do that anymore?
How do you know that the LLM is correctly translating the English queries to the verifiable primitives? It seems like it's just pushing the problem to another layer.
It is kind of a fundamental risk of IMDS: the guest VMs often need some metadata about themselves, and the host has it. A hardened, network-gapped service running host-side is acceptable, possibly the best solution. I think the issue is if your IMDS is fat and vulnerable, which this article kind of alludes to.
There's also the fact that Azure's implementation doesn't require auth, so it's very vulnerable to SSRF.
You could imagine hosting the metadata service somewhere else. After all, there is nothing a node knows about a VM that the fabric doesn't. And things like certificates come from somewhere anyway; they are not on the node, so that service is just a cache.
Hosting IMDS on the host side is pretty much the only reasonable way to provide stability guarantees. It should still work even if the network is having issues.
That being said, IMDS on AWS is a dead simple key-value store. A competent developer should be able to write it in a memory-safe language in a way that can't be easily exploited.
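For a sense of how small that surface can be, here's a minimal sketch of an IMDS-style read-only key-value service (Python for brevity; the paths and values are made up, and real IMDS behavior differs):

```python
# Minimal sketch of an IMDS-style read-only metadata service.
# The METADATA contents are illustrative; in a real system they would
# be populated per-VM by the control plane.
from http.server import BaseHTTPRequestHandler, HTTPServer

METADATA = {
    "/latest/meta-data/instance-id": "i-0123456789abcdef0",
    "/latest/meta-data/local-ipv4": "10.0.0.12",
}

class MetadataHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        value = METADATA.get(self.path)
        if value is None:
            self.send_response(404)
            self.end_headers()
            return
        body = value.encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

def serve(host="127.0.0.1", port=8080):
    # Read-only, no parsing beyond the request line: almost nothing to exploit.
    HTTPServer((host, port), MetadataHandler).serve_forever()
```

The whole attack surface is one dictionary lookup per GET, which is the point: the risk isn't the key-value store itself, it's everything people bolt onto it.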
“No, there is another”—Yoda, The Empire Strikes Back :)
What you describe carries the risk that secrets end up in crash dumps and get exfiltrated.
Imagine an attacker who owns the host to some extent and can trigger exactly that: the data lands on disk first, then gets stored somewhere else.
You probably need per-tenant/per-VM encryption in your cache, since you can never protect against someone with elevated privileges from crashing or dumping your process, memory-safe or not.
Then someone can try to DoS you, etc.
Finally, it's not good practice to mix tenants' secrets in hostile multi-tenancy environments, so you probably need a cache per VM, in separate processes…
IMHO, an alternative is to keep the VM's private data inside the VMs, not on the host.
Then the real wtf is the unsecured HTTP endpoint, an open invitation for “explorations” of the host (or the SoC when they get there) on Azure.
An eBPF + signing agent helps legitimate requests but does nothing against attacks on the server itself: say you send malformed requests hoping to hit a bug; it does not matter whether they are signed or not.
This is a path to own the host, an unnecessary risk with too many moving parts.
Many VM escapes abuse a device driver, and I trust the kernel guys who write them a lot more than the people who write web servers running in-process on the host.
Removing these was a subject of intense discussions (and pushback from the owning teams), but without leaking any secret I can tell you that a lot of people didn't like the idea of a customer-facing web server on the nodes.
Of course, putting the metadata service into its own separate system is better. That's how Amazon does it with the modern AWS. A separate Nitro card handles all the networking and management.
But if you're within the classic hypervisor model, then it doesn't really matter that much. The attack surface of a simple plain HTTP key-value storage is negligible compared to all other privileged code that needs to run on the host.
Sure, each tenant needs its own instance of the metadata service, and it should be bound to listen on the tenant-specific interface. AWS also used to set the max TTL on these interfaces to 1, so the packets would be dropped by routers.
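Both ideas can be sketched in a few lines. This is a hedged illustration, not how AWS actually implements it: the loopback address and ephemeral port are placeholders for the tenant-specific link-local address a real deployment would bind.

```python
# Sketch: bind a per-tenant metadata socket and cap the IP TTL at 1,
# so any reply that somehow reaches a router is dropped, not forwarded.
import socket

def make_metadata_socket(addr="127.0.0.1", port=0):
    # In a real deployment, addr would be the tenant-specific
    # link-local address (e.g. 169.254.169.254 on that VM's interface).
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TTL=1: the first router to see the packet decrements it to 0 and drops it.
    s.setsockopt(socket.IPPROTO_IP, socket.IP_TTL, 1)
    s.bind((addr, port))
    s.listen()
    return s
```

The TTL cap is defense in depth: even if the interface binding is misconfigured, metadata responses can never travel beyond the first hop.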
Mainly for getting managed-identity access tokens for Azure APIs. In AWS you can call it to get temporary credentials for the EC2 instance's attached IAM role. In both cases, you use IMDS to get tokens/creds for identity/access management.
Client libraries usually abstract away the need to call IMDS directly by calling it for you.
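For illustration, here is roughly what those client libraries do under the hood: the AWS IMDSv2 two-step token flow and the Azure managed-identity request. The endpoints and headers follow the public docs as I remember them; nothing is actually sent here, the code only constructs the requests.

```python
# Sketch of the raw IMDS requests that SDKs issue on your behalf.
# Nothing is sent; we only build the request objects.
import urllib.parse
import urllib.request

IMDS = "http://169.254.169.254"

def aws_token_request(ttl_seconds=21600):
    # IMDSv2 step 1: PUT to obtain a session token (blocks naive SSRF,
    # since attacker-controlled GETs usually can't set method/headers).
    return urllib.request.Request(
        f"{IMDS}/latest/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )

def aws_credentials_request(token, role):
    # IMDSv2 step 2: GET temporary credentials for the attached IAM role.
    return urllib.request.Request(
        f"{IMDS}/latest/meta-data/iam/security-credentials/{role}",
        headers={"X-aws-ec2-metadata-token": token},
    )

def azure_token_request(resource="https://management.azure.com/"):
    # Azure managed-identity token; only the Metadata header is required.
    qs = urllib.parse.urlencode({"api-version": "2018-02-01", "resource": resource})
    return urllib.request.Request(
        f"{IMDS}/metadata/identity/oauth2/token?{qs}",
        headers={"Metadata": "true"},
    )
```

Note the asymmetry the thread is discussing: AWS's flow requires a PUT-obtained session token per request, while Azure's only requires a static header, which is a much weaker SSRF barrier.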
Thank you, and everyone else who responded. So then this type of service seems to be used by other cloud providers (AWS). What makes this Azure service so much more insecure than its AWS equivalent?
Having it running on host (!), and the metadata for all guest VMs stored and managed by the same memory/service (!!), with no clear security boundary (!!!).
It's like storing all your nuke launch codes in the same vault, right in the middle of Washington DC national mall. Things are okay, until they are not okay.
This is insane. When you say Azure OpenAI, do you mean like GitHub Copilot, Microsoft Copilot, hitting OpenAI's API, or some OpenAI LLM hosted on Azure that you hit through Azure? This is some real wild west crap!
I have noticed a similar bug on Copilot. I noticed a chat session with questions that I had no recollection of asking. I wonder if it's related. I brushed it off because the question was generic.
In my small sample size of a bit over 100 accidentally leaked messages, many/most of them are programming-related questions.
It's easy to brush it off as just LLM hallucinations, but Azure OpenAI actually shows me how many input tokens were billed and how many were checked by the content filter. For these leaked responses, I was only billed for 8 input tokens, yet the content filter (correctly) checked >40,000 characters of input (which was my actual prompt's size).
OP never mentioned letting the agent run as him or use his secrets. All of the issues you mention can be solved by giving the agent its own set of secrets or using basic file permissions, which are table stakes.
Back to the MCP debate: in a world where most web APIs have a schema endpoint, their own authentication and authorization mechanisms, and in many instances easy-to-install clients in the form of CLIs… why do we need a new protocol, a new server, a new whatever? KISS.
> OP never mentioned letting the agent run as him or use his secrets
That is implicit with a CLI, because it is being invoked in the user session unless the session itself has been sandboxed first. Then, for the CLI to access a protected resource, it would of course need API keys or access tokens. Sure, a user could set up a sandbox and provision agent-specific keys, but everyone could always enable 2FA, pick strong passwords, use authenticators, etc., and every org would have perfect security.
ChatGPT is a great name, though: you "chat" with the "GPT", so it's self-informing (even if you don't know what a GPT is), and it's 4 syllables that roll off the tongue well together.
RSS has no vowels, carries no information, and looks like an alphabet-soup term you might see at the doctor's office or in an HR onboarding form at a corpo.
In Japan it's now known colloquially as 「チャッピー」 ("Chappy" or "Chappie"). High praise that it has received such a shortened and personified version so quickly.
Early in my career I would build something I thought was useful, deploy it, and meet with people within the company to get them to start using it. A lot of effort for something that would have a positive impact. My manager would then schedule a meeting with me and, with a look of panic, open with, "Why didn't you tell me about this?" or "Why did you do this?" I understand now that before you start something, you need to decide who you are going to give credit to, and that person needs to be made aware that they will get credit for the project. Ideally your boss's boss's boss. Corporate cachet only exists insofar as leadership allows it to exist; you gotta play the game. Pawns don't get to take the glory for themselves.
Were you doing it on your own time? From your described “a lot of effort,” I assume it was not but please correct me if I’m wrong.
If you're being paid for your time by someone else, it's fair to notify them of how you plan to use a significant chunk of that money before you do it. Unless, of course, you were employed to _not_ do that.
I am not suggesting explaining a day or two of work. But it sounds like you’re talking weeks.
It would be like if I was expected to deliver A by the end of the quarter and instead I delivered A + B. The value gain from B was more than A. Your manager (and hopefully higher up the org) better know about B, or they will attack it as a threat.
Also, I'm not being paid for my time, I'm being paid to do a job. "Trading your time for money" is one of the most self-defeating views on work you can have. It reduces you from a worker with agency to a detached prostitute, and is harmful to both the employer and the employee.
Like it, a lot. I think the future of software is going to be unimaginably dynamic. Maybe apps will not have statically defined feature sets; they will adjust themselves around what the user wants and the data they have access to. I'm not entirely sure what that looks like yet, but things like this are a step in that direction.
> I think the future of software is going to be unimaginably dynamic.
>...I’m not entirely sure what that looks like yet, but things like this are a step in that direction.
This made me stop and think for a moment as to what this would look like as well. I'm having trouble finding it, but I think there was a post by Joe Armstrong (of Erlang) that talked about globally (as in across system boundaries, not global as in global variable) addressable functions?