Yes I suppose this is the bare minimum, but isn’t that just a nasty way to go about things? What about responsibility and decency, do we just not do that anymore?
How do you know that the LLM is correctly translating the English queries to the verifiable primitives? It seems like it's just pushing the problem to another layer.
It is kind of a fundamental risk of IMDS: the guest VMs often need some metadata about themselves, and the host has it. A hardened, network-gapped service running host-side is acceptable, possibly the best solution. I think the issue is if your IMDS is fat and vulnerable, which this article kind of alludes to.
There's also the fact that Azure's implementation doesn't require auth, so it's very vulnerable to SSRF.
You could imagine hosting the metadata service somewhere else. After all, there is nothing a node knows about a VM that the fabric doesn't. And things like certificates come from somewhere anyway; they are not on the node, so that service is just a cache.
Hosting IMDS on the host side is pretty much the only reasonable way to provide stability guarantees. It should still work even if the network is having issues.
That being said, IMDS on AWS is a dead simple key-value store. A competent developer should be able to write it in a memory-safe language in a way that can't be easily exploited.
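For a sense of how small that surface can be, here's a minimal sketch of an IMDS-style read-only key-value service (Python for brevity; the paths and values are made up, and real IMDS behavior differs):

```python
# Minimal sketch of an IMDS-style read-only metadata service.
# The METADATA contents are illustrative; in a real system they would
# be populated per-VM by the control plane.
from http.server import BaseHTTPRequestHandler, HTTPServer

METADATA = {
    "/latest/meta-data/instance-id": "i-0123456789abcdef0",
    "/latest/meta-data/local-ipv4": "10.0.0.12",
}

class MetadataHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        value = METADATA.get(self.path)
        if value is None:
            self.send_response(404)
            self.end_headers()
            return
        body = value.encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

def serve(host="127.0.0.1", port=8080):
    # Read-only, no parsing beyond the request line: almost nothing to exploit.
    HTTPServer((host, port), MetadataHandler).serve_forever()
```

The whole attack surface is one dictionary lookup per GET, which is the point: the risk isn't the key-value store itself, it's everything people bolt onto it.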
“No, there is another”—Yoda, The Empire Strikes Back :)
What you describe carries the risk that secrets end up in crash dumps and get exfiltrated.
Imagine an attacker who owns the host to some extent and can trigger exactly that: the data lands on disk first, then gets stored somewhere else.
You probably need per-tenant/per-VM encryption in your cache, since you can never protect against someone with elevated privileges from crashing or dumping your process, memory-safe or not.
Then someone can try to DoS you, etc.
Finally, it's not good practice to mix tenants' secrets in hostile multi-tenancy environments, so you probably need a cache per VM, in separate processes…
IMHO, an alternative is to keep the VM's private data inside the VMs, not on the host.
Then the real wtf is the unsecured HTTP endpoint, an open invitation for “explorations” of the host (or the SoC when they get there) on Azure.
An eBPF + signing agent helps legitimate requests but does nothing against attacks on the server itself: say you send malformed requests hoping to hit a bug; it does not matter whether they are signed or not.
This is a path to own the host, an unnecessary risk with too many moving parts.
Many VM escapes abuse a device driver, and I trust the kernel guys who write them a lot more than the people who write web servers running in-process on the host.
Removing these was a subject of intense discussions (and pushback from the owning teams), but without leaking any secret I can tell you that a lot of people didn't like the idea of a customer-facing web server on the nodes.
Of course, putting the metadata service into its own separate system is better. That's how Amazon does it with the modern AWS. A separate Nitro card handles all the networking and management.
But if you're within the classic hypervisor model, then it doesn't really matter that much. The attack surface of a simple plain HTTP key-value storage is negligible compared to all other privileged code that needs to run on the host.
Sure, each tenant needs its own instance of the metadata service, and it should be bound to listen on the tenant-specific interface. AWS also used to set the max TTL on these interfaces to 1, so the packets would be dropped by routers.
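Both ideas can be sketched in a few lines. This is a hedged illustration, not how AWS actually implements it: the loopback address and ephemeral port are placeholders for the tenant-specific link-local address a real deployment would bind.

```python
# Sketch: bind a per-tenant metadata socket and cap the IP TTL at 1,
# so any reply that somehow reaches a router is dropped, not forwarded.
import socket

def make_metadata_socket(addr="127.0.0.1", port=0):
    # In a real deployment, addr would be the tenant-specific
    # link-local address (e.g. 169.254.169.254 on that VM's interface).
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TTL=1: the first router to see the packet decrements it to 0 and drops it.
    s.setsockopt(socket.IPPROTO_IP, socket.IP_TTL, 1)
    s.bind((addr, port))
    s.listen()
    return s
```

The TTL cap is defense in depth: even if the interface binding is misconfigured, metadata responses can never travel beyond the first hop.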
Mainly for getting managed-identity access tokens for Azure APIs. In AWS you can call it to get temporary credentials for the EC2 instance's attached IAM role. In both cases, you use IMDS to get tokens/creds for identity/access management.
Client libraries usually abstract away the need to call IMDS directly by calling it for you.
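For illustration, here is roughly what those client libraries do under the hood: the AWS IMDSv2 two-step token flow and the Azure managed-identity request. The endpoints and headers follow the public docs as I remember them; nothing is actually sent here, the code only constructs the requests.

```python
# Sketch of the raw IMDS requests that SDKs issue on your behalf.
# Nothing is sent; we only build the request objects.
import urllib.parse
import urllib.request

IMDS = "http://169.254.169.254"

def aws_token_request(ttl_seconds=21600):
    # IMDSv2 step 1: PUT to obtain a session token (blocks naive SSRF,
    # since attacker-controlled GETs usually can't set method/headers).
    return urllib.request.Request(
        f"{IMDS}/latest/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )

def aws_credentials_request(token, role):
    # IMDSv2 step 2: GET temporary credentials for the attached IAM role.
    return urllib.request.Request(
        f"{IMDS}/latest/meta-data/iam/security-credentials/{role}",
        headers={"X-aws-ec2-metadata-token": token},
    )

def azure_token_request(resource="https://management.azure.com/"):
    # Azure managed-identity token; only the Metadata header is required.
    qs = urllib.parse.urlencode({"api-version": "2018-02-01", "resource": resource})
    return urllib.request.Request(
        f"{IMDS}/metadata/identity/oauth2/token?{qs}",
        headers={"Metadata": "true"},
    )
```

Note the asymmetry the thread is discussing: AWS's flow requires a PUT-obtained session token per request, while Azure's only requires a static header, which is a much weaker SSRF barrier.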
Thank you, and everyone else who responded. So then this type of service seems to be used by other cloud providers (AWS). What makes this Azure service so much more insecure than its AWS equivalent?
Having it running on host (!), and the metadata for all guest VMs stored and managed by the same memory/service (!!), with no clear security boundary (!!!).
It's like storing all your nuke launch codes in the same vault, right in the middle of Washington DC national mall. Things are okay, until they are not okay.
This is insane. When you say Azure OpenAI, do you mean like GitHub Copilot, Microsoft Copilot, hitting OpenAI's API, or some OpenAI LLM hosted on Azure that you hit through Azure? This is some real wild west crap!
I have noticed a similar bug on Copilot. I noticed a chat session with questions that I had no recollection of asking. I wonder if it's related. I brushed it off because the question was generic.
In my small sample size of a bit over 100 accidentally leaked messages, many/most of them are programming-related questions.
It's easy to brush it off as just LLM hallucinations, but Azure OpenAI actually shows me how many input tokens were billed and how many were checked by the content filter. For these leaked responses, I was only billed for 8 input tokens, yet the content filter (correctly) checked >40,000 characters of input (which was my actual prompt's size).
OP never mentioned letting the agent run as him or use his secrets. All of the issues you mention can be solved by giving the agent its own set of secrets or using basic file permissions, which are table stakes.
Back to the MCP debate: in a world where most web APIs have a schema endpoint, their own authentication and authorization mechanisms, and in many instances easy-to-install clients in the form of CLIs… why do we need a new protocol, a new server, a new whatever? KISS.
> OP never mentioned letting the agent run as him or use his secrets
That is implicit with a CLI, because it is being invoked in the user session unless the session itself has been sandboxed first. Then, for the CLI to access a protected resource, it would of course need API keys or access tokens. Sure, a user could set up a sandbox and provision agent-specific keys, but everyone could always enable 2FA, pick strong passwords, use authenticators, etc., and every org would have perfect security.
ChatGPT is a great name, though: you "chat" with the "GPT", so it's self-informing (even if you don't know what a GPT is), and it's 4 syllables that roll off the tongue well together.
RSS has no vowels, carries no information, and looks like an alphabet-soup term you might see at the doctor's office or in an HR onboarding form at a corpo.
In Japan it's now known colloquially as 「チャッピー」 ("Chappy" or "Chappie"). High praise that it has received such a shortened and personified version so quickly.
Early in my career I would build something I thought was useful, deploy it, and meet with people within the company to get them to start using it. A lot of effort for something that would have a positive impact. My manager would then schedule a meeting with me and, with a look of panic, open with, "Why didn't you tell me about this?" or "Why did you do this?" I understand now that before you start something, you need to decide who you are going to give credit to, and that person needs to be made aware that they will get credit for the project. Ideally your boss's boss's boss. Corporate cachet only exists insofar as leadership allows it to exist; you gotta play the game. Pawns don't get to take the glory for themselves.
Were you doing it on your own time? From your described “a lot of effort,” I assume it was not but please correct me if I’m wrong.
If you're being paid for your time by someone else, it's fair to notify them of how you plan to use a significant chunk of that money before you do it. Unless, of course, you were employed to _not_ do that.
I am not suggesting explaining a day or two of work. But it sounds like you’re talking weeks.
It would be like if I was expected to deliver A by the end of the quarter and instead I delivered A + B. The value gain from B was more than A. Your manager (and hopefully higher up the org) better know about B, or they will attack it as a threat.
Also, I'm not being paid for my time, I'm being paid to do a job. "Trading your time for money" is one of the most self-defeating views on work you can have. It reduces you from a worker with agency to a detached prostitute, and is harmful to both the employer and the employee.
Like it, a lot. I think the future of software is going to be unimaginably dynamic. Maybe apps will not have statically defined feature sets; they will adjust themselves around what the user wants and the data they have access to. I'm not entirely sure what that looks like yet, but things like this are a step in that direction.
> I think the future of software is going to be unimaginably dynamic.
>...I’m not entirely sure what that looks like yet, but things like this are a step in that direction.
This made me stop and think for a moment as to what this would look like as well. I'm having trouble finding it, but I think there was a post by Joe Armstrong (of Erlang) that talked about globally (as in across system boundaries, not global as in global variable) addressable functions?