The thing I most want to use this (or some other WASM Linux engine) for is running a coding agent against a virtual operating system directly in my browser.
Claude Code / Codex CLI / etc are all great because they know how to drive Bash and other Linux tools.
The browser is probably the best sandbox we have. Being able to run an agent loop against a WebAssembly Linux would be a very cool trick.
I had a play with v86 a few months ago but didn't quite get to the point where I hooked up the agent to it - here's my WIP: https://tools.simonwillison.net/v86 - it has a text input you can use to send commands to the Linux machine, which is pretty much what you'd need to wire in an agent too.
In that demo try running "cat test.lua" and then "lua test.lua".
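To make the "wire in an agent" idea concrete, here is a minimal sketch of the loop in Python. Everything in it is an assumption for illustration: `run_command` stands in for whatever sends text to the VM's input and collects the output, `next_command` stands in for a model call, and the canned outputs are invented placeholders rather than the demo's real file contents.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical types: nothing here is part of v86 or of any agent SDK.
Transcript = List[Tuple[str, str]]              # (command, output) pairs so far
RunCommand = Callable[[str], str]               # send one shell command to the VM, return its output
NextCommand = Callable[[Transcript], Optional[str]]  # pick the next command, or None to stop


def agent_loop(run_command: RunCommand, next_command: NextCommand, max_steps: int = 10) -> Transcript:
    """Alternate between asking the model for a command and running it in the VM."""
    transcript: Transcript = []
    for _ in range(max_steps):
        command = next_command(transcript)
        if command is None:  # the model decided it is done
            break
        output = run_command(command)
        transcript.append((command, output))
    return transcript


# Stand-ins so the sketch runs without a VM or a model.
def fake_vm(command: str) -> str:
    # Invented outputs, only here to make the loop observable.
    canned = {"cat test.lua": 'print("hello")', "lua test.lua": "hello"}
    return canned.get(command, "sh: command not found")


def scripted_model(transcript: Transcript) -> Optional[str]:
    plan = ["cat test.lua", "lua test.lua"]
    return plan[len(transcript)] if len(transcript) < len(plan) else None


if __name__ == "__main__":
    for command, output in agent_loop(fake_vm, scripted_model):
        print(f"$ {command}\n{output}")
```

The `max_steps` cap is a design choice of this sketch: it keeps a confused model from looping forever inside the sandbox.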
> The thing I most want to use this (or some other WASM Linux engine) for is running a coding agent against a virtual operating system directly in my browser.
Apptron uses v86 because it's fast. I would love for somebody to add 64-bit support to v86. However, Apptron is not tied to v86. We could add Bochs like c2w or even JSLinux for 64-bit, I just don't think it will be fast enough to be useful for most.
Apptron is built on Wanix, which is sort of like a Plan9-inspired ... micro hypervisor? Looking forward to a future where it ties different environments/OS's together.
https://www.youtube.com/watch?v=kGBeT8lwbo0
~20x slower for a naive recursive Fibonacci implementation in Python (1300 ms for fib(30) in this VM vs 65ms on bare metal. For comparison, CPython directly compiled to WASM without VM overhead does it in 140ms.)
~2500x slower for 1024x1024 matrix multiplication with NumPy (0.25 GFLOPS in VM vs 575 GFLOPS on bare metal).
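For anyone who wants to reproduce numbers like these, a rough sketch of the two measurements in Python follows. It is not the exact script used above, which isn't shown. The GFLOPS helper assumes the usual count of 2 * n^3 floating point operations for an n x n matrix multiply, and the 8.6 second figure is only an example input.

```python
import time


def fib(n: int) -> int:
    """Naive doubly recursive Fibonacci, a common interpreter micro-benchmark."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


def time_call(fn, *args) -> float:
    """Return wall-clock seconds for a single call."""
    start = time.perf_counter()
    fn(*args)
    return time.perf_counter() - start


def matmul_gflops(n: int, seconds: float) -> float:
    """GFLOPS for an n x n matrix multiply, counting 2 * n**3 floating point operations."""
    return 2 * n**3 / seconds / 1e9


if __name__ == "__main__":
    print(f"fib(30): {time_call(fib, 30) * 1000:.0f} ms")
    # With NumPy installed, time `a @ b` for two 1024x1024 arrays and pass the seconds in here.
    print(f"1024x1024 in 8.6 s is {matmul_gflops(1024, 8.6):.2f} GFLOPS")
```

A single timed call is noisy, so taking the best of several runs would give steadier figures.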
This is not correct. You are using WebVM here, not BrowserPod.
WebVM is based on x86 emulation and JIT compilation, which at this time lowers vector instructions as scalar. This explains the slowdowns you observe. WebVM is still much faster than v86 in most cases.
BrowserPod is based on a pure WebAssembly kernel and WebAssembly payload. Performance is close to native speed.
I run agents as a separate Linux user. So they can blow up their own home directory, but not mine. I think that's what most people are actually trying to solve with sandboxing.
(I assume this works on Macs too, both being Unixes, roughly speaking :)
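A small sketch of that setup in Python, assuming a dedicated account already exists. The name `agent` is hypothetical, and the helper only builds the command line rather than running it, since running it needs sudo rights.

```python
import shlex
from typing import List


def as_agent_user(command: str, user: str = "agent") -> List[str]:
    """Build an argv that runs `command` through bash as a different Unix user.

    `agent` is a hypothetical account name; create the account first.
    -u selects the target user and -H points HOME at that user's home directory.
    """
    return ["sudo", "-u", user, "-H", "--", "bash", "-c", command]


if __name__ == "__main__":
    argv = as_agent_user("ls ~")
    # Print a copy-pasteable form instead of executing it.
    print(shlex.join(argv))
```

Passing the list to `subprocess.run` would execute it. File permissions on your own home directory are what actually keep the agent out, so those are worth checking too.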
While this may be a better sandbox, actually having a separate computer dedicated to the task seems like a better solution still and you will get better performance.
Besides, prompt injection and simpler exploits should be addressed before building a virtual computer in a browser, and if you are simulating a whole computer you take a huge performance hit as another trade-off.
On the other hand using the browser sandbox that also offers a UI / UX that the foundation models have in their apps would ease their own development time and be an easy win for them.
> The thing I most want to use this (or some other WASM Linux engine) for is running a coding agent against a virtual operating system directly in my browser.
Well, there it is, the dumbest thing I'll read on the internet all week.
Most of the engineering in Linux revolves around efficiently managing hardware interfaces to build up higher-level primitives, upon which your browser builds even higher-level primitives, that you want to use to simulate an x86 and attached devices, so you can start the process again? Somewhere (everywhere), hardware engineers are weeping. I'll bet you can't name a single advantage such a system would have over cloud hosting or a local Docker instance.
Even worse, you want this so your cloud-hosted imaginary friend can boil a medium-sized pond while taking the joyful bits of software development away from you, all for the enrichment of some of the most ethically-challenged members of the human race, and the fawning investors who keep tossing other people's capital at them? Our species has perhaps jumped the shark.
> while taking the joyful bits of software development away from you
Quick question: by "joyful bits of software development," do you mean the bit where you design robust architectures, services, and their communication/data concepts to solve specific problems, or the part where you have to assault a keyboard for extended periods of time _after_ all that interesting work so that it all actually does anything?
Because I sure know which of these has been "taken from me," and it's certainly not the joyful one.
I guess I enjoy solving problems, and recognize that the devil is always in the details, so I don't get much satisfaction until I see the whole stack working in concert. I never had much esteem for "architects" who sketch some blobs on the whiteboard and then disappear. I certainly wouldn't want to be "that guy" for anyone else, and I'm not even sure I could do it to an LLM.
> Well, there it is, the dumbest thing I'll read on the internet all week.
Rude.
In case you're open to learning, here's why I think this is useful.
The big lesson we've learned from Claude Code, Codex CLI et al over the past twelve months is that the most useful tool you can provide to an LLM is Bash.
Last year there was enormous buzz around MCP - Model Context Protocol. The idea was to provide a standard for wiring tools into LLMs, then thousands of such tools could bloom.
Claude Code demonstrated that a single tool - Bash - is actually much more interesting than dozens of specialized tools.
Want to edit files without rewriting the whole thing every time? Tell the agent to use sed or perl -e or python -c.
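As a rough illustration of how small that single tool can be, here is a sketch in Python. The function name and the returned dictionary shape are my own choices for this example, not the tool definition used by Claude Code or Codex CLI.

```python
import subprocess


def run_bash(command: str, cwd: str = ".", timeout: float = 30.0) -> dict:
    """The single tool: run one Bash command and report what happened."""
    proc = subprocess.run(
        ["bash", "-c", command],
        cwd=cwd,
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return {"exit_code": proc.returncode, "stdout": proc.stdout, "stderr": proc.stderr}


if __name__ == "__main__":
    # A targeted edit through sed, rather than rewriting the whole text.
    result = run_bash("printf 'port = 8080\\n' | sed 's/8080/9090/'")
    print(result["stdout"], end="")
```

Returning the exit code and stderr along with stdout matters in practice, because the model needs to see failures in order to correct itself. This runs commands with no isolation at all, which is exactly why the sandbox question in this thread comes up.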
Look at the whole Skills idea. The way Skills work is you tell the LLM "if you need to create an Excel spreadsheet, go read this markdown file first and it will tell you how to run some extra scripts for Excel generation in the same folder". Example here: https://github.com/anthropics/skills/tree/main/skills/xlsx
That only works if you have a filesystem and Bash style tools for navigating it and reading and executing the files.
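Here is a sketch of that filesystem dependency in Python: read the instructions file, then discover the scripts that sit beside it. The `SKILL.md` file name and the `.py` / `.sh` filter are assumptions of this example, so check the linked repository for the real layout.

```python
from pathlib import Path
from typing import Dict, List


def load_skill(skill_dir: str, instructions_name: str = "SKILL.md") -> Dict[str, object]:
    """Read a skill's instructions and list the helper scripts that sit next to them."""
    root = Path(skill_dir)
    # Assumed file name; raises FileNotFoundError if the folder has no instructions file.
    instructions = (root / instructions_name).read_text(encoding="utf-8")
    scripts: List[str] = sorted(
        str(p.relative_to(root))
        for p in root.rglob("*")
        if p.is_file() and p.suffix in {".py", ".sh"}
    )
    return {"instructions": instructions, "scripts": scripts}
```

In a real agent the model does this itself with `cat` and `ls` through Bash. The helper only spells out what it needs the filesystem for.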
This is why I want Linux in WebAssembly. I'd like to be able to build LLM systems that can edit files, execute skills and generally do useful things without needing an entire locked down VM in cloud hosting somewhere just to run that application.
Here's an alternative swipe at this problem: Vercel have been reimplementing Bash and dozens of other common Unix tools in TypeScript purely to have an environment agents know how to use: https://github.com/vercel-labs/just-bash
I'd rather run a 10MB WASM bundle with a full existing Linux build in it than reimplement it all in TypeScript, personally.
I agree, bash, sed, etc. are great, but a VM running inside a browser seems like the least efficient way to access them. Even if you're stuck on Windows, Cygwin has been a thing for 30 years now, and WSL for ten or so? There should be plenty of ways to set up a sandbox without having to simulate an entire machine.
It sounds like what you're really trying to recreate is the Software Tools movement from 50 years ago, where there was a push to port the UNIX/BTL utilities to the widest possible variety of systems to establish a common programming and data manipulation environment. It was arguably successful in getting good ports available just about anywhere, evolving into GNU, etc., but it never really reached its apotheosis. That style of clear, easy-to-read-and-write software was still largely killed off by a few big industry players pushing a narrative that "enterprise" has to mean relational databases and distributed objects. It would be FASCINATING if AI coding agents are the force that brings it back.
This isn't meant to be a daily driver. I'd like the option to build systems that occasionally run filesystem agent loops on an ad-hoc basis, for any user. A browser is a really good platform for that.
So are Cygwin and WSL, though, for those who don't already have the luxury of being on Linux or UNIX (incl. MacOS). I'm sure there are uses for running full-system emulators inside a browser, but access to bash and sed and gawk doesn't seem like one of them. Seriously, if that's the best way to get access to good text manipulation tools, why aren't you ditching your entire OS?
Because bash and sed and suchlike turn out to be the most useful tools for unlocking the abilities of AI agents to do interesting things - more so than previous attempts like MCP.
> Linux RISC-V virtual machine, powered by the Cartesi Machine emulator, running in the browser via WebAssembly
> a single 32MiB WebAssembly file containing the emulator, the kernel and Alpine Linux operating system. Networking supports HTTP/HTTPS requests, but is subject to CORS restrictions
My demo here loads 12.7MB (if you watch the browser network panel) to get to a usable Linux machine, it even has Lua! https://tools.simonwillison.net/v86
But Docker is free (unless you're a fairly large business, in which case containerd is still free, and you can either pay for the front-end license or figure out how to set up one of the free alternatives), and from what perspective are the isolations available for the containerd process inferior to those available for your browser process? The former was at least designed from the ground up with security, auditing, quotas etc. in mind, and offers better per-container granular control than your browser offers per-tab.
I would argue the exact opposite: Linux is great, but it wasn't really designed with a focus on containing hostile software, and while containers have come to be a decent security barrier, they're still one kernel bug away from compromise. On the other hand, the browser is very accustomed to being the most exposed security-sensitive software on a machine, and modern browsers and wasm in particular are designed against that threat. Heck, wasm is so good for security that Mozilla started compiling components to wasm and then back into native code to get memory safety ( https://hacks.mozilla.org/2020/02/securing-firefox-with-weba... ).
tldr; devcontainers let you completely containerize your development environment. You can run them on Linux natively, or you can run them on rented computers (there are some providers, such as GitHub Codespaces) or you can also run them in a VM (which is what you will be stuck with on a Mac anyways - but reportedly performance is still great).
All CLI dev tools (including things like Neovim) work out of the box, but many/most GUI IDEs also support working with devcontainers (in this case, the GUI is usually not containerized, or at least does not live in the same container, although on Linux you can do that too with Flatpak. And for instance GitHub Codespaces runs VS Code fully in the browser for you, which is another way to sandbox it on both ends).
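For a sense of how little configuration is involved, here is a sketch in Python that writes a minimal `.devcontainer/devcontainer.json`. The `agent-sandbox` name and the default image are example values I picked, not something recommended above.

```python
import json
from pathlib import Path


def write_devcontainer(
    project_dir: str,
    image: str = "mcr.microsoft.com/devcontainers/base:ubuntu",
) -> Path:
    """Write a minimal devcontainer.json (a name plus a base image) into the project."""
    config = {"name": "agent-sandbox", "image": image}  # example values
    target = Path(project_dir) / ".devcontainer" / "devcontainer.json"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return target


if __name__ == "__main__":
    print(write_devcontainer("."))
```

Editors that support devcontainers pick this file up when you open the project folder.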
This is interesting (and I've seen it mentioned in some editors), but how do I use it? It would be great if it had bubblewrap support, so I don't have to use Docker.
Do you know if there's a cli or something that would make this easier? The GitHub org seems to be more focused on the spec.
It's normal for HN to be preoccupied with the major technical trend of the moment, and this is unquestionably the biggest technical trend in many years.
People can argue about where to insert it in the list, but it is certainly in the top 5 of many decades (smartphones, web, PCs, etc.) That's why it's inescapable.
Your complaint isn't really about simonw's comment, but rather the fact that it was heavily upvoted - in other words, you were dissenting from the community reaction to the comment. That's understandable; in fact it's a fundamental problem with forums and upvoting systems: the same few massive topics suck in all the smaller ones until we get one big ball of topic mud: https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que....
Has there ever been any other topic that was not only the subject of the majority of submissions, but also had a subset of users repeatedly butting into completely unrelated discussions to go "b-but what about <thing>? we need to talk about <thing> here too! how can I relate this to <thing>? look at my <thing> product!"?
You can't just roll in to a random post to tell people about your revolutionary new AI agent for the 50th time this week and expect them not to be at least mildly annoyed.
I'm with you, but he wasn't telling us about his agent, he was saying "this is a cool technology and I've been wanting to use it to make a thing". The thing just happened to be LLM-adjacent.
Almost all of his comments "just happen" to be LLM-adjacent. At some point it stops "just happening" and it becomes clear that certain people (or their AI bots) are frequenting discussion spaces for the sole purpose of seeking out opportunities to bring up AI and self-promote.
Simon has been here since way before LLMs were a thing, and it's fairly obvious (to me, at least) that he's genuinely excited about LLMs, he's not just spamming sales or anything.
The entire thing is just quotes and a retelling of events. The closest thing to a "take" I could find is this:
> I have no idea how this one is going to play out. I’m personally leaning towards the idea that the rewrite is legitimate, but the arguments on both sides of this are entirely credible.
Which effectively says nothing. It doesn't add anything to the discussion around the topic, informed or not, and the post doesn't seem to serve any purpose beyond existing as an excuse to be linked to and siphon attention away from the original discussion (I wonder if the sponsor banner at the top of the blog could have something to do with that...?)
Literally just a quote from his fellow member of the "never stops talking about AI" club, Karpathy. No substance, no elaboration, just something someone else said or did pasted on his blog followed by a short agreement. Again, doesn't add anything or serve any real purpose, but was for some reason submitted to HN[1], and I may be misremembering but I believe it had more upvotes/comments than the original[2] at one point.
I think my coverage of the Mark Pilgrim situation added value in that most people probably aren't aware that Mark Pilgrim removed himself from internet life in 2011, which is relevant to the chardet story.
That second Karpathy example is from my link blog. Here's my post describing how I try to add something new when I write about things on my link blog: https://simonwillison.net/2024/Dec/22/link-blog/
In the case of that Karpathy post I was amplifying the idea that "Claw" is now the generic name for that class of software, which is notable.
It's very much a bimodal distribution: an enthusiast subset and an allergic subset. It's impossible to satisfy both, but that's the dynamic of HN anyhow: guaranteed to dissatisfy everybody! It's a strange game; the only way to win is to complain.
> I mean I don’t have to remember the horrible git command line anymore
Every time I see a comment like this, I have to wonder what the heck other devs were doing. Don’t you know there were shell aliases, and snippet managers, and a ton of other tools already? I never had to commit special commands to memory, and I could always reference them faster than it takes to query any LLM.
The point I’m making is there are tons of solutions. Deterministic, fast, low-energy, customisable. Which is why I said “I have to wonder what the heck other devs were doing”. As in, have you never looked for a solution to your frustration? Hard to believe there was nothing out there before which would have improved your Git command-line experience. Like, say, one of the myriad GUI tools which exist.
> Because it’s custom there is no standard curriculum you could point me to etc.
Not true. There are tons of resources out there not only explaining the solutions but even how different people use them and why.
If I sat with you for ten minutes and you explained to me the exact difficulties you have, I’m fairly sure I could have suggested something.
Please don't cross into personal attack on this site. We ban accounts that do that, and you've unfortunately done it repeatedly in this thread. Current comment was the worst case of this by far, but https://news.ycombinator.com/item?id=47317411, for example, is also on the wrong side of the line.
Maybe not, but in the past some here have seen the blog as the product being promoted.
Even in this thread alone https://news.ycombinator.com/item?id=47314929 some commenters here are clearly annoyed with the way AI is being shoved in each place where they do not want it.
I don't care, but I can see why many here are getting tired of it.