
I can probably help but what are you looking for?

There are three areas of subject matter here (Go, Zig, SQLite).

Our affected deployment was an unusually large, vertically scaled SQLite deployment. The part causing primary concern was hitting >48 fully active cores and struggling to maintain 16k IOPS. It had a large read workload, which you can think of as almost like slowly slurping a whole set of tables, with a concurrent write workload updating mostly existing rows. Lots of JSON extension usage.

Something important to add here: Go code is largely, possibly entirely, unaffected by Zig's intrinsics, as it has its own symbols and implementations, but I didn't check whether the linker setup ended up doing anything else fancy/screwy to take over additional stuff.



Sorry for not replying sooner! Tbh, idk.

I just had a user come to me with a particular query that was much slower on my driver than on others, and it unexpectedly turned out that the cause was the way I implemented context cancellation for long-running queries.

Fixing the issue led to a twofold improvement, which I wouldn't have figured out without something to measure.

Now obviously, I can't ask you for the source code of your huge production app. But if you could give hints that would help build a benchmark, that's the best way to improve.

I just wonder which compiler-rt builtins make such a huge difference (and where), and how they end up when going through Wasm and the wazero code generator.

Maybe I could start by compiling speedtest1 with zig cc, and see if anything pops up.


We aren't using mmap (but we are using WAL), so for one thing there's a lot of buffer and cache copy work going on. I'd expect it's the old classics from string.h. Wasm is a high-level VM though, so I'd expect (provided you pass the relevant flags) it should be using e.g. the VM's memory.copy instruction, which will bottom out in e.g. V8's memcpy.

We also don't use a common driver; ours is here: https://github.com/tailscale/sqlite

SQLite in general should be mostly optimized for the targets it's often built with, so for example it'll expect everything in string.h to be insanely well optimized, but it should also expect malloc to be atrocious, and so it'll manage its own pools.


Oh tailscale! Hi! And thanks for this!

Yeah, my Wasm uses bulk memory instructions, which wazero bottoms out to Go's runtime.memmove: memory.init, memory.copy, etc. are all just runtime.memmove; memory.fill fills a small segment, then uses runtime.memmove to copy it into exponentially larger segments; etc.
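To make the memory.fill trick concrete, here is a rough sketch of the doubling approach in plain Go. This is not wazero's code: the `fill` name and the one-byte seed are my own assumptions, and the only thing it relies on is that Go's built-in `copy` on byte slices is backed by runtime.memmove.

```go
package main

import "fmt"

// fill sets every byte of buf to val.
// Hypothetical helper for illustration only, not taken from wazero.
// It seeds the first byte, then repeatedly copies the already-filled
// prefix into the region right after it, doubling the filled length
// on each pass. Each copy call is a single bulk move.
func fill(buf []byte, val byte) {
	if len(buf) == 0 {
		return
	}
	buf[0] = val
	for filled := 1; filled < len(buf); filled *= 2 {
		// copy moves min(len(dst), len(src)) bytes, so the final
		// pass is automatically trimmed to what is left of buf.
		copy(buf[filled:], buf[:filled])
	}
}

func main() {
	buf := make([]byte, 10)
	fill(buf, 7)
	fmt.Println(buf)
}
```

The point of doing it this way is that an n-byte fill needs only about log2(n) calls into the bulk copy routine instead of a byte-at-a-time loop.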

memcmp is probably slow though, the musl implementation is really naive. I shall try a simple optimization.
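For context, one common way to speed up a naive byte-by-byte memcmp is to compare a machine word at a time. The sketch below shows that idea in Go; it is an assumption about what a "simple optimization" could look like, not the change that was actually made, and the `memcmp` function here is a hypothetical stand-in for the C routine.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// memcmp compares the common prefix of a and b and returns -1, 0 or 1.
// Hypothetical illustration of word-at-a-time comparison.
func memcmp(a, b []byte) int {
	n := len(a)
	if len(b) < n {
		n = len(b)
	}
	i := 0
	// Compare 8 bytes per step. Loading as big-endian puts the first
	// byte in the most significant position, so an unsigned integer
	// comparison matches lexicographic byte order.
	for ; i+8 <= n; i += 8 {
		x := binary.BigEndian.Uint64(a[i:])
		y := binary.BigEndian.Uint64(b[i:])
		if x != y {
			if x < y {
				return -1
			}
			return 1
		}
	}
	// Handle the remaining tail (fewer than 8 bytes) one byte at a time.
	for ; i < n; i++ {
		if a[i] != b[i] {
			if a[i] < b[i] {
				return -1
			}
			return 1
		}
	}
	return 0
}

func main() {
	fmt.Println(memcmp([]byte("abcdefghij"), []byte("abcdefghik")))
}
```

In C or Wasm the same idea would use aligned word loads or SIMD instead of `encoding/binary`, but the structure (wide compare loop plus byte-wise tail) is the same.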


I got some nice 10x improvements, but they didn't really show up in benchmarks of SQLite.

I'm guessing it's mostly memset and memcpy that matter.

https://github.com/ncruces/go-sqlite3/issues/257#issuecommen...



