In case anyone else was curious how you create nonblocking file I/O, it appears ...

Argorak · on Aug 16, 2019

Note that the implementation of the runtime itself is subject to change. It's currently an unbounded threadpool.

https://github.com/async-rs/async-std/pulls?utf8=%E2%9C%93&q...
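
For readers who want to see the idea in isolation, here is a minimal sketch of "nonblocking" file I/O done by handing a blocking read to another thread. It uses only the Rust standard library and is not async-std's actual runtime code; `read_in_background` is a hypothetical helper name, and it spawns one thread per call rather than using a pool.

```rust
use std::fs;
use std::io;
use std::path::PathBuf;
use std::sync::mpsc::{self, Receiver};
use std::thread;

/// Starts a blocking file read on a helper thread and returns immediately.
/// The returned receiver yields the result once the read has finished.
/// (Hypothetical helper for illustration, not part of async-std.)
fn read_in_background(path: PathBuf) -> Receiver<io::Result<Vec<u8>>> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // This call blocks, but only the helper thread is blocked.
        let result = fs::read(&path);
        // The caller may have dropped the receiver; ignore a failed send.
        let _ = tx.send(result);
    });
    rx
}
```

The caller can keep working and check `rx.try_recv()` from time to time, or call `rx.recv()` when it finally needs the data. The read itself is still an ordinary blocking call; it has just been moved out of the caller's way.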

slovenlyrobot · on Aug 16, 2019

io_uring grew support for buffered IO in recent kernels, so we should have widespread support for this in userspace circa 2025

dmytroi · on Aug 16, 2019

Except that io_uring is threads running in the kernel.

There is no true async I/O on most (if not all) current platforms - it's all threads, either in user space or in kernel space. Sometimes even deliberately: for example, polling the disk can give better latency than waiting for an IRQ.

slovenlyrobot · on Aug 16, 2019

AFAIK Windows handles truly asynchronous buffered IO in some circumstances, but I feel once you're past the point of managing the abstraction or caring about its internal details, it doesn't really matter if there is a tiny chunk of dedicated stack in the kernel, that's a problem for the OS

wahern · on Aug 16, 2019

IOCP uses a pool of quasi-kernel threads (i.e. schedulable entities with a contiguous stack for storing state) with polling, very much like how io_uring and other incarnations of AIO in the Linux kernel work; and for that matter it's not unlike how purely user space AIO implementations work. The benefit of IOCP and io_uring is that there's one less buffer copying operation. The biggest benefit of IOCP, really, is that it's a black box that you can depend on, and one that everybody is expected to depend upon. So it can be whatever you want it to be ;)

Matthias247 · on Aug 17, 2019

> IOCP uses a pool of quasi-kernel threads

Is there any further documentation for it? I would have expected there doesn't need to be a real stack. Only state-machines for all the IO entities (like sockets) which get advanced whenever an outside event (e.g. interrupt) happens and which then signal the IO completion towards userspace. Didn't expect that it's necessary to keep stacks around.

dmytroi · on Aug 16, 2019

Yes, IOCP looks like true async I/O most of the time. In reality, though, it will block if the file is cached: if code tries to read something like 100+ MB at a time, ReadFile calls can take 200ms+. So most "async I/O" frameworks have to wrap IOCP in a user space thread ...

for_xyz · on Aug 16, 2019

Or disable caching for the specified file when calling the CreateFile API

https://docs.microsoft.com/en-us/windows/win32/api/fileapi/n...

dboreham · on Aug 16, 2019

Also VMS..

pas · on Aug 16, 2019

O_DIRECT + aio on Linux seems okay for preallocated files, no?

dmytroi · on Aug 17, 2019

If by aio you mean POSIX aio - on Linux it's implemented with user space threads and blocking I/O. POSIX aio on BSD systems is implemented as a kernel space thread (aio_write etc. are syscalls on BSD, and glibc functions on Linux).

If you mean io_submit, then yes, but in the vast majority of cases, the actual `io_submit` syscall will block, because of metadata updates, unaligned reads, etc ...

pas · on Aug 22, 2019

I initially wrote libaio but then thought it would just confuse people. :)

Yes, I mean io_submit, which is what MySQL uses.

monocasa · on Aug 16, 2019

Or just detect it and swap out the implementation. Less than a year until an Ubuntu LTS that has it.
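
A rough sketch of what "detect it and swap out the implementation" could look like, assuming detection by kernel release string. io_uring first shipped in Linux 5.1, so that is the threshold used here; the `Backend` enum and both function names are hypothetical, and a version check is only a heuristic. A more robust probe would attempt the `io_uring_setup` syscall and fall back if it fails.

```rust
/// Hypothetical I/O backends a runtime might choose between.
#[derive(Debug, PartialEq)]
enum Backend {
    IoUring,
    ThreadPool,
}

/// Extracts (major, minor) from a release string such as "5.4.0-42-generic".
fn parse_major_minor(release: &str) -> Option<(u32, u32)> {
    // Split on every non-digit character and take the first two fields.
    let mut parts = release.split(|c: char| !c.is_ascii_digit());
    let major = parts.next()?.parse().ok()?;
    let minor = parts.next()?.parse().ok()?;
    Some((major, minor))
}

/// Picks io_uring on kernels 5.1 and newer, otherwise the thread pool.
/// Unparseable release strings also fall back to the thread pool.
fn choose_backend(release: &str) -> Backend {
    match parse_major_minor(release) {
        Some(version) if version >= (5, 1) => Backend::IoUring,
        _ => Backend::ThreadPool,
    }
}
```

On Linux the release string could be read from `/proc/sys/kernel/osrelease`. Note that 5.1 is only when io_uring itself appeared; specific features such as the buffered IO support mentioned above may need a newer kernel.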

slovenlyrobot · on Aug 17, 2019

I should have written 2035 to make the sarcasm a little clearer :)

rmgraham · on Aug 16, 2019

Are those really the only options? I'm trying to wrap my head around how using a fixed size thread pool for I/O automatically implies deadlocks but I just can't. Unless the threads block on completion until their results are consumed instead of just notifying and then taking the next task..

I can definitely imagine blocking happening while waiting for a worker to be available, though. Did you mean simply blocking instead of deadlock?

spullara · on Aug 16, 2019

N threads, with N readers waiting for a message that will only come if the N+1 reader (still in the queue) gets a message first.
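
A small sketch of that scenario, using only the Rust standard library. The pool, the job names, and the `give_up` timeout are all assumptions made for illustration: the "waiter" jobs stand in for the N blocked readers, and the last queued "notifier" job stands in for the N+1 reader whose progress would release them. A real blocked socket read would have no timeout; it is there only so the demo can report the stall and let its threads exit.

```rust
use std::sync::mpsc;
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

type Job = Box<dyn FnOnce() + Send + 'static>;

/// Queues `waiters` blocking jobs followed by one notifier job on a pool of
/// `workers` threads (assumes workers >= 1). Returns true only if every
/// waiter was released before its `give_up` timeout expired.
fn waiters_released(workers: usize, waiters: usize, give_up: Duration) -> bool {
    let (job_tx, job_rx) = mpsc::channel::<Job>();
    let job_rx = Arc::new(Mutex::new(job_rx));

    // Fixed-size pool: each worker runs jobs from the shared FIFO queue.
    for _ in 0..workers {
        let job_rx = Arc::clone(&job_rx);
        thread::spawn(move || loop {
            // The lock is held only while taking the next job.
            let next = job_rx.lock().unwrap().recv();
            match next {
                Ok(job) => job(),
                Err(_) => break, // queue closed and drained
            }
        });
    }

    let (done_tx, done_rx) = mpsc::channel::<bool>();
    let mut release_txs = Vec::new();

    // The waiters: each occupies a worker until the notifier releases it.
    for _ in 0..waiters {
        let (release_tx, release_rx) = mpsc::channel::<()>();
        release_txs.push(release_tx);
        let done_tx = done_tx.clone();
        let job: Job = Box::new(move || {
            let released = release_rx.recv_timeout(give_up).is_ok();
            let _ = done_tx.send(released);
        });
        job_tx.send(job).unwrap();
    }

    // The notifier is queued last, so it needs a worker that is not
    // already stuck inside a waiter.
    let notifier: Job = Box::new(move || {
        for tx in release_txs {
            let _ = tx.send(());
        }
    });
    job_tx.send(notifier).unwrap();
    drop(job_tx);

    (0..waiters).all(|_| done_rx.recv().unwrap_or(false))
}
```

With `workers == waiters` every worker is stuck in a waiter and the notifier never gets a thread, so the call returns `false` once the timeouts fire. With one extra worker the notifier runs and the call returns `true`.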

rmgraham · on Aug 17, 2019

Thank you for humoring me. I had to sleep on it, but I can see it now. Seems like it would require a really bad design or more likely bad actors (remotes leaving dead sockets open), but it would definitely be possible.

The same scenarios would lead to resource exhaustion if the thread pool wasn't bounded.

nine_k · on Aug 17, 2019

But surely one must use an output queue, not synchronously wait for the consumer to consume a result?

spullara · on Aug 17, 2019

The N + 1 readers are all reading different sockets, blocked.

layoutIfNeeded · on Aug 16, 2019

Non-blocking I/O via threads? That’s what we used to call blocking I/O :D