In case anyone else was curious how you create nonblocking file I/O, it appears to use threads.
I am curious if the number of threads is unbounded, or if they have a bounded set but accept deadlocks, or if there is a third option other than those two that I am unaware of.
Except that io_uring is threads running in kernel.
There is no true async I/O on most (if not all) current platforms - it's all threads, either in user space or in kernel space. Sometimes even deliberately, for example polling disk will give better latency compared to waiting for IRQ.
AFAIK Windows handles truly asynchronous buffered IO in some circumstances, but I feel once you're past the point of managing the abstraction or caring about its internal details, it doesn't really matter if there is a tiny chunk of dedicated stack in the kernel, that's a problem for the OS
IOCP uses a pool of quasi-kernel threads (i.e. schedulable entity with a contiguous stack for storing state) with polling, very much like how io_uring and other incarnations of AIO in the Linux kernel work; and for that matter it's not unlike how purely user space AIO implementations work. The benefit of IOCP and io_uring is there's one less buffer copying operation. The biggest benefit of IOCP, really, is that it's a blackbox that you can depend on, and one that everybody is expected to depend upon. So it can be whatever you want it to be ;)
Is there any further documentation for it? I would have expected there doesn't need to be a real stack. Only state-machines for all the IO entities (like sockets) which get advanced whenever an outside event (e.g. interrupt) happens and which then signal the IO completion towards userspace. Didn't expect that it's necessary to keep stacks around.
Yes, IOCP looks like true async I/O most of the time. While in reality it will block if file is cached, in cases if code tries to read something like 100+ MB at a time ReadFile calls can take 200ms+. So most "async I/O" frameworks have to wrap IOCP into a user space thread ...
If by aio you mean Posix aio - on Linux it's implemented with user space threads and blocking I/O. Posix aio on BSD systems is implemented as kernel space thread (aio_write/etc are syscalls on BSD, and glibc functions on Linux).
If you mean io_submit, then yes, but in vast majority of cases, actual `io_submit` syscall will block, because of metadata updates, unaligned reads, etc ...
Are those really the only options? I'm trying to wrap my head around how using a fixed size thread pool for I/O automatically implies deadlocks but I just can't. Unless the threads block on completion until their results are consumed instead of just notifying and then taking the next task..
I can definitely imagine blocking happening while waiting for a worker to be available, though. Did you mean simply blocking instead of deadlock?
Thank you for humoring me. I had to sleep on it, but I can see it now. Seems like it would require a really bad design or more likely bad actors (remotes leaving dead sockets open), but it would definitely be possible.
The same scenarios would lead to resource exhaustion if the thread pool wasn't bounded.
I am curious if the number of threads is unbounded, or if they have a bounded set but accept deadlocks, or if there is a third option other than those two that I am unaware of.