fork+exec is efficient. The blog post compares costs without giving units. A fork (principally a page-table copy, with copy-on-write in effect) is measured in microseconds, and the exec costs whatever your binary normally takes to start up. You wouldn't want a synchronous fork/exec in the path of 5,000 reqs/sec, but it will be a trivial part of your asynchronous imagemagick processing.
At scale, you might care about the imagemagick startup latency, but not the forking.
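If you want numbers with units for your own machine, something like the following rough sketch would do. It assumes a POSIX system (os.fork is not available on Windows); the function names and iteration counts are my own illustrative choices, not anything from the blog post. It times a bare fork and then a fork followed by exec of whatever command you pass in, so the two costs can be compared separately.

```python
import os
import sys
import time


def time_fork_only(n=200):
    """Mean wall-clock seconds for fork() + immediate child exit + waitpid()."""
    start = time.perf_counter()
    for _ in range(n):
        pid = os.fork()
        if pid == 0:
            os._exit(0)  # child: exit right away, skipping cleanup handlers
        os.waitpid(pid, 0)  # parent: reap the child
    return (time.perf_counter() - start) / n


def time_fork_exec(argv, n=50):
    """Mean wall-clock seconds for fork() + exec(argv) + waitpid().

    This includes the full startup and run time of the exec'd program.
    """
    start = time.perf_counter()
    for _ in range(n):
        pid = os.fork()
        if pid == 0:
            try:
                os.execvp(argv[0], argv)  # only returns (by raising) on failure
            finally:
                os._exit(127)  # exec failed; never fall back into the parent's loop
        os.waitpid(pid, 0)
    return (time.perf_counter() - start) / n


if __name__ == "__main__":
    # Hypothetical default: "true" is assumed to be on PATH; pass your own command instead.
    argv = sys.argv[1:] or ["true"]
    print(f"fork only:     {time_fork_only() * 1e6:.0f} us")
    print(f"fork + exec:   {time_fork_exec(argv) * 1e6:.0f} us  ({' '.join(argv)})")
```

Run it with no arguments for a baseline, then with your actual image-processing command line to see how much of the total is program startup rather than the fork itself. Note that fork cost grows with the size of the parent's address space, so a small Python script will understate what a large server process would pay.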