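#include <stddef.h>  /* size_t */
#include <string.h>  /* strlen */

/* Returns (number of 's') - (number of 'p') in a NUL-terminated string. */
int run_switches(const char *buf) {
    size_t len = strlen(buf);  /* first pass: find the length */
    int res = 0;
    /* second pass: counted loop with a trip count known before entry */
    for (size_t i = 0; i < len; ++i) {
        res += (buf[i] == 's') - (buf[i] == 'p');
    }
    return res;
}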
strlen() is usually heavily optimized in libc, and once the buffer length is known before the loop starts, the compiler can autovectorize the loop, which does happen in practice: https://gcc.godbolt.org/z/qYfadPYoq
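To make the contrast concrete, here is a minimal sketch that puts the two-pass version next to a single-pass variant. The name run_switches_single_pass is hypothetical and is not from the original code; it only illustrates a loop whose exit condition depends on the data being read. As far as I know, that kind of loop is typically harder for compilers to autovectorize, but check the generated assembly for your compiler and flags rather than taking this on faith.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Two-pass version: strlen() first, then a counted loop.
 * The trip count (len) is fixed before the loop is entered. */
int run_switches(const char *buf) {
    assert(buf != NULL);
    size_t len = strlen(buf);
    int res = 0;
    for (size_t i = 0; i < len; ++i) {
        res += (buf[i] == 's') - (buf[i] == 'p');
    }
    return res;
}

/* Hypothetical single-pass variant, for comparison only.
 * The loop stops when it reads the NUL terminator, so the trip
 * count is not known until the data itself has been examined. */
int run_switches_single_pass(const char *buf) {
    assert(buf != NULL);
    int res = 0;
    for (; *buf != '\0'; ++buf) {
        res += (*buf == 's') - (*buf == 'p');
    }
    return res;
}
```

Both functions return the same value for any NUL-terminated input; the only intended difference is the loop shape the compiler sees. Pasting both into Compiler Explorer with optimizations enabled is a quick way to compare what each one compiles to.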