The other comment mentions computed gotos. I have used those with great success when writing streaming parsers that work per character (like a typical JSON or CSV parser). I can't remember exactly, but I think switching from a switch statement to computed gotos cut something like 10% from execution time, due to better branch prediction in the CPU. That is pretty huge, considering it was a pretty small change to the char dispatch.
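For anyone who hasn't seen the technique, here is a rough sketch of what per-char dispatch with computed gotos can look like. It is not the parser I was talking about: `count_fields` is a made-up helper that only counts fields in unquoted CSV text, and it assumes ASCII and a compiler with the GNU C "labels as values" extension (GCC or Clang).

```c
#include <stddef.h>

/* Hypothetical example: count the fields in unquoted, NUL-terminated CSV text.
 * Requires the GNU C "labels as values" extension and assumes ASCII. */
size_t count_fields(const char *text)
{
    /* One jump target per byte value. The ranges are written out without
     * overlap so that no initializer overrides another. */
    static const void *const dispatch[256] = {
        [0]          = &&on_end,     /* NUL terminator */
        [1 ... 9]    = &&on_other,
        ['\n']       = &&on_newline, /* byte 10 */
        [11 ... 43]  = &&on_other,
        [',']        = &&on_comma,   /* byte 44 */
        [45 ... 255] = &&on_other,
    };

    const unsigned char *p = (const unsigned char *)text;
    size_t fields = 0;
    int line_has_data = 0; /* nonzero once the current line has any byte */

/* Each handler ends with its own indirect jump instead of returning to a
 * single shared switch, which is where the branch predictor can benefit. */
#define NEXT() goto *dispatch[*p++]

    NEXT();

on_other:
    line_has_data = 1;
    NEXT();

on_comma:
    fields++; /* a comma closes one field */
    line_has_data = 1;
    NEXT();

on_newline:
    if (line_has_data)
        fields++; /* close the last field of a non-empty line */
    line_has_data = 0;
    NEXT();

on_end:
    if (line_has_data)
        fields++; /* last line had no trailing newline */
    return fields;

#undef NEXT
}
```

The equivalent switch version would loop back to one dispatch point for every character. Whether the goto version is actually faster depends on the compiler and CPU, so measure before committing to it.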
Do you mean a switch statement in pure Python that avoids certain opcodes, or a way to avoid executing opcodes in the main interpreter switch statement?
"Whenever somebody gets a deeper understanding of monads, they immediately lose the ability to explain it to others."
I don't remember where I read this, but it still holds today.