This is really just a measurement of how well represented a language is in the tokenizer's training distribution. You could have a single token equal to “public static void main”.
Well, yes. Looking beyond token efficiency, I find that the more constrained the language (stronger and richer static typing), the better and faster the LLM deals with it: fewer rounds of editing and debugging, ergo fewer tokens. C is a nightmare.
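A WordPiece-style tokenizer greedily takes the longest vocabulary entry that matches a prefix of the remaining text. So if your text starts with “public static void main”, it looks for the longest token matching that prefix. Even if “public” is a token, it will prefer to tokenize “public static” together when that is also in the vocabulary. BPE gets to a similar place by a different route: it replays its learned merges in priority order, which usually, but not always, ends at the longest available token. In both cases a multi-word token can only exist if the tokenizer does not split on whitespace before matching.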
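To make the longest-prefix idea concrete, here is a minimal sketch in Python. It is a toy, not any real tokenizer's implementation: the function name `greedy_longest_match` and both vocabularies are made up for illustration, there is no whitespace pre-tokenization, and it models the greedy WordPiece-style matching rather than BPE's merge replay.

```python
def greedy_longest_match(text, vocab):
    """Split text by repeatedly taking the longest vocab entry
    that is a prefix of the remaining text (toy illustration)."""
    tokens = []
    max_len = max(len(entry) for entry in vocab)
    i = 0
    while i < len(text):
        match = None
        # Try the longest candidate first, then shrink one character at a time.
        for end in range(min(len(text), i + max_len), i, -1):
            if text[i:end] in vocab:
                match = text[i:end]
                break
        if match is None:
            # Assumption: unknown text falls back to a single character.
            match = text[i]
        tokens.append(match)
        i += len(match)
    return tokens


# Hypothetical vocabularies, invented for this example.
SMALL_VOCAB = {"public", "public static", " static", " void", " main"}
JAVA_HEAVY_VOCAB = SMALL_VOCAB | {"public static void main"}

print(greedy_longest_match("public static void main", SMALL_VOCAB))
print(greedy_longest_match("public static void main", JAVA_HEAVY_VOCAB))
```

With the small vocabulary the text comes out as three tokens (“public static”, “ void”, “ main”), and once the whole phrase is in the vocabulary it collapses to one. Same text, different token count, purely because of what the vocabulary happened to contain.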