> Or, just maybe, strings in the file could be Unicode, encoded in say UTF-8, so...

stefco_ · on Dec 3, 2019

If your application lets users edit the text, or if you know which languages will be used, or if you don't care about capitalization, then you don't have to worry about any of those edge cases, and Unicode is useful.

kccqzy · on Dec 3, 2019

Unicode solves all that. It has case folding rules to handle capitalization differences. It has collation rules to handle sorting differences.
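
To make the case folding point concrete, here is a minimal Python sketch using only the standard library. The helper name `caseless_equal` is made up for illustration. It combines NFD normalization with `str.casefold()`, which is one reasonable way to do caseless matching, not the only one. The last lines only show that a plain sort uses code point order. Python's standard library does not implement the Unicode Collation Algorithm, so real collation would need something like ICU or the platform locale.

```python
import unicodedata


def caseless_equal(a: str, b: str) -> bool:
    """Compare two strings ignoring case and canonical composition.

    Hypothetical helper: normalize to NFD, apply Unicode case folding,
    then normalize again because folding can produce new sequences.
    """
    def fold(s: str) -> str:
        decomposed = unicodedata.normalize("NFD", s)
        return unicodedata.normalize("NFD", decomposed.casefold())

    return fold(a) == fold(b)


# lower() is not enough: the German sharp s has no single-character match.
print("Straße".lower() == "STRASSE".lower())   # False
print(caseless_equal("Straße", "STRASSE"))     # True

# Precomposed and decomposed forms of the same letter also compare equal.
print(caseless_equal("\u00c9", "e\u0301"))     # True

# A plain sort compares code points, so "Äpfel" lands after "zebra".
# Language-aware ordering needs a collation library, which is not shown here.
words = ["zebra", "Äpfel", "apple"]
print(sorted(words))
```

The double normalization is deliberate: folding a string can change which characters are adjacent, so normalizing once before folding is not always sufficient.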