> You can't effectively use one character set well for everything; different app...

zzo38computer · on Dec 3, 2019

> Or, just maybe, strings in the file could be Unicode, encoded in say UTF-8, so that the handling of all of them is uniform...

Actually, that won't work. The same character may need to look different depending on the language, capitalization rules may differ by language, sort order may differ by language, and so on.
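To make the capitalization point concrete, here is a minimal Python sketch, assuming Turkish as the language in question. `upper_tr` is a hypothetical helper written for this illustration, not a standard library function, and it only covers the dotted/dotless i pair.

```python
def upper_tr(text: str) -> str:
    """Hypothetical helper: uppercase using the Turkish dotted/dotless i rule."""
    # Turkish pairs i with İ (U+0130) and ı (U+0131) with I.
    # str.upper() already maps ı to I, so only i needs special handling.
    return text.replace("i", "\u0130").upper()


# Same input string, two different "correct" answers depending on the language.
default_upper = "istanbul".upper()    # language-independent mapping
turkish_upper = upper_tr("istanbul")  # Turkish-tailored mapping
print(default_upper, turkish_upper)
```

The string itself carries no language tag, so the caller has to know which mapping applies.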

stefco_ · on Dec 3, 2019

If your application lets users edit the text, or if you know which languages will be used, or if you don't care about capitalization, then you don't have to worry about any of those edge cases, and Unicode is useful.

kccqzy · on Dec 3, 2019

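Unicode solves all that. It has case mapping and case folding rules, including language-specific ones, to handle capitalization differences. It has the Unicode Collation Algorithm, which can be tailored per locale, to handle sorting differences.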
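A rough Python sketch of both ideas, using only the standard library. `str.casefold()` is the built-in case folding; the two sort keys are toy stand-ins invented for this example (they are not the Unicode Collation Algorithm or any real locale data), meant only to show that the same words order differently under different language conventions.

```python
import unicodedata

# Case folding: caseless comparison without knowing the language.
same_word = "Straße".casefold() == "STRASSE".casefold()

words = ["zebra", "äpple", "åka", "apple"]


def german_like_key(word: str):
    """Toy key (assumption): ignore diacritics, so ä sorts next to a."""
    decomposed = unicodedata.normalize("NFD", word.casefold())
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return (base, word)  # original word breaks ties


SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzåäö"
SWEDISH_RANK = {ch: i for i, ch in enumerate(SWEDISH_ALPHABET)}


def swedish_like_key(word: str):
    """Toy key (assumption): å, ä, ö are separate letters after z."""
    composed = unicodedata.normalize("NFC", word.casefold())
    # Unknown characters sort after the alphabet, by code point.
    return [SWEDISH_RANK.get(ch, len(SWEDISH_ALPHABET) + ord(ch)) for ch in composed]


by_code_point = sorted(words)
german_like = sorted(words, key=german_like_key)
swedish_like = sorted(words, key=swedish_like_key)
print(by_code_point, german_like, swedish_like, sep="\n")
```

In real code a collation library backed by CLDR data would replace both toy keys; either way the caller still chooses the locale.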