Anonymous
10/16/2025, 3:02:26 AM
No.106903516
>>106902081
if you read the whole article, you'll find that it has a cost. swift string handling is done not on a byte basis, nor on a codepoint basis, but on a grapheme basis. this assumes that whatever application consumes the output of your program also handles grapheme clusters the same way. it also has a measurable cost, because building grapheme clusters is way more complicated than simply iterating over codepoints.
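to make the three levels concrete, here's a minimal sketch of the same strings measured as grapheme clusters, codepoints and UTF-8 bytes. the sample strings are my own picks, not the article's, and the counts assume a reasonably recent swift (5.x):

```swift
// "e" followed by U+0301 COMBINING ACUTE ACCENT: one user-perceived character
let accented = "e\u{301}"
print(accented.count)                 // 1 grapheme cluster
print(accented.unicodeScalars.count)  // 2 codepoints
print(accented.utf8.count)            // 3 bytes

// man + ZWJ + woman + ZWJ + girl: a single emoji built from 5 codepoints
let family = "\u{1F468}\u{200D}\u{1F469}\u{200D}\u{1F467}"
print(family.count)                   // 1 grapheme cluster
print(family.unicodeScalars.count)    // 5 codepoints
print(family.utf8.count)              // 18 bytes
```

`count` has to walk the string and apply the cluster-breaking rules to get that 1, while the codepoint and byte views are much cheaper to iterate. that's the cost i mean.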
while this is what should be done for proper unicode support, not all programs do it.
unicode doesn't tell you how your grapheme will end up being displayed. some applications have bogus unicode handling and will display stray invisible characters, e.g. U+200D (zero width joiner), with a non-zero width. web browsers are notable for doing that.
some applications reserve larger width for CJK ideograms.
unicode even has 'user-defined' ranges (the private use areas) where you can put glyphs of undefined size, which has to be specified by the font used by the underlying application.
if he had run his swift program with `现` it would likely have failed, because swift doesn't control how his terminal renders that character and therefore can't know its visual width.
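a rough sketch of that mismatch: swift counts `现` as one character, but most terminals draw it two columns wide. `roughColumns` and the `wideRanges` table below are hypothetical helpers i made up for illustration, not stdlib APIs, and the ranges are only an approximation of the east asian wide blocks. the real width is still whatever the terminal and font decide.

```swift
// hypothetical table: approximate codepoint ranges that terminals usually draw 2 columns wide
let wideRanges: [ClosedRange<UInt32>] = [
    0x1100...0x115F, 0x2E80...0xA4CF, 0xAC00...0xD7A3,
    0xF900...0xFAFF, 0xFE30...0xFE4F, 0xFF00...0xFF60, 0xFFE0...0xFFE6,
]

// hypothetical helper: guess how many terminal columns a string takes
func roughColumns(_ s: String) -> Int {
    var total = 0
    for ch in s {
        // simplification: judge each grapheme cluster by its first codepoint only
        guard let first = ch.unicodeScalars.first else { continue }
        let wide = wideRanges.contains { $0.contains(first.value) }
        total += wide ? 2 : 1
    }
    return total
}

let s = "现"
print(s.count)          // grapheme clusters as swift sees them
print(roughColumns(s))  // guessed column count, only as good as the table
```

so padding or aligning by `count` comes out one column short for every character like that, and even a table like this is still just a guess about what the terminal does.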