emoji width bugs mostly come down to how terminals interpret Unicode's "grapheme clusters" vs "codepoints" vs "display cells". emoji isn't one codepoint - it's often multiple joined by zero-width joiners, variation selectors, skin tone modifiers. so the terminal asks wcwidth(), gets 1 or 2, but the actual glyph might render wider or combine into a single shape.
some emoji even change width depending on font. family emoji is like 7 codepoints, shows up as one glyph. most terminals don't track that. they just count codepoints and pray.
unless terminal is using a grapheme-aware renderer and syncs with the font's shaping engine (like freetype or coretext), it'll always guess wrong. wezterm and kitty kinda parse it right often
crackalamoo•46m ago
Yeah, unfortunately I feel like despite all the advances in Unicode tech, my modern terminal (MacOS) still bugs out badly with emojis and certain special characters.
I'm not sure how/when codepoints matter for wcwidth: my terminal handles many characters with more than one codepoint in UTF-8, like é and even Arabic characters, just fine.
mmastrac•37m ago
I'd be happy if we could get terminals to agree on how wide the warning triangle emoji renders. The emoji are certainly useful for scripts, but often widths are such a crapshoot. I cannot add width detection to every bash script I write for every emoji I want to use.
If only there was a standards body that could perhaps spec how these work in terminals.
charcircuit•23m ago
You could ship a terminal with your script. This is how apps like Slack deal with inconsistent handling of standardized content by shipping an embedded chromium.
b0a04gl•1h ago
some emoji even change width depending on font. family emoji is like 7 codepoints, shows up as one glyph. most terminals don't track that. they just count codepoints and pray.
unless terminal is using a grapheme-aware renderer and syncs with the font's shaping engine (like freetype or coretext), it'll always guess wrong. wezterm and kitty kinda parse it right often
crackalamoo•46m ago
I'm not sure how/when codepoints matter for wcwidth: my terminal handles many characters with more than one codepoint in UTF-8, like é and even Arabic characters, just fine.