I would consider splitting this task into two:
- extracting the next Unicode code point
- determining whether it’s in the code class
For the second, instead of using an automaton, one could use a perfect hash (https://en.wikipedia.org/wiki/Perfect_hash_function). That could make that part branch-free.
Is that a good idea?
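To make the question concrete, here is a rough Go sketch of that split, not a claim about how any real library does it. Step one decodes with the standard `utf8.DecodeRuneInString`; step two is a multiply-shift lookup into a 16-slot table. The class (a few vowels), the names `class`, `mul`, `slot`, `inClass`, `countInClass`, and the multiplier are all made up for illustration; the multiplier was picked by trial for this toy set only, and `init` panics if it ever collides.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// class is a hypothetical character class, chosen only for illustration.
var class = []rune{'a', 'e', 'i', 'o', 'u', 'é', 'ø', 'ü'}

// mul was picked by trial so that slot() is collision-free on class.
// It is specific to this toy set; another set needs another constant.
const mul uint32 = 13<<24 | 1

// table holds each member of class at its own slot, and -1 elsewhere.
var table [16]rune

// slot maps a code point to one of 16 slots: one multiply, one shift.
func slot(r rune) uint32 { return (uint32(r) * mul) >> 28 }

func init() {
	for i := range table {
		table[i] = -1 // never equal to a decoded rune
	}
	for _, r := range class {
		if table[slot(r)] != -1 {
			panic("mul is not a perfect hash for class")
		}
		table[slot(r)] = r
	}
}

// inClass is step two: a single table load and compare.
func inClass(r rune) bool { return table[slot(r)] == r }

// countInClass does step one (decode) and then step two per code point.
func countInClass(s string) int {
	n := 0
	for len(s) > 0 {
		r, size := utf8.DecodeRuneInString(s)
		if inClass(r) {
			n++
		}
		s = s[size:]
	}
	return n
}

func main() {
	fmt.Println(countInClass("søt frø, déjà vu")) // prints 4
}
```

The membership test has no data-dependent branch in the source, though whether the compiled code ends up branch-free is up to the compiler. A real Unicode class has thousands of members spread over ranges, so the table would be much larger and the hash would have to come from a generator, which is where the idea may or may not pay off compared to an automaton.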
mroche•2w ago
I have to say I am surprised about that. Does anyone have any context or guesses as to why this is the case?
EDIT: Go's unicode was actually updated to v17 yesterday:
https://github.com/golang/go/commit/dd39dfb534d2badf1bb2d72d...
watchful_moose•2w ago
matt3210•2w ago
neild•2w ago
Go is pretty much entirely developed in public. There are some Google-internal customizations, but none of them are particularly exciting, and almost all changes start in the open source repo and are imported from there.
LukeShu•2w ago
8n4vidtmkvmk•2w ago
tonfa•2w ago
https://www.gerritcodereview.com/about.html
fsmv•2w ago