I've made various visualizations, tried to analyze names that are going to be popular in the future, etc. I recently tried making a name recommendation system that lets you rate names, then recommends names based on your inferred preferences. It worked OK, but along the way I ended up making namex[1], which has turned out to be fun and maybe useful enough to be worth sharing with a wider audience. Give it a try!
Basic overview:
- Corpus of ~24,000 names taken from SSA data
- Any name with 15+ registrations attributed to it from 2022-24 was included
- The names are scored by an LLM (Claude Sonnet 4.5) across ~40 subjective dimensions, such as "toughness", "trendiness", "easy to spell"
- A further ~15 dimensions are computed based off of the (LLM inferred) pronunciation of the name
- E.g. "vowel rich", "ends nasal", "syllable count"
- 3 dimensions are computed from the raw SSA data related to popularity and gender distribution
- Names are then represented by a 60-dimensional vector
- User selections create a weight vector
- Names are ranked against the weight vector using fancy linear algebra (or, uh, dot products)
- Static data is loaded from server, everything else is run client-side
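For the curious, the ranking step described above can be sketched roughly like this. It is a minimal Python sketch under stated assumptions, not the actual namex code (which runs client-side): the four dimensions, the three names, their scores, and the helpers `build_weights` and `rank_names` are all made up for illustration, and the real vectors have ~60 dimensions.

```python
# Hypothetical toy data: 4 dimensions instead of ~60, 3 names instead of ~24,000.
DIMENSIONS = ["toughness", "trendiness", "easy_to_spell", "vowel_rich"]

NAME_VECTORS = {
    "Axel":   [0.9, 0.6, 0.8, 0.2],
    "Olivia": [0.2, 0.7, 0.7, 0.9],
    "Maeve":  [0.4, 0.8, 0.3, 0.6],
}


def build_weights(selections):
    """Turn user selections ({dimension: weight}) into a full weight vector.

    Dimensions the user did not touch get weight 0, so they do not
    affect the ranking. Negative weights mean "I want less of this".
    """
    weights = [0.0] * len(DIMENSIONS)
    for dimension, weight in selections.items():
        weights[DIMENSIONS.index(dimension)] = weight
    return weights


def dot(a, b):
    """Plain dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def rank_names(weights, top_k=10):
    """Score every name against the weight vector, best match first."""
    scored = [(name, dot(vector, weights)) for name, vector in NAME_VECTORS.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    # A user who wants tough names and dislikes vowel-heavy ones.
    weights = build_weights({"toughness": 1.0, "vowel_rich": -1.0})
    for name, score in rank_names(weights):
        print(f"{name}: {score:.2f}")
```

Since each name's score is a single dot product, ranking the whole corpus is cheap enough to redo on every change to the selections, which fits with running everything client-side.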
Disclaimer: there is some potential for offense to be taken at the characterization of names. The LLM was instructed to score the subjective dimensions according to the American cultural context. So what counts as, e.g., easy to spell, or which names are associated with certain cultures or religions, reflects the LLM's interpretation of that context. There are probably biases that stem from the LLM's training, from American culture, or likely both.
rushil_b_patel•1h ago
Nice work