How?
I think I've maybe occasionally seen "translit." in text used to mark that the following is transliterated, but I could see that being easily glossed over.
Korean -> English makes more sense.
For example, you wouldn't think twice if, for the Japanese word for washing machine, you saw not only "洗濯機" (which is how it's written in kanji) but also "sentakuki" or "sentakki" in the search results, because even to non-Japanese speakers it's pretty clear that that's probably the Japanese word for washing machine transliterated into Latin characters, and pretty much exactly what you'd say.
With Korean, it looks more jarring, as the input method is apparently very different, and seems to map the keys for unrelated Latin letters to Hangul letters? (I have no idea, I don't know anything about Hangul other than it's based on syllables, kind of like hiragana/katakana, and apparently very logical.)
More or less, yes. Each Hangul character represents a syllable, and is composed of two or more components (jamo) representing individual phonemes (like vowels or consonants) which make up the syllable. The keys on a Korean keyboard are mapped to those jamo.
Further details: https://en.wikipedia.org/wiki/Korean_language_and_computers
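To make the "syllable composed of jamo" point concrete, here is a minimal sketch in Python. It relies on the arithmetic layout of the Unicode Hangul Syllables block (starting at U+AC00, with 19 initial consonants, 21 vowels and 28 final slots including "no final"). The function name compose is made up for illustration.

```python
# Jamo in the order used by the Unicode Hangul Syllables block.
LEADS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"          # 19 initial consonants
VOWELS = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"      # 21 vowels
TAILS = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")  # 28 finals, "" = none


def compose(lead, vowel, tail=""):
    """Combine two or three jamo into one precomposed Hangul syllable."""
    index = (LEADS.index(lead) * 21 + VOWELS.index(vowel)) * 28 + TAILS.index(tail)
    return chr(0xAC00 + index)


print(compose("ㅎ", "ㅏ", "ㄴ"))  # 한
print(compose("ㅁ", "ㅣ"))        # 미
```

A real input method does this incrementally as keys arrive, which is why a half-typed syllable keeps changing shape on screen.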
It is probably more like a bopomofo keyboard for Chinese.
For example, instead of typing “buzhidao” to get 不知道, you just type “bzd” and pick the top suggestion. Since all the phonetic endings are gone, it does look a little cryptic, but it means if you don’t have a pinyin keyboard, you can still type something fast that is highly correlated with your actual phrase.
For example when you’re searching a movie title on your SmartTV; teenage mutant ninja turtles (similarly abbreviated tmnt) becomes rzsg; some Chinese search tools will pick up on this; whether through statistics, fuzzy matching or specific 简拼 (jiǎnpīn) support, I don’t know.
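A rough sketch of how the first-letter abbreviation could be matched, in Python. This is only an assumption about one possible approach (an exact lookup table keyed by initials), not how any particular search tool does it. The names jianpin and build_index are invented, and the pinyin syllables are supplied by hand rather than derived from the characters.

```python
def jianpin(syllables):
    """Abbreviate a phrase to the first letter of each pinyin syllable."""
    return "".join(s[0] for s in syllables if s)


def build_index(phrases):
    """Map each abbreviation to every known phrase that produces it."""
    index = {}
    for text, syllables in phrases.items():
        index.setdefault(jianpin(syllables), []).append(text)
    return index


# Hand-written, toneless pinyin for the example from the comment above.
phrases = {"不知道": ["bu", "zhi", "dao"]}
index = build_index(phrases)
print(index["bzd"])  # ['不知道']
```

Many phrases share the same initials, so a real system would still need to rank the candidates, for instance by popularity.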
BTW, this happens all the time in Korea, because it's extremely common for someone to type something while forgetting to switch to the correct input method. Try these, for example:
추ㅜ
gozjsbtm
elwmsl
vkdlTjs
You can also swear in a comedic way by just typing the Hangul key sequence in Latin letters, e.g. tlqkf
Hah, this comment is the top result when I searched with StartPage. There are a bunch of Korean results though.
https://trends.google.com/trends/explore?date=all&q=frqnce&h...
You'll notice it peaks every northern hemisphere summer. On French keyboards, Q and A are reversed compared to US keyboards, and every summer, millions of French people go on vacation, and start Google searching for things back home on unfamiliar keyboards.
It declines with the rise of the smartphone, as they're bringing their keyboards with them.
Why it suddenly spikes in the last few years, I don't know.
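The "frqnce" effect is easy to reproduce. A minimal Python sketch, assuming only the letter swaps between AZERTY and QWERTY (A/Q and Z/W); M and the punctuation keys also differ but are ignored here. The name typed_on_qwerty is made up.

```python
# Letter keys that trade places between AZERTY and QWERTY.
AZERTY_HABITS_ON_QWERTY = str.maketrans("aqzwAQZW", "qawzQAWZ")


def typed_on_qwerty(intended):
    """What appears when AZERTY muscle memory meets a QWERTY keyboard."""
    return intended.translate(AZERTY_HABITS_ON_QWERTY)


print(typed_on_qwerty("france"))  # frqnce
```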
Haven’t finished the article yet but this jumped out at me. This doesn’t ring true to me. Google runs an extortion scheme - since you can buy ads on your competitors’ trademarks, and since no users can tell ads from results (and since the organic results are now buried so far, they rarely get clicks anyway) if you don’t buy your brand keywords your competitors will get all your traffic.
As others have said, keyboard mismatches are common enough that Google might have built out logic for it specifically. But that's not necessary, and even “old school” search engines could learn these things.
The first time “alemwjsl” is searched you might not have any data, but the user will probably fix their keyboard and retype in Korean. That gives you a query correction mapping. And you can assume if query1 yields no clicks and they update to query2, q1 is a synonym for q2 and serve results for q2 instead.
Then, if a session contains a query “alemwjsl” and a click on midjourney.com and another session “midj” also contains a click on midjourney.com, those are co-clicked queries.
You can also even start to represent queries by the words in their associated clicked documents or vice versa. This helps to get around the fact that people might search “how much superbowl tickets” and “superbowl tickets price” but the official page might not contain either of those strings.
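A toy Python sketch of the two signals described above: query rewrites (no click, then a reformulated query that gets a click) and co-clicked queries. The log format, the sample sessions and the function names are all invented for illustration; a real system would also need thresholds and noise filtering.

```python
from collections import defaultdict

# Hypothetical log: each session is an ordered list of (query, clicked_url or None).
sessions = [
    [("alemwjsl", None), ("미드저니", "midjourney.com")],
    [("midj", "midjourney.com")],
    [("alemwjsl", "midjourney.com")],
]


def mine_rewrites(sessions):
    """Count q1 -> q2 pairs where q1 got no click and the next query q2 did."""
    rewrites = defaultdict(lambda: defaultdict(int))
    for events in sessions:
        for (q1, click1), (q2, click2) in zip(events, events[1:]):
            if click1 is None and click2 is not None and q1 != q2:
                rewrites[q1][q2] += 1
    return rewrites


def co_clicked(sessions):
    """Group queries by the URL they led to a click on."""
    by_url = defaultdict(set)
    for events in sessions:
        for query, url in events:
            if url is not None:
                by_url[url].add(query)
    return by_url


rewrites = mine_rewrites(sessions)
clusters = co_clicked(sessions)
print(dict(rewrites["alemwjsl"]))
print(clusters["midjourney.com"])
```

Neither function knows anything about keyboards or Korean; the association falls out of user behaviour alone.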
Of course there are more advanced methods now (neural nets) but it’s cool to see how it worked in the past.
Also, for people that don’t use bilingual keyboards this is a pretty interesting finding.
I've got nothing to add there that people haven't already been saying - this was a fascinating quirk of humanity and technology. Really good full-circle adventure uncovering the source.
I'm commenting because I have to know what you're doing with your website and blog. It looks like a markdown/obsidian/static site generator. It's gorgeous and amazing. Did you write it yourself? Is it open source software?
The string "alemwjsl" is a classic example of a keyboard input error specific to Korean users. Here is the explanation:
The Hypothesis: The "Han/Yeong" (Korean/English) Toggle Error
In South Korea, keyboards are bilingual. Users frequently switch between the Korean script (Hangul) and English (QWERTY) using a toggle key. If a user intends to type the Korean word for Midjourney (미드저니) but forgets to toggle the keyboard input from English to Korean, the output corresponds to the physical location of the keys on a standard QWERTY layout.
The Proof (Mapping the Keys)
Let’s break down the Korean word 미드저니 (Midjourney) key by key on a standard "2-Set" Korean keyboard:
미 (Mi): ㅁ corresponds to the A key. ㅣ corresponds to the L key. Result: al
드 (Deu): ㄷ corresponds to the E key. ㅡ corresponds to the M key. Result: em
저 (Jeo): ㅈ corresponds to the W key. ㅓ corresponds to the J key. Result: wj
니 (Ni): ㄴ corresponds to the S key. ㅣ corresponds to the L key. Result: sl
Put it all together: al + em + wj + sl = alemwjsl
Why this happens and why they bid on it
Muscle Memory: Midjourney is a very popular search term in Korea (113K volume for the main keyword). Thousands of users type it quickly without looking at the screen. By the time they realize they are typing in English mode, they have already hit enter or the search bar has auto-suggested the "gibberish" term.
Smart SEO/SEM Strategy:
High Intent: Anyone typing "alemwjsl" is 100% looking for "Midjourney." There is no ambiguity.
Lower Cost: While "미드저니" might have high competition (CPC 0.31), "alemwjsl" often has lower competition because many advertisers overlook "gibberish" keywords, though in this specific case, the CPC is quite similar (0.28 vs 0.31), indicating the secret is out.
Capture All Traffic: By bidding on this, Midjourney ensures that even clumsy typists find their website immediately rather than being redirected by Google to a "Did you mean...?" page or a competitor.
Conclusion: "alemwjsl" is simply 미드저니 typed with the keyboard set to English. It represents high-intent users making a very common technical mistake.
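The key-by-key mapping can be checked mechanically. Below is a small Python sketch that splits each precomposed Hangul syllable into its jamo using the Unicode block arithmetic and looks up the corresponding key on the standard 2-set (Dubeolsik) layout. The function name to_qwerty is made up, and standalone jamo or other characters are simply passed through unchanged.

```python
# QWERTY keys for each jamo on the 2-set layout, in Unicode jamo order.
# Uppercase means Shift; compound vowels and finals take two keystrokes.
LEAD_KEYS = "r R s e E f a q Q t T d w W c z x v g".split()
VOWEL_KEYS = "k o i O j p u P h hk ho hl y n nj np nl b m ml l".split()
TAIL_KEYS = [""] + "r R rt s sw sg e f fr fa fq ft fx fv fg a q qt t T d w c z x v g".split()

SYLLABLE_COUNT = 19 * 21 * 28  # size of the Hangul Syllables block


def to_qwerty(text):
    """Return the Latin letters produced by typing `text` in English mode."""
    out = []
    for ch in text:
        index = ord(ch) - 0xAC00
        if 0 <= index < SYLLABLE_COUNT:
            lead, rest = divmod(index, 21 * 28)
            vowel, tail = divmod(rest, 28)
            out.append(LEAD_KEYS[lead] + VOWEL_KEYS[vowel] + TAIL_KEYS[tail])
        else:
            out.append(ch)  # not a precomposed syllable: leave as-is
    return "".join(out)


print(to_qwerty("미드저니"))  # alemwjsl
print(to_qwerty("파이썬"))    # vkdlTjs
```

Going the other way (Latin keystrokes back to Hangul) needs a small state machine to decide where syllables break, so it is left out of this sketch.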
Not the first time ChatGPT has been inferior at such tasks.
yorwba•1mo ago
Keyboard layout mismatches are common enough that I assume Google has a layout detection stage hardcoded just like they have typo correction hardcoded. And the creators of said algorithms probably understand very well how they work. (The naïve way would be to convert from every possible layout to every other layout, but I think you could build something more lightweight using Hidden Markov Models.)
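For what it's worth, the naïve approach is only a few lines. The Python sketch below is an illustration, not a claim about what Google does: it remaps a query between every ordered pair of known layouts by physical key position and keeps the candidates that match a known query. The layout tables cover just the letter rows of QWERTY and French AZERTY, and guess_intended and known_queries are invented names. It does not attempt the Hidden Markov Model idea.

```python
from itertools import permutations

# Characters listed by physical key position, row by row (letter rows only).
LAYOUTS = {
    "qwerty": "qwertyuiop" "asdfghjkl;" "zxcvbnm",
    "azerty": "azertyuiop" "qsdfghjklm" "wxcvbn,",
}


def convert(text, actual, intended):
    """Reinterpret text typed on layout `actual` as if typed on `intended`."""
    table = str.maketrans(LAYOUTS[actual], LAYOUTS[intended])
    return text.translate(table)


def guess_intended(query, known_queries):
    """Try every layout pair; keep remappings that hit a known query."""
    hits = []
    for actual, intended in permutations(LAYOUTS, 2):
        candidate = convert(query, actual, intended)
        if candidate != query and candidate in known_queries:
            hits.append((actual, intended, candidate))
    return hits


hits = guess_intended("frqnce", {"france", "midjourney"})
print(hits)
```

With n layouts this tries n * (n - 1) remappings per query, which is the cost the comment above suggests avoiding with something more lightweight.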