Show HN: Han – A Korean programming language written in Rust

https://github.com/xodn348/han
64•xodn348•2h ago
A few weeks ago I saw a post about someone converting an entire C++ codebase to Rust using AI in under two weeks.

That inspired me — if AI can rewrite a whole language stack that fast, I wanted to try building a programming language from scratch with AI assistance.

I've also been noticing growing global interest in Korean language and culture, and I wondered: what would a programming language look like if every keyword was in Hangul (the Korean writing system)?

Han is the result. It's a statically-typed language written in Rust with a full compiler pipeline (lexer → parser → AST → interpreter + LLVM IR codegen).

It supports arrays, structs with impl blocks, closures, pattern matching, try/catch, file I/O, module imports, a REPL, and a basic LSP server.

This is a side project, not a "you should use this instead of Python" pitch. Feedback on language design, compiler architecture, or the Korean keyword choices is very welcome.

https://github.com/xodn348/han

Comments

raaspazasu•1h ago
I don't know Korean at all, but this looks cool and a fun project. I'm curious if this reduces typing or has any benefits being in Hangul vs Latin?
xodn348•1h ago
Thanks! One thing that motivated me was curiosity about prompt efficiency in the AI era. Hangul is beautifully dense — a single syllable block packs initial consonant + vowel + final consonant into one character. I wondered if Korean-keyword code might produce shorter prompts for LLMs.

I actually tested this with GPT-4o's tokenizer, and the result was the opposite — Korean keywords average 2-3 tokens vs 1 for English. A fibonacci program in Han takes 88 tokens vs 54 in Python.

The reason comes down to how LLM tokenizers work. They use BPE (Byte Pair Encoding), which starts with raw bytes and repeatedly merges the most frequent pairs into single tokens. Since training data is predominantly English, words like `function` and `return` appear billions of times and get merged into single tokens.

Korean text appears far less frequently, so the tokenizer doesn't learn to merge Hangul syllables — it falls back to splitting each character into 2-3 byte-level tokens instead.

It's a tokenizer training bias, not a property of Hangul itself. If a tokenizer were trained on a Korean-heavy corpus, `함수` could absolutely become a single token too.
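A rough sketch of the merge process described above, using a toy byte-level BPE trainer rather than GPT-4o's real tokenizer. The word counts are made up to mimic an English-heavy corpus, and the helper names (`train_bpe`, `encode`, `merge_pair`) are illustrative only.

```python
from collections import Counter


def to_bytes(word):
    """Split a word into single-byte tokens (the BPE starting point)."""
    return [bytes([b]) for b in word.encode("utf-8")]


def merge_pair(tokens, left, right):
    """Replace every adjacent (left, right) occurrence with one fused token."""
    out, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and tokens[i] == left and tokens[i + 1] == right:
            out.append(left + right)
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out


def train_bpe(word_counts, max_merges):
    """Learn merge rules by repeatedly fusing the most frequent adjacent pair."""
    words = {w: to_bytes(w) for w in word_counts}
    merges = []
    for _ in range(max_merges):
        pairs = Counter()
        for w, toks in words.items():
            for pair in zip(toks, toks[1:]):
                pairs[pair] += word_counts[w]
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        if pairs[best] < 2:
            break  # pairs seen only once are not worth a merge rule
        merges.append(best)
        words = {w: merge_pair(t, *best) for w, t in words.items()}
    return merges


def encode(word, merges):
    """Tokenize a word by replaying the learned merges in order."""
    toks = to_bytes(word)
    for left, right in merges:
        toks = merge_pair(toks, left, right)
    return toks


# Hypothetical corpus: English keywords are common, the Korean one is rare.
word_counts = {"function": 1000, "return": 1000, "함수": 1}
merges = train_bpe(word_counts, max_merges=20)

print(len(encode("function", merges)))  # 1
print(len(encode("함수", merges)))       # 6 (two syllables, 3 UTF-8 bytes each)
```

Swapping the counts so that `함수` is the frequent word makes it collapse into a single token instead, which is the training-bias point: the outcome follows the corpus, not the script.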

So no efficiency benefit today. But it was a fun exploration, and Korean speakers can read the code like natural language. It could also be a fun way for people learning Korean to practice reading Hangul in a different context — every keyword is a real Korean word with meaning.

topce•1h ago
Very Interesting...

I had a similar idea: train an LLM on Serbian, and even create a new encoding, https://github.com/topce/YUTF-8, inspired by YUSCII. I did not have the time or money ;-) Great that you succeeded. The idea is that a model trained on Serbian text encoded in YUTF-8 (not UTF-8) would need fewer tokens for a prompt in Serbian than for one in English. Serbian Cyrillic characters are also 1 byte in YUTF-8 instead of 2 in UTF-8. Serbian is phonetic, so we never ask how to spell a word, and it is written in both Latin and Cyrillic letters.
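For reference, the UTF-8 side of that comparison is easy to check. This sketch only measures standard UTF-8; it says nothing about how YUTF-8 itself lays out bytes, and the sample characters and the `utf8_len` helper are picked here for illustration.

```python
def utf8_len(text):
    """Number of bytes the text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


samples = {
    "ASCII Latin": "s",
    "Serbian Latin": "š",
    "Serbian Cyrillic": "ш",
    "Hangul syllable": "한",
}

for label, ch in samples.items():
    print(f"{label}: {ch!r} -> {utf8_len(ch)} byte(s)")
```

A byte-level tokenizer starts from these byte counts, so any script outside ASCII begins with more units per character before a single merge has been learned.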

xodn348•53m ago
Really interesting approach — attacking token efficiency at the encoding level is more fundamental than what I did.

Even without retraining BPE from scratch, starting with YUTF-8 and measuring how existing tokenizers handle it would already be a worthwhile experiment.

Hope you find the time to build it, good luck!

ralferoo•39m ago
I don't know how to read Hangul (I know the general idea about how the character is composed). To me just looking at the examples, it doesn't seem as obvious what the structure of the code is, compared to Latin letters and punctuation. Actually, most punctuation looked OK, but the first couple of examples used arrays and [ and ] seemed to just blend in with the identifiers wherever they appeared. I'm not sure how distinct they look with familiarity with Hangul characters. I'm sure it's also nothing that colour syntax highlighting wouldn't make easier.
xodn348•31m ago
Fair point that [ ] can blend in.

For Korean readers the character systems look quite different, but I can see how it's hard to parse visually without familiarity.

As you said, syntax highlighting helps a lot — there's a colored screenshot at the top of the README showing how it looks in practice.

bbrodriguez•54m ago
Korean doesn’t reduce typing compared to English, in my experience. What looks like a “character” is actually a syllable block called an “eumjeol” that’s made up of consonants (moeum) and vowels (jaeum). You can’t have a vowel-only syllable either, so you always have to pair the vowel with a null consonant (which kinda looks like a zero: ㅇ). And while nouns can be much more concise than in English, verbs can get verbose.
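The block structure can be checked directly, since precomposed Hangul syllables in Unicode are laid out arithmetically: 19 initial consonants × 21 vowels × 28 finals (including "no final"), starting at U+AC00. A small sketch using only the standard library; `jamo_indices` and `jamo_names` are names made up for this example.

```python
import unicodedata

HANGUL_BASE = 0xAC00                   # first precomposed syllable, 가
N_INITIAL, N_VOWEL, N_FINAL = 19, 21, 28


def jamo_indices(syllable):
    """Return (initial, vowel, final) indices; final 0 means no final consonant."""
    offset = ord(syllable) - HANGUL_BASE
    if not 0 <= offset < N_INITIAL * N_VOWEL * N_FINAL:
        raise ValueError(f"{syllable!r} is not a precomposed Hangul syllable")
    initial, rest = divmod(offset, N_VOWEL * N_FINAL)
    vowel, final = divmod(rest, N_FINAL)
    return initial, vowel, final


def jamo_names(syllable):
    """Unicode names of the letters the syllable decomposes into (NFD)."""
    return [unicodedata.name(j) for j in unicodedata.normalize("NFD", syllable)]


print(jamo_indices("한"))  # (18, 0, 4)
print(jamo_names("한"))
print(jamo_names("아"))    # the null consonant ㅇ (IEUNG) shows up as the initial
```

The second call shows the null consonant mentioned above: even a bare vowel sound like 아 decomposes into an initial IEUNG plus the vowel.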

The main benefit of Korean actually comes from the fact that the language itself fits perfectly onto a standard 27 alphabet keys and is laid out in such a way that lets you type ridiculously fast. The consonant letters are always situated in the left half and the vowels are in the right half of the keyboard. This means it is extremely easy to train muscle memory because you’re mostly alternating keystrokes between your left hand and right hand.

Anecdotally, when I’m typing in English I feel like each half of my brain needs to coordinate more than when I’m typing in Korean, where my right brain only needs to remember the consonant positions for my left hand and my left brain only needs to remember the vowel positions.

xodn348•47m ago
만나서 반가워요!

What you said is mostly right, and I did not know that about typing in Korean, with consonants on the left-hand side and vowels on the right-hand side. By the way, it's consonant (jaeum) and vowel (moeum).

From an experience standpoint, what you described sounds accurate.

danparsonson•32m ago
Wonderful! What a cool idea. For anyone interested, you can learn the whole of Hangul in an afternoon; it's cleverly designed to be very logical and has some handy mnemonics: https://korean.stackexchange.com/a/213
xodn348•28m ago
That is deep knowledge that even native Korean speakers may not know. I will add this site as a reference on GitHub. I am glad to have you as a supporter!
xodn348•25m ago
Just added that link to the README — it fits perfectly in the "Beauty of Hangul" section.
apt-apt-apt-apt•30m ago
A simple translation of keywords seems straightforward; I wonder why it's not standard.

    # def two_sum(arr: list[int], target: int) -> list[int]:
    펀크 투섬(아래이: 목록[정수], 타개트: 정수) -> 목록[정수]:
        # n = len(arr)
        ㄴ = 길이(아래이)

        # start, end = 0, n - 1
        시작, 끝 = 0, ㄴ - 1
        # while start < end:
        동안 시작 < 끝:

Code would be more compact, allowing things like more descriptive keywords e.g. AbstractVerifiedIdentityAccountFactory vs 실명인증계정생성, but we'd lose out on the nice upper/lowercase distinction.

I hear that information processing speed is nearly the same across all languages regardless of density, though, so in terms of processing speed it may not make much difference.

xodn348•21m ago
Good point about compactness — 실명인증계정생성 vs AbstractVerifiedIdentityAccountFactory is a real example where Korean shines.

One distinction though: Han uses actual Korean words, not transliterations. 함수 means "function" in Korean, 만약 means "if" — they're real words Korean speakers already know.

Your example uses transliterations like 펀크 and 아래이 which would look odd to a Korean reader. That difference matters for readability.

xodn348•21m ago
funny examples, though.
marysminefnuf•26m ago
My dream is to one day make a Chaldean programming language for my kids. Stuff like this is inspiring.
xodn348•24m ago
The fact that you're already thinking about it means you can do it. Go for it!
water_badger•19m ago
fun fact, you can easily write c in any language you want through the power of macros

https://github.com/farant/rhubarb/blob/main/include/latina.h

edit: oh, maybe you can’t do full unicode. that’s too bad!

xodn348•11m ago
Ha, neat trick. But macro substitution and a purpose-built language are very different — Han has a full pipeline (lexer → parser → AST → interpreter + LLVM codegen) designed around Korean from the ground up.

Error messages, REPL, LSP hover docs are all in Korean. You can't get that from #define 만약 if.
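To make the "lexer" stage of that pipeline concrete, here is a minimal sketch of a keyword-aware lexer. It is not Han's implementation (Han is written in Rust and its real token set is not shown in this thread); the keyword table contains only the two words mentioned here, and the token names and sample input are assumptions.

```python
import re

# Only the two keywords named in this thread; the token kinds are hypothetical.
KEYWORDS = {"함수": "FUNCTION", "만약": "IF"}

# Numbers, then words (\w covers Hangul in Python 3), then any other symbol.
TOKEN_RE = re.compile(r"(\d+)|(\w+)|([^\s\w])")


def lex(source):
    """Turn source text into (kind, text) pairs, skipping whitespace."""
    tokens = []
    for number, word, symbol in TOKEN_RE.findall(source):
        if number:
            tokens.append(("NUMBER", number))
        elif word:
            # Keywords are looked up as whole words, so no macro-style
            # text substitution ever happens.
            tokens.append((KEYWORDS.get(word, "IDENT"), word))
        else:
            tokens.append(("SYMBOL", symbol))
    return tokens


print(lex("만약 x < 10"))
# [('IF', '만약'), ('IDENT', 'x'), ('SYMBOL', '<'), ('NUMBER', '10')]
```

Because the keyword is a token kind rather than a textual alias, later stages (parser errors, hover docs) can report on it in whatever language the implementation chooses, which a `#define` cannot do.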

water_badger•5m ago
yeah, making a whole language is way more impressive!

anecdotally it is also interesting to use with ai because apparently it is "harder to be on autopilot" based on a huge pre-existing corpus of code when you write it in a different language. could activate different reasoning regions somehow.

(i just appreciate what can be trivially accomplished in c even if it's kind of janky after spending way too much time in the JS preprocessor mines...)

technol0gic•17m ago
i only code in this when no one's around. one might say I...han solo
xodn348•13m ago
Force for good be with you
