Show HN: Fast and Quality Code Chunking with Chonkie

1•snyy•7mo ago
Hi HN,

We’re Chonkie (https://github.com/chonkie-inc/chonkie) — we build open source tools that help split documents into meaningful chunks for use with AI models.

When you use LLMs over large documents or codebases, you often need to break them into smaller parts to fit the model’s context window. Our chunkers split along structural boundaries and preserve meaning, so you can pass only the most relevant pieces to the model. That helps reduce hallucinations and improve accuracy.

Today we’re launching our Code Chunker — a fast, structure-aware way to break down source code into high-quality, token-aware chunks.

How it works:

(See the code: https://github.com/chonkie-inc/chonkie/blob/main/src/chonkie...)

Code Chunker uses tree-sitter (https://tree-sitter.github.io/tree-sitter/) to parse your code into an abstract syntax tree (AST). It then recursively merges and groups nodes in a way that respects both code structure and token limits.

It supports all languages that tree-sitter supports, and is designed to preserve formatting and semantics. Large functions or class definitions won’t be split in the middle of a block — instead, we dive recursively into the AST to produce clean, coherent chunks that fit your configured token budget.

What it’s useful for:

  - Embedding-based code search

  - RAG (retrieval-augmented generation) over codebases

  - Long-context analysis of code

  - Preparing repos for fine-tuning or pretraining

Try it out:

  - Open source package (docs): https://docs.chonkie.ai/chunkers/code-chunker

  - Hosted playground (free with account): https://cloud.chonkie.ai

Happy Chonking!
