Show HN: Fast and Quality Code Chunking with Chonkie

1•snyy•7mo ago

Hi HN,

We’re Chonkie (https://github.com/chonkie-inc/chonkie). We build open source tools that split documents into meaningful chunks for use with AI models.

When you use LLMs over large documents or codebases, you often need to break them into smaller parts to fit the model’s context window. Our chunkers split along the document’s own structure, so each chunk keeps its meaning and only the most relevant pieces are passed to the model. This helps reduce hallucinations and improves accuracy.

Today we’re launching our Code Chunker: a fast, structure-aware way to break source code into high-quality, token-aware chunks.

How it works:

(See the code: https://github.com/chonkie-inc/chonkie/blob/main/src/chonkie...)

Code Chunker uses tree-sitter (https://tree-sitter.github.io/tree-sitter/) to parse your code into an abstract syntax tree (AST). It then recursively merges and groups nodes in a way that respects both code structure and token limits.

It supports all languages that tree-sitter supports, and is designed to preserve formatting and semantics. Large functions or class definitions won’t be split in the middle of a block. Instead, we descend recursively into the AST to produce clean, coherent chunks that fit your configured token budget.
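To make the idea concrete, here is a rough sketch of the recursive merge-and-group step. It is not Chonkie's actual implementation: it swaps tree-sitter for Python's built-in `ast` module so it runs without extra dependencies (and therefore only handles Python source), it counts tokens by splitting on whitespace as a crude stand-in for a real tokenizer, and the names `count_tokens`, `_pieces`, and `chunk_code` are made up for illustration.

```python
import ast


def count_tokens(text: str) -> int:
    # Assumption: whitespace-separated words stand in for real tokenizer tokens.
    return len(text.split())


def _pieces(nodes, start, end, lines, budget):
    """Yield (start, end) line spans that tile lines[start:end].

    Each span covers one statement plus the gap before it (blank lines,
    comments, decorators, or the parent's header line). A span over budget
    is split by recursing into the statement's child statements.
    """
    cursor = start
    for i, node in enumerate(nodes):
        is_last = i == len(nodes) - 1
        stop = end if is_last else max(cursor, node.end_lineno)
        text = "".join(lines[cursor:stop])
        children = sorted(
            (c for c in ast.iter_child_nodes(node) if isinstance(c, ast.stmt)),
            key=lambda c: c.lineno,
        )
        if count_tokens(text) > budget and children:
            yield from _pieces(children, cursor, stop, lines, budget)
        else:
            # Either it fits, or it is a leaf we cannot split further
            # (in which case the chunk may exceed the budget).
            yield cursor, stop
        cursor = stop


def chunk_code(source: str, budget: int) -> list[str]:
    """Split Python source into chunks of at most `budget` tokens where possible."""
    lines = source.splitlines(keepends=True)
    tree = ast.parse(source)
    if not tree.body:
        return [source] if source else []

    chunks, current = [], ""
    for s, e in _pieces(tree.body, 0, len(lines), lines, budget):
        piece = "".join(lines[s:e])
        # Greedily merge adjacent pieces until the next one would overflow.
        if current and count_tokens(current + piece) > budget:
            chunks.append(current)
            current = piece
        else:
            current += piece
    if current:
        chunks.append(current)
    return chunks
```

Calling `chunk_code(source, budget=512)` returns a list of strings. Because every span is cut on whole lines and the spans tile the file, joining the chunks gives back the original source byte for byte, which is one simple way to get the "preserve formatting" property described above.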

What it’s useful for:

  - Embedding-based code search

  - RAG (retrieval-augmented generation) over codebases

  - Long-context analysis of code

  - Preparing repos for fine-tuning or pretraining

Try it out:

  - Open source package: https://docs.chonkie.ai/chunkers/code-chunker

  - Hosted playground (free with account): https://cloud.chonkie.ai

Happy Chonking!
