Show HN: Fast and Quality Code Chunking with Chonkie

1•snyy•7mo ago

Hi HN,

We’re Chonkie (https://github.com/chonkie-inc/chonkie) — we build open source tools that help split documents into meaningful chunks for use with AI models.

When you use LLMs over large documents or codebases, you often need to break them into smaller parts to fit the model’s context window. Our chunkers split along structural and semantic boundaries, so each chunk stays coherent on its own and only the most relevant pieces need to be passed to the model. That helps reduce hallucinations and improves accuracy.

Today we’re launching our Code Chunker — a fast, structure-aware way to break down source code into high-quality, token-aware chunks.

How it works:

(See the code: https://github.com/chonkie-inc/chonkie/blob/main/src/chonkie...)

Code Chunker uses tree-sitter (https://tree-sitter.github.io/tree-sitter/) to parse your code into an abstract syntax tree (AST). It then recursively merges and groups nodes in a way that respects both code structure and token limits.

It supports all languages that tree-sitter supports, and is designed to preserve formatting and semantics. Large functions or class definitions won’t be split in the middle of a block — instead, we dive recursively into the AST to produce clean, coherent chunks that fit your configured token budget.
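To make the recursive idea concrete, here is a minimal sketch of structure-aware chunking under a token budget. It is not Chonkie's implementation: it swaps tree-sitter for Python's standard-library `ast` module (so it only handles Python source), counts tokens by splitting on whitespace, and the names `chunk_code`, `count_tokens`, and `max_tokens` are made up for illustration.

```python
import ast

SAMPLE = '''import os

def small():
    return 1

class Big:
    def a(self):
        x = 1
        y = 2
        return x + y

    def b(self):
        return os.getcwd()
'''


def count_tokens(text: str) -> int:
    # Assumption: whitespace-separated words stand in for a real tokenizer.
    return len(text.split())


def _text(lines, start, end):
    # Source text for the 1-indexed, inclusive line span [start, end].
    return "".join(lines[start - 1:end])


def _leaf_spans(nodes, start, end, lines, max_tokens):
    """Cover lines start..end with contiguous spans, one per node,
    descending into a node's body only when its span is over budget."""
    spans, cursor = [], start
    for i, node in enumerate(nodes):
        # Blank lines and comments before a node travel with it;
        # whatever trails the last node is attached to that node.
        last = end if i == len(nodes) - 1 else node.end_lineno
        if last < cursor:  # several statements sharing one line
            continue
        body = getattr(node, "body", None)
        too_big = count_tokens(_text(lines, cursor, last)) > max_tokens
        if too_big and isinstance(body, list) and body:
            spans += _leaf_spans(body, cursor, last, lines, max_tokens)
        else:
            spans.append((cursor, last))  # may stay oversized if unsplittable
        cursor = last + 1
    return spans


def chunk_code(source: str, max_tokens: int) -> list[str]:
    lines = source.splitlines(keepends=True)
    tree = ast.parse(source)
    if not tree.body:  # empty or comment-only file
        return [source] if source else []
    chunks = []
    for start, end in _leaf_spans(tree.body, 1, len(lines), lines, max_tokens):
        piece = _text(lines, start, end)
        # Greedily merge neighbouring spans while the result fits the budget.
        if chunks and count_tokens(chunks[-1] + piece) <= max_tokens:
            chunks[-1] += piece
        else:
            chunks.append(piece)
    return chunks


for number, chunk in enumerate(chunk_code(SAMPLE, max_tokens=10), start=1):
    print(f"--- chunk {number} ({count_tokens(chunk)} tokens) ---")
    print(chunk, end="")
```

Because every span is a contiguous range of source lines, joining the chunks gives back the original file byte for byte, which is one simple way to keep formatting intact. A node is only opened up when it does not fit the budget, so small functions and classes stay whole. This sketch descends only through `body` lists, so `else` and `except` branches ride along with the last statement before them.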

What it’s useful for:

  - Embedding-based code search
  - RAG (retrieval-augmented generation) over codebases
  - Long-context analysis of code
  - Preparing repos for fine-tuning or pretraining

Try it out:

  - Open source package: https://docs.chonkie.ai/chunkers/code-chunker
  - Hosted playground (free with account): https://cloud.chonkie.ai

Happy Chonking!
