
Show HN: Fast and Quality Code Chunking with Chonkie

1•snyy•1y ago
Hi HN,

We’re Chonkie (https://github.com/chonkie-inc/chonkie) — we build open source tools that help split documents into meaningful chunks for use with AI models.

When you use LLMs over large documents or codebases, you often need to break them into smaller parts to fit the model’s context window. Our chunkers split along structural boundaries, so each chunk preserves structure and meaning and only the most relevant pieces are passed into the model. This reduces hallucinations, avoids confusion, and improves performance and accuracy.

Today we’re launching our Code Chunker — a fast, structure-aware way to break down source code into high-quality, token-aware chunks.

How it works:

(See the code: https://github.com/chonkie-inc/chonkie/blob/main/src/chonkie...)

Code Chunker uses tree-sitter (https://tree-sitter.github.io/tree-sitter/) to parse your code into an abstract syntax tree (AST). It then recursively merges and groups nodes in a way that respects both code structure and token limits.

It supports all languages that tree-sitter supports, and is designed to preserve formatting and semantics. Large functions or class definitions won’t be split in the middle of a block — instead, we dive recursively into the AST to produce clean, coherent chunks that fit your configured token budget.
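To make the recursive merge-and-group idea concrete, here is a minimal sketch. It is not Chonkie's implementation: it swaps tree-sitter for Python's standard-library `ast` module (so it only handles Python source), and it assumes a whitespace word count as the token counter. The names `count_tokens`, `split_node`, and `chunk_code` are hypothetical and chosen for illustration only.

```python
import ast


def count_tokens(text: str) -> int:
    # Assumption: whitespace-separated words stand in for a real tokenizer.
    return len(text.split())


def split_node(node, start, end, lines, budget):
    """Return contiguous (start, end) line spans (1-indexed, inclusive) covering
    lines start..end, descending into child statements when a span is over budget."""
    text = "\n".join(lines[start - 1:end])
    children = [c for c in ast.iter_child_nodes(node)
                if isinstance(c, (ast.stmt, ast.excepthandler))]
    if count_tokens(text) <= budget or not children:
        # Either the span fits, or there is nothing smaller to split it into.
        return [(start, end)]
    spans, cursor = [], start
    for i, child in enumerate(children):
        # Lines between siblings (headers, comments, blanks) attach to the next
        # child; the last child absorbs any trailing lines of the parent.
        child_end = end if i == len(children) - 1 else child.end_lineno
        if child_end < cursor:
            continue  # several statements share one line that is already covered
        spans.extend(split_node(child, cursor, child_end, lines, budget))
        cursor = child_end + 1
    return spans


def chunk_code(source: str, budget: int) -> list[str]:
    """Greedily merge adjacent spans into chunks that stay within the budget."""
    lines = source.splitlines()
    spans = split_node(ast.parse(source), 1, len(lines), lines, budget)
    chunks, current, used = [], [], 0
    for start, end in spans:
        piece = lines[start - 1:end]
        cost = count_tokens("\n".join(piece))
        if current and used + cost > budget:
            chunks.append("\n".join(current))
            current, used = [], 0
        current.extend(piece)
        used += cost
    if current:
        chunks.append("\n".join(current))
    return chunks


SAMPLE = '''import os

def small(a, b):
    return a + b

class Big:
    def one(self):
        x = 1
        y = 2
        return x + y

    def two(self):
        return "two"'''

for number, chunk in enumerate(chunk_code(SAMPLE, budget=10), start=1):
    print(f"--- chunk {number} ({count_tokens(chunk)} tokens) ---")
    print(chunk)
```

Because every span is a contiguous run of lines, joining the chunks with newlines reproduces the original source, so formatting survives. A node with no child statements that still exceeds the budget is emitted as a single oversized chunk in this sketch; a production chunker would need a fallback for that case.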

What it’s useful for:

  - Embedding-based code search

  - RAG (retrieval-augmented generation) over codebases

  - Long-context analysis of code

  - Preparing repos for fine-tuning or pretraining

Try it out:

  - Open source package: https://docs.chonkie.ai/chunkers/code-chunker

  - Hosted playground (free with account): https://cloud.chonkie.ai

Happy Chonking!
