Show HN: Fast and Quality Code Chunking with Chonkie

1•snyy•11mo ago
Hi HN,

We’re Chonkie (https://github.com/chonkie-inc/chonkie). We build open-source tools that help split documents into meaningful chunks for use with AI models.

When you use LLMs over large documents or codebases, you often need to break them into smaller parts to fit the model’s context window. Our chunkers do this in a smart way: they preserve structure and meaning, so only the most relevant pieces are passed into the model. This reduces hallucinations, avoids confusion, and improves performance and accuracy.

Today we’re launching our Code Chunker — a fast, structure-aware way to break down source code into high-quality, token-aware chunks.

How it works:

(See the code: https://github.com/chonkie-inc/chonkie/blob/main/src/chonkie...)

Code Chunker uses tree-sitter (https://tree-sitter.github.io/tree-sitter/) to parse your code into an abstract syntax tree (AST). It then recursively merges and groups nodes in a way that respects both code structure and token limits.

It supports all languages that tree-sitter supports, and is designed to preserve formatting and semantics. Large functions or class definitions won’t be split in the middle of a block — instead, we dive recursively into the AST to produce clean, coherent chunks that fit your configured token budget.

What it’s useful for:

  - Embedding-based code search

  - RAG (retrieval-augmented generation) over codebases

  - Long-context analysis of code

  - Preparing repos for fine-tuning or pretraining

Try it out:

  - Open source package: https://docs.chonkie.ai/chunkers/code-chunker

  - Hosted playground (free with account): https://cloud.chonkie.ai

Happy Chonking!
