Show HN: Fast and Quality Code Chunking with Chonkie

1•snyy•11mo ago
Hi HN,

We’re Chonkie (https://github.com/chonkie-inc/chonkie). We build open-source tools that help split documents into meaningful chunks for use with AI models.

When you use LLMs over large documents or codebases, you often need to break them into smaller parts to fit the model’s context window. Our chunkers do this in a smart way: they preserve structure and meaning, so only the most relevant pieces are passed into the model. This reduces hallucinations, avoids confusion, and improves performance and accuracy.

Today we’re launching our Code Chunker — a fast, structure-aware way to break down source code into high-quality, token-aware chunks.

How it works:

(See the code: https://github.com/chonkie-inc/chonkie/blob/main/src/chonkie...)

Code Chunker uses tree-sitter (https://tree-sitter.github.io/tree-sitter/) to parse your code into an abstract syntax tree (AST). It then recursively merges and groups nodes in a way that respects both code structure and token limits.

It supports all languages that tree-sitter supports, and is designed to preserve formatting and semantics. Large functions or class definitions won’t be split in the middle of a block — instead, we dive recursively into the AST to produce clean, coherent chunks that fit your configured token budget.
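To make the recursive merge-and-group idea concrete, here is a minimal sketch. It is not Chonkie's implementation: it assumes Python's built-in `ast` module as a stand-in for tree-sitter (so it only handles Python source), and a whitespace word count as a stand-in for a real tokenizer. The names `chunk_code`, `_chunk`, and `count_tokens` are hypothetical and exist only for this illustration.

```python
import ast


def count_tokens(text: str) -> int:
    # Stand-in token counter (assumption): whitespace-separated words.
    return len(text.split())


def _chunk(children, lines, start, end, budget):
    """Return (start, end) line spans that together cover lines[start:end]."""
    if not children:
        return [(start, end)] if end > start else []
    chunks = []
    cur_start, cur_tokens = start, 0
    cursor = start
    for i, child in enumerate(children):
        # Each child's span runs from the previous child's end to its own end,
        # so headers, comments, and blank lines stay attached to some chunk.
        stop = end if i == len(children) - 1 else child.end_lineno
        tokens = count_tokens("".join(lines[cursor:stop]))
        if tokens > budget:
            # Flush what has been grouped so far, then descend into the node.
            if cursor > cur_start:
                chunks.append((cur_start, cursor))
            body = getattr(child, "body", None)
            if isinstance(body, list) and body:
                chunks.extend(_chunk(body, lines, cursor, stop, budget))
            else:
                # A leaf that is still over budget is kept whole.
                chunks.append((cursor, stop))
            cur_start, cur_tokens = stop, 0
        elif cur_tokens + tokens > budget:
            # Adding this node would overflow: close the group, start a new one.
            chunks.append((cur_start, cursor))
            cur_start, cur_tokens = cursor, tokens
        else:
            # Node fits: merge it into the current group.
            cur_tokens += tokens
        cursor = stop
    if cursor > cur_start:
        chunks.append((cur_start, cursor))
    return chunks


def chunk_code(source: str, budget: int) -> list[str]:
    """Split Python source into chunks of roughly at most `budget` tokens."""
    lines = source.splitlines(keepends=True)
    tree = ast.parse(source)
    spans = _chunk(tree.body, lines, 0, len(lines), budget)
    return ["".join(lines[s:e]) for s, e in spans]
```

Because the sketch slices the original lines rather than regenerating code from the tree, joining the chunks reproduces the input exactly, which is one simple way to preserve formatting. A node that cannot be subdivided further is emitted whole even if it exceeds the budget.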

What it’s useful for:

  - Embedding-based code search

  - RAG (retrieval-augmented generation) over codebases

  - Long-context analysis of code

  - Preparing repos for fine-tuning or pretraining

Try it out:

  - Open source package: https://docs.chonkie.ai/chunkers/code-chunker

  - Hosted playground (free with account): https://cloud.chonkie.ai

Happy Chonking!
