Show HN: Fast and Quality Code Chunking with Chonkie

1•snyy•11mo ago
Hi HN,

We’re Chonkie (https://github.com/chonkie-inc/chonkie) — we build open source tools that help split documents into meaningful chunks for use with AI models.

When you use LLMs over large documents or codebases, you often need to break them into smaller parts to fit the model’s context window. Our chunkers split along structural boundaries rather than at arbitrary offsets, so each chunk keeps its meaning and only the most relevant pieces need to be passed to the model. This reduces hallucinations, avoids confusion, and improves performance and accuracy.

Today we’re launching our Code Chunker — a fast, structure-aware way to break down source code into high-quality, token-aware chunks.

How it works:

(See the code: https://github.com/chonkie-inc/chonkie/blob/main/src/chonkie...)

Code Chunker uses tree-sitter (https://tree-sitter.github.io/tree-sitter/) to parse your code into an abstract syntax tree (AST). It then recursively merges and groups nodes in a way that respects both code structure and token limits.

It supports all languages that tree-sitter supports, and is designed to preserve formatting and semantics. Large functions or class definitions won’t be split in the middle of a block — instead, we dive recursively into the AST to produce clean, coherent chunks that fit your configured token budget.
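To make the recursive merge-and-group idea concrete, here is a minimal sketch. It is not Chonkie's implementation: it swaps tree-sitter for Python's built-in `ast` module so that it runs without extra dependencies (and therefore only handles Python source), and it counts whitespace-separated words instead of real tokens. The names `chunk_code` and `count_tokens` are hypothetical and exist only for this illustration.

```python
import ast


def count_tokens(text: str) -> int:
    # Assumption: whitespace-separated words stand in for a real tokenizer.
    return len(text.split())


def chunk_code(source: str, budget: int) -> list[str]:
    """Split Python source into chunks of at most `budget` tokens where possible.

    Sibling statements are merged greedily; a statement that is too large on
    its own is split by recursing into its child statements.
    """
    lines = source.splitlines(keepends=True)

    def text(start: int, end: int) -> str:
        # Line numbers are 1-indexed and inclusive, as in the ast module.
        return "".join(lines[start - 1:end])

    def walk(nodes, first, last):
        """Return (start, end) line spans that tile first..last."""
        if not nodes:
            return [(first, last)] if first <= last else []
        spans = []
        open_span = None  # the chunk currently being grown
        pos = first       # first line not yet assigned to any piece
        for i, node in enumerate(nodes):
            # Each piece runs from the end of the previous statement to the
            # end of this one, so blank lines, comments and decorators between
            # statements are kept. The last piece absorbs trailing lines.
            end = last if i == len(nodes) - 1 else node.end_lineno
            if end < pos:
                continue  # statement shares a line with the previous one
            start, pos = pos, end + 1
            if open_span and count_tokens(text(open_span[0], end)) <= budget:
                open_span = (open_span[0], end)  # merge into the open chunk
                continue
            if open_span:
                spans.append(open_span)
                open_span = None
            if count_tokens(text(start, end)) <= budget:
                open_span = (start, end)  # start a new chunk with this piece
                continue
            # Too large on its own: recurse into nested statements.
            children = [c for c in ast.iter_child_nodes(node)
                        if isinstance(c, ast.stmt)]
            if children:
                spans.extend(walk(children, start, end))
            else:
                spans.append((start, end))  # indivisible, exceeds the budget
        if open_span:
            spans.append(open_span)
        return spans

    tree = ast.parse(source)
    return [text(s, e) for s, e in walk(tree.body, 1, len(lines))]
```

Calling `chunk_code(source, budget=512)` returns a list of strings that concatenate back to the original source, so formatting survives. In this sketch, a statement with no nested statements that still exceeds the budget is emitted as a single oversized chunk rather than being cut mid-statement.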

What it’s useful for:

  - Embedding-based code search

  - RAG (retrieval-augmented generation) over codebases

  - Long-context analysis of code

  - Preparing repos for fine-tuning or pretraining

Try it out:

  - Open source package: https://docs.chonkie.ai/chunkers/code-chunker

  - Hosted playground (free with account): https://cloud.chonkie.ai

Happy Chonking!
