Show HN: Fast and Quality Code Chunking with Chonkie

1•snyy•1y ago
Hi HN,

We’re Chonkie (https://github.com/chonkie-inc/chonkie) — we build open source tools that help split documents into meaningful chunks for use with AI models.

When you use LLMs over large documents or codebases, you often need to break them into smaller parts to fit the model’s context window. Our chunkers preserve structure and meaning when they split, so only the most relevant pieces are passed to the model. This helps reduce hallucinations, avoid confusion, and improve performance and accuracy.

Today we’re launching our Code Chunker — a fast, structure-aware way to break down source code into high-quality, token-aware chunks.

How it works:

(See the code: https://github.com/chonkie-inc/chonkie/blob/main/src/chonkie...)

Code Chunker uses tree-sitter (https://tree-sitter.github.io/tree-sitter/) to parse your code into an abstract syntax tree (AST). It then recursively merges and groups nodes in a way that respects both code structure and token limits.

It supports all languages that tree-sitter supports, and is designed to preserve formatting and semantics. Large functions or class definitions won’t be split in the middle of a block — instead, we dive recursively into the AST to produce clean, coherent chunks that fit your configured token budget.
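
To make the recursive idea concrete, here is a minimal sketch. It is not Chonkie's implementation: it swaps tree-sitter for Python's standard-library `ast` module so it runs without dependencies, it only handles Python source, and it counts whitespace-separated words instead of real tokens. The names `chunk_code`, `count_tokens`, and `SAMPLE` are hypothetical and exist only for this illustration.

```python
import ast


def count_tokens(text: str) -> int:
    """Stand-in token counter: whitespace-separated words (an assumption)."""
    return len(text.split())


def chunk_code(source: str, max_tokens: int) -> list[str]:
    """Split Python source into chunks that follow AST boundaries."""
    lines = source.splitlines(keepends=True)

    def text(start: int, end: int) -> str:
        return "".join(lines[start:end])

    def split(nodes, start: int, end: int) -> list[tuple[int, int]]:
        """Return contiguous line spans that together cover [start, end)."""
        spans = []
        group_start = cursor = start
        for i, node in enumerate(nodes):
            # A node's span runs from the end of the previous sibling to its
            # own end, so comments and decorators stay with what follows them.
            node_end = end if i == len(nodes) - 1 else node.end_lineno
            if count_tokens(text(group_start, node_end)) <= max_tokens:
                cursor = node_end  # fits: merge into the current group
                continue
            if cursor > group_start:
                spans.append((group_start, cursor))  # flush the current group
            group_start = cursor
            if count_tokens(text(cursor, node_end)) <= max_tokens:
                cursor = node_end  # fits alone: start a new group with it
                continue
            children = [c for c in ast.iter_child_nodes(node)
                        if isinstance(c, (ast.stmt, ast.excepthandler))]
            if children:
                # Too large: descend into the node instead of cutting mid-block.
                spans.extend(split(children, cursor, node_end))
            else:
                spans.append((cursor, node_end))  # indivisible: emit oversized
            group_start = cursor = node_end
        if cursor > group_start:
            spans.append((group_start, cursor))
        return spans

    tree = ast.parse(source)
    if not tree.body:
        return [source] if source else []
    return [text(s, e) for s, e in split(tree.body, 0, len(lines))]


SAMPLE = '''import os

def small(a):
    return a + 1

class Big:
    def first(self):
        x = 1
        y = 2
        return x + y

    def second(self):
        return "done"
'''

chunks = chunk_code(SAMPLE, max_tokens=12)
```

Because the spans are contiguous line ranges, joining the chunks gives back the original source byte for byte, which is one simple way to preserve formatting. The sketch never re-merges the pieces produced by a recursive split with their neighbours, so a production chunker would likely pack chunks more tightly than this.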

What it’s useful for:

  - Embedding-based code search

  - RAG (retrieval-augmented generation) over codebases

  - Long-context analysis of code

  - Preparing repos for fine-tuning or pretraining

Try it out:

  - Open source package: https://docs.chonkie.ai/chunkers/code-chunker

  - Hosted playground (free with account): https://cloud.chonkie.ai

Happy Chonking!
