Show HN: Fast and Quality Code Chunking with Chonkie

1•snyy•1y ago

Hi HN,

We’re Chonkie (https://github.com/chonkie-inc/chonkie) — we build open source tools that help split documents into meaningful chunks for use with AI models.

When you use LLMs over large documents or codebases, you often need to break them into smaller parts to fit the model’s context window. Our chunkers split along structural and semantic boundaries, so each chunk is coherent on its own and only the most relevant pieces need to be passed into the model. That can reduce hallucinations and confusion and improve accuracy.

Today we’re launching our Code Chunker — a fast, structure-aware way to break down source code into high-quality, token-aware chunks.

How it works:

(See the code: https://github.com/chonkie-inc/chonkie/blob/main/src/chonkie...)

Code Chunker uses tree-sitter (https://tree-sitter.github.io/tree-sitter/) to parse your code into an abstract syntax tree (AST). It then recursively merges and groups nodes in a way that respects both code structure and token limits.

It supports all languages that tree-sitter supports, and is designed to preserve formatting and semantics. Large functions or class definitions won’t be split in the middle of a block — instead, we dive recursively into the AST to produce clean, coherent chunks that fit your configured token budget.
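To make the recursive merge-and-group idea concrete, here is a minimal sketch. It is not Chonkie's implementation: it swaps tree-sitter for Python's built-in `ast` module so it runs without extra dependencies, it only handles Python source, and it counts whitespace-separated words as a stand-in for a real tokenizer. The names `chunk_code`, `count_tokens`, and `SAMPLE` are hypothetical.

```python
import ast

def count_tokens(text):
    # Assumption: whitespace-separated words stand in for real tokenizer tokens.
    return len(text.split())

def _tokens(lines, start, end):
    # Token count of the 1-indexed, inclusive line range [start, end].
    return count_tokens("".join(lines[start - 1:end]))

def _split(nodes, start, end, lines, budget):
    """Partition lines [start, end] into contiguous spans along node boundaries."""
    if not nodes:
        return [(start, end)] if start <= end else []
    spans, group, piece_start = [], None, start
    for i, node in enumerate(nodes):
        # Each piece runs from the end of the previous piece to the end of this
        # node, so blank lines, comments, and headers stay attached to code.
        piece_end = end if i == len(nodes) - 1 else node.end_lineno
        if piece_end < piece_start:
            continue  # statement shares a line with the previous one
        piece = (piece_start, piece_end)
        piece_start = piece_end + 1
        if _tokens(lines, *piece) <= budget:
            # Merge small neighbours while the merged span still fits.
            if group and _tokens(lines, group[0], piece[1]) <= budget:
                group = (group[0], piece[1])
            else:
                if group:
                    spans.append(group)
                group = piece
        else:
            if group:
                spans.append(group)
                group = None
            body = getattr(node, "body", None)
            if isinstance(body, list):
                # Too large: descend into the node's children instead of
                # cutting at an arbitrary line.
                spans.extend(_split(body, piece[0], piece[1], lines, budget))
            else:
                spans.append(piece)  # indivisible statement, kept oversized
    if group:
        spans.append(group)
    return spans

def chunk_code(source, budget):
    lines = source.splitlines(keepends=True)
    tree = ast.parse(source)
    spans = _split(tree.body, 1, len(lines), lines, budget)
    return ["".join(lines[s - 1:e]) for s, e in spans]

SAMPLE = '''import os

def small():
    return 1

class Big:
    def a(self):
        x = 1
        return x

    def b(self):
        y = 2
        return y
'''

if __name__ == "__main__":
    for chunk in chunk_code(SAMPLE, budget=10):
        print(count_tokens(chunk), repr(chunk))
```

Because every span is a contiguous line range and the spans cover the whole file, joining the chunks reproduces the original source exactly, which is one simple way to preserve formatting. A single statement that exceeds the budget on its own is kept whole rather than cut mid-statement.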

What it’s useful for:

  - Embedding-based code search

  - RAG (retrieval-augmented generation) over codebases

  - Long-context analysis of code

  - Preparing repos for fine-tuning or pretraining

Try it out:

  - Open source package: https://docs.chonkie.ai/chunkers/code-chunker

  - Hosted playground (free with account): https://cloud.chonkie.ai

Happy Chonking!
