Show HN: Fast and Quality Code Chunking with Chonkie

1•snyy•1y ago

Hi HN,

We’re Chonkie (https://github.com/chonkie-inc/chonkie) — we build open source tools that help split documents into meaningful chunks for use with AI models.

When you use LLMs over large documents or codebases, you often need to break them into smaller parts to fit the model’s context window. Our chunkers split along the document’s own structure, so each chunk keeps its meaning and only the most relevant pieces need to be passed into the model. In our experience this reduces hallucinations and improves accuracy.

Today we’re launching our Code Chunker — a fast, structure-aware way to break down source code into high-quality, token-aware chunks.

How it works:

(See the code: https://github.com/chonkie-inc/chonkie/blob/main/src/chonkie...)

Code Chunker uses tree-sitter (https://tree-sitter.github.io/tree-sitter/) to parse your code into an abstract syntax tree (AST). It then recursively merges and groups nodes in a way that respects both code structure and token limits.

It supports all languages that tree-sitter supports, and is designed to preserve formatting and semantics. Large functions or class definitions won’t be split in the middle of a block — instead, we dive recursively into the AST to produce clean, coherent chunks that fit your configured token budget.
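To make the recursive idea concrete, here is a minimal sketch of structure-aware chunking. It is not Chonkie's implementation: it swaps in Python's standard-library ast module for tree-sitter (so it only handles Python source), and count_tokens, first_line, split, and chunk_code are hypothetical names invented for this illustration. Token counting is approximated by whitespace-separated words.

```python
import ast

def count_tokens(text: str) -> int:
    """Stand-in token counter (assumption): whitespace-separated words."""
    return len(text.split())

def first_line(node: ast.stmt) -> int:
    """0-based first line of a statement, including any decorators."""
    decorators = getattr(node, "decorator_list", [])
    return min([node.lineno] + [d.lineno for d in decorators]) - 1

def split(nodes, lo, hi, lines, budget):
    """Cover lines[lo:hi] with (start, end) ranges that follow statement boundaries."""
    # Each sibling owns the lines from its start up to the next sibling's start,
    # so headers, blank lines, and comments are never dropped.
    starts = [lo] + [first_line(n) for n in nodes[1:]]
    ends = starts[1:] + [hi]
    ranges, current = [], None

    def size(a, b):
        return count_tokens("\n".join(lines[a:b]))

    for node, start, end in zip(nodes, starts, ends):
        children = [c for c in getattr(node, "body", []) if isinstance(c, ast.stmt)]
        if size(start, end) > budget and children:
            # Too large for one chunk: close the open chunk and descend into the body.
            if current:
                ranges.append(current)
                current = None
            ranges.extend(split(children, start, end, lines, budget))
        elif current and size(current[0], end) <= budget:
            # Still fits: merge this sibling into the open chunk.
            current = (current[0], end)
        else:
            # Start a new chunk (an oversized leaf statement is kept whole).
            if current:
                ranges.append(current)
            current = (start, end)
    if current:
        ranges.append(current)
    return ranges

def chunk_code(source: str, budget: int) -> list[str]:
    """Split Python source into chunks of at most `budget` tokens where possible."""
    lines = source.splitlines()
    body = ast.parse(source).body
    if not body:
        return [source] if source else []
    ranges = split(body, 0, len(lines), lines, budget)
    return ["\n".join(lines[a:b]) for a, b in ranges if a < b]

SAMPLE = '''import os

class Greeter:
    def hello(self, name):
        message = "hello " + name
        return message

    def bye(self, name):
        message = "bye " + name
        return message

def main():
    print(Greeter().hello("HN"))
'''

chunks = chunk_code(SAMPLE, budget=15)
for i, chunk in enumerate(chunks):
    print(f"--- chunk {i} ({count_tokens(chunk)} tokens) ---")
    print(chunk)
```

With this budget the class is too large for one chunk, so the sketch descends into it and emits each method separately, keeping the class header with the first method. Joining the chunks with newlines reproduces the original text, which is one simple way to check that nothing was lost.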

What it’s useful for:

  - Embedding-based code search

  - RAG (retrieval-augmented generation) over codebases

  - Long-context analysis of code

  - Preparing repos for fine-tuning or pretraining

Try it out:

  - Open source package: https://docs.chonkie.ai/chunkers/code-chunker

  - Hosted playground (free with account): https://cloud.chonkie.ai

Happy Chonking!
