Show HN: Fast and Quality Code Chunking with Chonkie

1•snyy•1y ago

Hi HN,

We’re Chonkie (https://github.com/chonkie-inc/chonkie) — we build open source tools that help split documents into meaningful chunks for use with AI models.

When you use LLMs over large documents or codebases, you often need to break them into smaller parts to fit the model’s context window. Our chunkers split along the document’s structure, so each chunk keeps its meaning and only the most relevant pieces are passed into the model. This reduces hallucinations and improves accuracy.

Today we’re launching our Code Chunker — a fast, structure-aware way to break down source code into high-quality, token-aware chunks.

How it works:

(See the code: https://github.com/chonkie-inc/chonkie/blob/main/src/chonkie...)

Code Chunker uses tree-sitter (https://tree-sitter.github.io/tree-sitter/) to parse your code into an abstract syntax tree (AST). It then recursively merges and groups nodes in a way that respects both code structure and token limits.

It supports all languages that tree-sitter supports, and is designed to preserve formatting and semantics. Large functions or class definitions won’t be split in the middle of a block — instead, we dive recursively into the AST to produce clean, coherent chunks that fit your configured token budget.
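
To make the recursive merge-and-group idea concrete, here is a minimal sketch. It is not Chonkie's implementation: it swaps tree-sitter for Python's standard-library `ast` module (so it only handles Python source), and it counts whitespace-separated words instead of using a real tokenizer. The names `chunk_code`, `split_span`, `merge_spans`, and `count_tokens` are hypothetical and exist only for this illustration.

```python
import ast


def count_tokens(text):
    # Assumption: whitespace-separated words stand in for a real tokenizer.
    return len(text.split())


def first_line(node):
    # 0-based first line of a statement, including any decorators above it.
    decorators = getattr(node, "decorator_list", [])
    return min([node.lineno] + [d.lineno for d in decorators]) - 1


def statement_children(node):
    # Child statements (and except handlers) that a span can be split along.
    return [c for c in ast.iter_child_nodes(node)
            if isinstance(c, (ast.stmt, ast.excepthandler))]


def merge_spans(spans, lines, budget):
    # Greedily merge adjacent line spans while the merged text fits the budget.
    merged = []
    for start, end in spans:
        if merged:
            prev_start, _ = merged[-1]
            if count_tokens("".join(lines[prev_start:end])) <= budget:
                merged[-1] = (prev_start, end)
                continue
        merged.append((start, end))
    return merged


def split_span(node, start, end, lines, budget):
    # Return contiguous (start, end) line spans that together cover [start, end).
    children = statement_children(node)
    if count_tokens("".join(lines[start:end])) <= budget or not children:
        # Fits the budget, or is a leaf that cannot be split any further.
        return [(start, end)]
    # Each child owns the lines up to the next sibling; the first child also
    # owns the parent's header line(s), so no source text is dropped.
    starts = [start] + [first_line(c) for c in children[1:]]
    ends = starts[1:] + [end]
    spans = []
    for child, s, e in zip(children, starts, ends):
        if e > s:
            spans.extend(split_span(child, s, e, lines, budget))
    return merge_spans(spans, lines, budget)


def chunk_code(source, budget):
    # Parse, split recursively, then turn line spans back into text chunks.
    lines = source.splitlines(keepends=True)
    tree = ast.parse(source)
    spans = split_span(tree, 0, len(lines), lines, budget)
    return ["".join(lines[s:e]) for s, e in spans]


if __name__ == "__main__":
    demo = "import os\n\ndef f():\n    return os.getcwd()\n\ndef g():\n    return 2\n"
    for i, chunk in enumerate(chunk_code(demo, budget=5)):
        print(f"--- chunk {i} ({count_tokens(chunk)} tokens) ---")
        print(chunk, end="")
```

Because the sketch works on line spans rather than re-printing the tree, concatenating the chunks reproduces the input exactly, including comments and blank lines. A leaf statement that is larger than the budget is emitted as one oversized chunk rather than cut mid-statement.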

What it’s useful for:

  - Embedding-based code search

  - RAG (retrieval-augmented generation) over codebases

  - Long-context analysis of code

  - Preparing repos for fine-tuning or pretraining

Try it out:

  - Open source package: https://docs.chonkie.ai/chunkers/code-chunker

  - Hosted playground (free with account): https://cloud.chonkie.ai

Happy Chonking!
