Show HN: Fast and Quality Code Chunking with Chonkie

1•snyy•11mo ago
Hi HN,

We’re Chonkie (https://github.com/chonkie-inc/chonkie) — we build open source tools that help split documents into meaningful chunks for use with AI models.

When you use LLMs over large documents or codebases, you often need to break them into smaller parts to fit the model’s context window. Our chunkers do this in a smart way: they preserve structure and meaning, so only the most relevant pieces are passed into the model. This reduces hallucinations, avoids confusion, and improves performance and accuracy.

Today we’re launching our Code Chunker — a fast, structure-aware way to break down source code into high-quality, token-aware chunks.

How it works:

(See the code: https://github.com/chonkie-inc/chonkie/blob/main/src/chonkie...)

Code Chunker uses tree-sitter (https://tree-sitter.github.io/tree-sitter/) to parse your code into an abstract syntax tree (AST). It then recursively merges and groups nodes in a way that respects both code structure and token limits.

It supports all languages that tree-sitter supports, and is designed to preserve formatting and semantics. Large functions or class definitions won’t be split in the middle of a block — instead, we dive recursively into the AST to produce clean, coherent chunks that fit your configured token budget.
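
To make the recursive grouping idea concrete, here is a minimal sketch. It is not Chonkie's implementation. It assumes Python's built-in `ast` module in place of tree-sitter (so it only handles Python source) and a whitespace word count in place of a real tokenizer. The names `count_tokens`, `chunk_nodes`, and `chunk_code` are hypothetical and exist only for this illustration.

```python
import ast


def count_tokens(text: str) -> int:
    """Crude token count: whitespace-separated words (stand-in for a real tokenizer)."""
    return len(text.split())


def chunk_nodes(lines, nodes, lo, hi, budget):
    """Chunk lines[lo:hi], the region covered by the sibling statements `nodes`."""
    chunks, current, start = [], "", lo
    for i, node in enumerate(nodes):
        # A node's span begins where the previous span ended, so blank lines,
        # comments, decorators, and the parent's header line stay attached to it.
        end = hi if i == len(nodes) - 1 else max(node.end_lineno, start)
        span_start, text = start, "".join(lines[start:end])
        start = end
        if not text:
            continue  # several statements on one line: already emitted
        if count_tokens(text) > budget:
            # Too big on its own: flush what we have, then descend into the node.
            if current:
                chunks.append(current)
                current = ""
            body = getattr(node, "body", None)
            if isinstance(body, list) and body:
                chunks.extend(chunk_nodes(lines, body, span_start, end, budget))
            else:
                chunks.append(text)  # indivisible statement; may exceed the budget
        elif count_tokens(current + text) > budget:
            chunks.append(current)  # adding this node would overflow: start a new chunk
            current = text
        else:
            current += text  # merge adjacent small nodes into one chunk
    if current:
        chunks.append(current)
    return chunks


def chunk_code(source: str, budget: int):
    """Split Python source into chunks that concatenate back to the original text."""
    tree = ast.parse(source)
    if not tree.body:
        return [source] if source else []
    lines = source.splitlines(keepends=True)
    return chunk_nodes(lines, tree.body, 0, len(lines), budget)


SAMPLE = '''import os

def small():
    return 1

class Big:
    def a(self):
        x = 1
        return x

    def b(self):
        y = 2
        return y
'''

chunks = chunk_code(SAMPLE, budget=10)
```

Because every span is a contiguous run of lines, joining the chunks reproduces the input exactly, which is one simple way to preserve formatting. In this sketch a single statement with no nested body is emitted whole even if it is larger than the budget.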

What it’s useful for:

  - Embedding-based code search
  - RAG (retrieval-augmented generation) over codebases
  - Long-context analysis of code
  - Preparing repos for fine-tuning or pretraining

Try it out:

  - Open source package: https://docs.chonkie.ai/chunkers/code-chunker
  - Hosted playground (free with account): https://cloud.chonkie.ai

Happy Chonking!
