Show HN: Fast and Quality Code Chunking with Chonkie

1•snyy•11mo ago

Hi HN,

We’re Chonkie (https://github.com/chonkie-inc/chonkie) — we build open source tools that help split documents into meaningful chunks for use with AI models.

When you use LLMs over large documents or codebases, you often need to break them into smaller parts to fit the model’s context window. Our chunkers split along the document’s own structure, so each chunk stays meaningful on its own and only the most relevant pieces need to be passed to the model. Less irrelevant context tends to mean fewer hallucinations and better accuracy.

Today we’re launching our Code Chunker — a fast, structure-aware way to break down source code into high-quality, token-aware chunks.

How it works:

(See the code: https://github.com/chonkie-inc/chonkie/blob/main/src/chonkie...)

Code Chunker uses tree-sitter (https://tree-sitter.github.io/tree-sitter/) to parse your code into an abstract syntax tree (AST). It then recursively merges and groups nodes in a way that respects both code structure and token limits.

It works with any language that has a tree-sitter grammar, and is designed to preserve formatting and semantics. Large functions or class definitions won’t be cut at an arbitrary point in the middle of a block. When one exceeds the token budget, we recurse into its child nodes in the AST and split at those boundaries, producing clean, coherent chunks that fit your configured token budget.
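To make the merge-and-recurse idea concrete, here is a minimal sketch. It is not Chonkie's implementation: it assumes Python's standard-library `ast` module as a stand-in for tree-sitter (so it only handles Python source), and `count_tokens`, `chunk_lines`, and `chunk_code` are hypothetical names with a whitespace word count standing in for a real tokenizer.

```python
import ast

def count_tokens(text: str) -> int:
    # Assumption: whitespace-separated words stand in for a real tokenizer.
    return len(text.split())

def _spans(nodes, start, end):
    """Map sibling nodes to contiguous line spans covering start..end.

    Blank lines, comments, and decorators before a node are attached to it,
    so joining every span reproduces the original text exactly.
    """
    spans, cursor = [], start
    for i, node in enumerate(nodes):
        last = node.end_lineno if i < len(nodes) - 1 else end
        if last < cursor:  # node shares a line with the previous one
            continue
        spans.append((cursor, last, node))
        cursor = last + 1
    return spans

def chunk_lines(lines, nodes, start, end, budget):
    """Greedily merge sibling spans; recurse into any span over budget."""
    chunks, cur = [], None
    for first, last, node in _spans(nodes, start, end):
        if count_tokens("".join(lines[first - 1:last])) > budget:
            if cur:
                chunks.append(cur)
                cur = None
            children = [c for c in ast.iter_child_nodes(node)
                        if isinstance(c, ast.stmt)]
            if children:
                # Split at child statement boundaries instead of mid-block.
                chunks.extend(chunk_lines(lines, children, first, last, budget))
            else:
                # A single statement cannot be split further; emit it as-is.
                chunks.append((first, last))
        elif cur and count_tokens("".join(lines[cur[0] - 1:last])) <= budget:
            cur = (cur[0], last)  # still fits: extend the current chunk
        else:
            if cur:
                chunks.append(cur)
            cur = (first, last)
    if cur:
        chunks.append(cur)
    return chunks

def chunk_code(source: str, budget: int) -> list[str]:
    lines = source.splitlines(keepends=True)
    tree = ast.parse(source)
    if not tree.body:
        return [source] if source else []
    ranges = chunk_lines(lines, tree.body, 1, len(lines), budget)
    return ["".join(lines[a - 1:b]) for a, b in ranges]

SAMPLE = '''import os

def a():
    return 1

def b():
    x = 1
    y = 2
    return x + y

class C:
    def m(self):
        return 3

    def n(self):
        return 4
'''

chunks = chunk_code(SAMPLE, budget=8)
for i, chunk in enumerate(chunks):
    print(f"--- chunk {i} ({count_tokens(chunk)} tokens) ---")
    print(chunk, end="")
```

Working on line spans rather than re-serializing nodes is what keeps formatting intact in this sketch: the chunks concatenate back to the original source. A statement with no child statements that still exceeds the budget is emitted whole, which is one possible policy among several.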

What it’s useful for:

  - Embedding-based code search

  - RAG (retrieval-augmented generation) over codebases

  - Long-context analysis of code

  - Preparing repos for fine-tuning or pretraining

Try it out:

  - Open source package: https://docs.chonkie.ai/chunkers/code-chunker

  - Hosted playground (free with account): https://cloud.chonkie.ai

Happy Chonking!
