Show HN: Fast and Quality Code Chunking with Chonkie

1•snyy•10mo ago
Hi HN,

We’re Chonkie (https://github.com/chonkie-inc/chonkie) — we build open source tools that help split documents into meaningful chunks for use with AI models.

When you use LLMs over large documents or codebases, you often need to break them into smaller parts to fit the model’s context window. Our chunkers split along structural and semantic boundaries rather than at arbitrary character offsets, so each chunk stays coherent on its own and only the most relevant pieces need to be passed into the model. That helps reduce hallucinations and improves accuracy.

Today we’re launching our Code Chunker — a fast, structure-aware way to break down source code into high-quality, token-aware chunks.

How it works:

(See the code: https://github.com/chonkie-inc/chonkie/blob/main/src/chonkie...)

Code Chunker uses tree-sitter (https://tree-sitter.github.io/tree-sitter/) to parse your code into an abstract syntax tree (AST). It then recursively merges and groups nodes in a way that respects both code structure and token limits.

It supports all languages that tree-sitter supports, and is designed to preserve formatting and semantics. Large functions or class definitions won’t be split in the middle of a block — instead, we dive recursively into the AST to produce clean, coherent chunks that fit your configured token budget.
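To make the recursive idea concrete, here is a minimal sketch of the same approach. It is not Chonkie's implementation: it swaps tree-sitter for Python's built-in `ast` module so it runs with no dependencies (and therefore handles Python source only), and it counts whitespace-separated words instead of real tokens. The names `count_tokens`, `first_line`, `chunk_nodes`, and `chunk_code` are hypothetical and exist only for this illustration.

```python
import ast


def count_tokens(text):
    # Assumption: whitespace-separated words stand in for a real tokenizer.
    return len(text.split())


def first_line(node):
    # Decorators sit above the def/class line, so a node starts at its
    # earliest decorator when it has one. Returned index is 0-based.
    decorators = getattr(node, "decorator_list", [])
    return min([node.lineno] + [d.lineno for d in decorators]) - 1


def chunk_nodes(nodes, lines, start, end, budget):
    # Tile lines[start:end] so each sibling owns everything up to the next
    # sibling; comments and blank lines stay attached to a neighbouring node.
    starts = [start] + [first_line(n) for n in nodes[1:]]
    spans = zip(nodes, starts, starts[1:] + [end])

    chunks = []
    cur_start = cur_end = None

    def flush():
        nonlocal cur_start, cur_end
        if cur_start is not None:
            chunks.append("".join(lines[cur_start:cur_end]))
        cur_start = cur_end = None

    for node, s, e in spans:
        size = count_tokens("".join(lines[s:e]))
        children = [c for c in getattr(node, "body", []) if isinstance(c, ast.stmt)]
        if size > budget and children:
            # Too big for one chunk: recurse into the node's body.
            flush()
            chunks.extend(chunk_nodes(children, lines, s, e, budget))
        elif cur_start is not None and count_tokens("".join(lines[cur_start:e])) <= budget:
            # Still fits: merge this sibling into the current chunk.
            cur_end = e
        else:
            # Start a new chunk (an unsplittable oversized node ends up alone).
            flush()
            cur_start, cur_end = s, e
    flush()
    return chunks


def chunk_code(source, budget):
    lines = source.splitlines(keepends=True)
    body = ast.parse(source).body
    if not body:
        # Nothing to parse into statements (empty or comment-only source).
        return [source] if source else []
    return chunk_nodes(body, lines, 0, len(lines), budget)
```

Calling `chunk_code(source, budget=...)` returns a list of strings that concatenate back to the original source. In this sketch a node with no statement body that exceeds the budget is emitted as a single oversized chunk rather than being cut mid-statement.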

What it’s useful for:

  - Embedding-based code search

  - RAG (retrieval-augmented generation) over codebases

  - Long-context analysis of code

  - Preparing repos for fine-tuning or pretraining

Try it out:

  - Open source package: https://docs.chonkie.ai/chunkers/code-chunker

  - Hosted playground (free with account): https://cloud.chonkie.ai

Happy Chonking!
