Show HN: Fast and Quality Code Chunking with Chonkie

1•snyy•1y ago
Hi HN,

We’re Chonkie (https://github.com/chonkie-inc/chonkie). We build open source tools that split documents into meaningful chunks for use with AI models.

When you use LLMs over large documents or codebases, you often need to break them into smaller parts to fit the model’s context window. Our chunkers split along the document’s own structure, so each chunk keeps its meaning and only the most relevant pieces are passed to the model. This reduces hallucinations and improves performance and accuracy.

Today we’re launching our Code Chunker: a fast, structure-aware way to break source code into high-quality chunks that respect a token budget.

How it works:

(See the code: https://github.com/chonkie-inc/chonkie/blob/main/src/chonkie...)

Code Chunker uses tree-sitter (https://tree-sitter.github.io/tree-sitter/) to parse your code into an abstract syntax tree (AST). It then recursively merges and groups nodes in a way that respects both code structure and token limits.

It supports all languages that tree-sitter supports and is designed to preserve formatting and semantics. Large functions or class definitions aren’t split in the middle of a block. Instead, we recurse into the AST to produce clean, coherent chunks that fit your configured token budget.
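To make the idea concrete, here is a minimal sketch of recursive, token-bounded AST chunking. It is not Chonkie's implementation: it swaps tree-sitter for Python's standard `ast` module so it runs without dependencies (and therefore only handles Python source), and it counts whitespace-separated words instead of using a real tokenizer. The names `count_tokens`, `first_line`, `split_nodes` and `chunk_code` are hypothetical.

```python
import ast


def count_tokens(text):
    # Assumption: whitespace-separated words stand in for a real tokenizer.
    return len(text.split())


def first_line(node):
    # 0-based first line of a statement, including any decorators above it.
    decorators = getattr(node, "decorator_list", [])
    return min([node.lineno] + [d.lineno for d in decorators]) - 1


def split_nodes(nodes, lines, start, end, limit):
    """Cover lines[start:end] with gapless (start, end) line spans.

    Each sibling owns the lines from its own start up to the next sibling,
    so comments and blank lines are never dropped. A span over the limit is
    split by recursing into the node's body; a statement with no body is
    kept whole even if it is oversized.
    """
    starts = [start] + [first_line(n) for n in nodes[1:]]
    ends = starts[1:] + [end]
    spans = []
    for node, s, e in zip(nodes, starts, ends):
        if s >= e:
            continue  # several statements on one line share a span
        body = getattr(node, "body", None)
        too_big = count_tokens("".join(lines[s:e])) > limit
        if too_big and isinstance(body, list) and body:
            spans.extend(split_nodes(body, lines, s, e, limit))
        else:
            spans.append((s, e))
    return spans


def chunk_code(source, limit):
    """Split Python source into chunks of at most `limit` tokens where possible."""
    lines = source.splitlines(keepends=True)
    tree = ast.parse(source)
    if not tree.body:
        return [source] if source else []
    chunks, current = [], ""
    # Greedily merge adjacent spans while the merged text stays within budget.
    for s, e in split_nodes(tree.body, lines, 0, len(lines), limit):
        piece = "".join(lines[s:e])
        if current and count_tokens(current + piece) > limit:
            chunks.append(current)
            current = piece
        else:
            current += piece
    if current:
        chunks.append(current)
    return chunks
```

Because every span is a contiguous run of whole lines, joining the chunks back together reproduces the input exactly, which is one simple way to get the "preserve formatting" property. A class that exceeds the budget is split between its methods rather than inside one, and the `class` header line travels with the first method.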

What it’s useful for:

  - Embedding-based code search

  - RAG (retrieval-augmented generation) over codebases

  - Long-context analysis of code

  - Preparing repos for fine-tuning or pretraining

Try it out:

  - Open source package: https://docs.chonkie.ai/chunkers/code-chunker

  - Hosted playground (free with account): https://cloud.chonkie.ai

Happy Chonking!
