Show HN: Fast and Quality Code Chunking with Chonkie

1•snyy•1y ago
Hi HN,

We’re Chonkie (https://github.com/chonkie-inc/chonkie) — we build open source tools that help split documents into meaningful chunks for use with AI models.

When you use LLMs over large documents or codebases, you often need to break them into smaller parts to fit the model’s context window. Our chunkers split along structural and semantic boundaries, so each chunk stays coherent and only the most relevant pieces need to be passed to the model. In our experience this reduces hallucinations and improves accuracy.

Today we’re launching our Code Chunker — a fast, structure-aware way to break down source code into high-quality, token-aware chunks.

How it works:

(See the code: https://github.com/chonkie-inc/chonkie/blob/main/src/chonkie...)

Code Chunker uses tree-sitter (https://tree-sitter.github.io/tree-sitter/) to parse your code into an abstract syntax tree (AST). It then recursively merges and groups nodes in a way that respects both code structure and token limits.

It supports all languages that tree-sitter supports, and is designed to preserve formatting and semantics. Large functions or class definitions won’t be split in the middle of a block — instead, we dive recursively into the AST to produce clean, coherent chunks that fit your configured token budget.
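To make the recursive idea concrete, here is a minimal sketch. It is not Chonkie's implementation. It assumes Python's built-in `ast` module as a stand-in for tree-sitter and a whitespace word count as a stand-in for a real tokenizer, and the names `count_tokens`, `split_spans` and `chunk_code` are made up for illustration. Top-level nodes are kept whole when they fit the budget, oversized blocks are split along their child statements, and adjacent pieces are then merged greedily up to the budget.

```python
import ast


def count_tokens(text: str) -> int:
    # Assumption: whitespace-separated words stand in for real tokenizer tokens.
    return len(text.split())


def split_spans(stmts, start, end, lines, budget):
    """Cover lines[start:end] with contiguous (start, stop) spans.

    Each span ends where a statement ends. A statement whose span is over
    budget is split along its child statements when it has a body.
    """
    spans = []
    cursor = start
    for i, stmt in enumerate(stmts):
        # The last statement absorbs any trailing lines of the enclosing span.
        stop = end if i == len(stmts) - 1 else stmt.end_lineno
        if stop <= cursor:
            continue  # several statements on one line: already covered
        text = "\n".join(lines[cursor:stop])
        body = getattr(stmt, "body", None)
        if count_tokens(text) > budget and isinstance(body, list) and body:
            # Too large for one chunk: recurse into the block's statements.
            spans.extend(split_spans(body, cursor, stop, lines, budget))
        else:
            # Fits, or is a leaf that cannot be split further in this sketch.
            spans.append((cursor, stop))
        cursor = stop
    return spans


def chunk_code(source: str, budget: int) -> list[str]:
    lines = source.splitlines()
    tree = ast.parse(source)
    spans = split_spans(tree.body, 0, len(lines), lines, budget)

    chunks, current = [], ""
    for start, stop in spans:
        piece = "\n".join(lines[start:stop])
        candidate = piece if not current else current + "\n" + piece
        if current and count_tokens(candidate) > budget:
            chunks.append(current)  # merging would exceed the budget
            current = piece
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    demo = "import os\n\ndef f(a, b):\n    return a + b\n\ndef g(x):\n    return x * 2\n"
    for n, chunk in enumerate(chunk_code(demo, budget=8)):
        print(f"--- chunk {n} ({count_tokens(chunk)} tokens) ---")
        print(chunk)
```

Because the spans are contiguous, joining the chunks with newlines gives back the original text, which is one simple way to preserve formatting. A single statement that is larger than the budget and has no body stays oversized in this sketch, and so does the tail of a block that contains `else` or `except` clauses. A real chunker would need a fallback for those cases.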

What it’s useful for:

  - Embedding-based code search

  - RAG (retrieval-augmented generation) over codebases

  - Long-context analysis of code

  - Preparing repos for fine-tuning or pretraining

Try it out:

  - Open source package: https://docs.chonkie.ai/chunkers/code-chunker

  - Hosted playground (free with account): https://cloud.chonkie.ai

Happy Chonking!
