Show HN: Fast and Quality Code Chunking with Chonkie

1•snyy•1y ago
Hi HN,

We’re Chonkie (https://github.com/chonkie-inc/chonkie) — we build open source tools that help split documents into meaningful chunks for use with AI models.

When you use LLMs over large documents or codebases, you often need to break them into smaller parts to fit the model’s context window. Our chunkers split along the document’s own structure, so each chunk stays meaningful on its own and only the most relevant pieces need to be passed into the model. This reduces hallucinations and improves accuracy.

Today we’re launching our Code Chunker — a fast, structure-aware way to break down source code into high-quality, token-aware chunks.

How it works:

(See the code: https://github.com/chonkie-inc/chonkie/blob/main/src/chonkie...)

Code Chunker uses tree-sitter (https://tree-sitter.github.io/tree-sitter/) to parse your code into an abstract syntax tree (AST). It then recursively merges and groups nodes in a way that respects both code structure and token limits.

It supports all languages that tree-sitter supports, and is designed to preserve formatting and semantics. Functions or class definitions that exceed the token budget aren’t cut at arbitrary points; instead, we descend recursively into the AST and split at node boundaries to produce clean, coherent chunks that fit your configured token budget.
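To make the merge-and-recurse idea concrete, here is a minimal sketch. It is not Chonkie's implementation: it assumes Python's built-in `ast` module in place of tree-sitter (so it only handles Python source), counts whitespace-separated words in place of a real tokenizer, and the names `chunk_code`, `count_tokens` and `_split` are made up for illustration.

```python
import ast

def count_tokens(text):
    # Assumption: whitespace-separated words stand in for a real tokenizer.
    return len(text.split())

def _tokens(lines, start, end):
    # Token count of the 1-indexed, inclusive line range start..end.
    return count_tokens("\n".join(lines[start - 1:end]))

def _split(nodes, start, end, lines, max_tokens):
    """Cover lines start..end with contiguous (start, end) spans."""
    if not nodes:
        return [(start, end)] if start <= end else []
    # Give each node a span that also absorbs the gap before it
    # (blank lines, comments, decorators), so no source text is lost.
    pieces, cursor = [], start
    for i, node in enumerate(nodes):
        last = end if i == len(nodes) - 1 else node.end_lineno
        pieces.append((cursor, last, node))
        cursor = last + 1
    spans, cur = [], None
    for s, e, node in pieces:
        if _tokens(lines, s, e) > max_tokens:
            # Too big for one chunk: flush what we have, then recurse
            # into the node's body (if it has one).
            if cur:
                spans.append(cur)
                cur = None
            children = list(getattr(node, "body", []))
            if children:
                spans.extend(_split(children, s, e, lines, max_tokens))
            else:
                spans.append((s, e))  # oversized leaf: kept whole
        elif cur and _tokens(lines, cur[0], e) <= max_tokens:
            cur = (cur[0], e)  # merge with the previous sibling(s)
        else:
            if cur:
                spans.append(cur)
            cur = (s, e)
    if cur:
        spans.append(cur)
    return spans

def chunk_code(source, max_tokens):
    lines = source.splitlines()
    tree = ast.parse(source)
    spans = _split(tree.body, 1, len(lines), lines, max_tokens)
    return ["\n".join(lines[s - 1:e]) for s, e in spans if s <= e]

SAMPLE = "\n".join([
    "import os",
    "",
    "def small_a():",
    "    return 1",
    "",
    "def small_b():",
    "    return 2",
    "",
    "class Big:",
    "    def method_one(self):",
    "        x = 1",
    "        y = 2",
    "        return x + y",
    "",
    "    def method_two(self):",
    "        a = 3",
    "        b = 4",
    "        return a * b",
])

chunks = chunk_code(SAMPLE, max_tokens=15)
for chunk in chunks:
    print(count_tokens(chunk), "tokens")
    print(chunk)
    print("-" * 20)
```

In this sketch the small top-level definitions are merged into one chunk, while the oversized class is split between its methods rather than mid-method. Because the spans are contiguous, joining the chunks with newlines gives back the original source.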

What it’s useful for:

  - Embedding-based code search

  - RAG (retrieval-augmented generation) over codebases

  - Long-context analysis of code

  - Preparing repos for fine-tuning or pretraining

Try it out:

  - Open source package: https://docs.chonkie.ai/chunkers/code-chunker

  - Hosted playground (free with account): https://cloud.chonkie.ai

Happy Chonking!
