Show HN: Fast and Quality Code Chunking with Chonkie

1•snyy•12mo ago
Hi HN,

We’re Chonkie (https://github.com/chonkie-inc/chonkie) — we build open source tools that help split documents into meaningful chunks for use with AI models.

When you use LLMs over large documents or codebases, you often need to break them into smaller parts to fit the model’s context window. Our chunkers split along structural and semantic boundaries, so each chunk stays self-contained and only the most relevant pieces need to be passed into the model. Less irrelevant context tends to mean fewer hallucinations and more accurate answers.

Today we’re launching our Code Chunker — a fast, structure-aware way to break down source code into high-quality, token-aware chunks.

How it works:

(See the code: https://github.com/chonkie-inc/chonkie/blob/main/src/chonkie...)

Code Chunker uses tree-sitter (https://tree-sitter.github.io/tree-sitter/) to parse your code into an abstract syntax tree (AST). It then recursively merges and groups nodes in a way that respects both code structure and token limits.

It supports all languages that tree-sitter supports, and is designed to preserve formatting and semantics. Large functions or class definitions aren’t cut at arbitrary points in the middle of a block. When one exceeds the token budget, we recurse into its child nodes in the AST and split along those boundaries, producing clean, coherent chunks that fit your configured token budget.
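To make the merge-and-recurse idea concrete, here is a minimal sketch. It is not Chonkie's implementation. It assumes Python's built-in `ast` module as a stand-in for tree-sitter (so it only handles Python source) and a whitespace word count as a stand-in for a real tokenizer. The names `count_tokens`, `sibling_spans`, `chunk_lines`, and `chunk_code` are hypothetical.

```python
import ast  # end_lineno on nodes requires Python 3.8+


def count_tokens(text: str) -> int:
    # Assumption: whitespace-separated words stand in for a real tokenizer.
    return len(text.split())


def span_text(lines, first, last):
    # Line numbers are 1-indexed and inclusive, matching the ast module.
    return "".join(lines[first - 1:last])


def sibling_spans(nodes, first, last):
    """Give each sibling a contiguous line range so no source line is dropped."""
    spans, cursor = [], first
    for i, node in enumerate(nodes):
        end = last if i == len(nodes) - 1 else node.end_lineno
        if end < cursor:  # node shares its line with the previous sibling
            continue
        spans.append((node, cursor, end))
        cursor = end + 1
    return spans


def chunk_lines(nodes, first, last, lines, budget):
    """Greedily merge sibling spans; recurse into any span over budget."""
    chunks, group = [], None
    for node, start, end in sibling_spans(nodes, first, last):
        if count_tokens(span_text(lines, start, end)) > budget:
            if group:
                chunks.append(group)
                group = None
            children = [c for c in ast.iter_child_nodes(node)
                        if isinstance(c, ast.stmt)]
            if children:
                chunks.extend(chunk_lines(children, start, end, lines, budget))
            else:
                chunks.append((start, end))  # oversized leaf: emit unsplit
        elif group and count_tokens(span_text(lines, group[0], end)) <= budget:
            group = (group[0], end)  # still fits: grow the current chunk
        else:
            if group:
                chunks.append(group)
            group = (start, end)
    if group:
        chunks.append(group)
    return chunks


def chunk_code(source, budget):
    """Split Python source into chunks of at most `budget` tokens where possible."""
    lines = source.splitlines(keepends=True)
    body = ast.parse(source).body
    if not body:
        return [source] if source else []
    ranges = chunk_lines(body, 1, len(lines), lines, budget)
    return [span_text(lines, a, b) for a, b in ranges]


SAMPLE = '''import os

def small(a, b):
    return a + b

class Big:
    def first(self):
        x = 1
        y = 2
        return x + y

    def second(self):
        return "second"
'''

for number, chunk in enumerate(chunk_code(SAMPLE, budget=14), start=1):
    print(f"--- chunk {number} ({count_tokens(chunk)} tokens) ---")
    print(chunk, end="")
```

Because every sibling is assigned a contiguous line range, concatenating the chunks reproduces the input exactly, including blank lines and comments between definitions. A statement with no nested statements that is still over budget is emitted unsplit in this sketch; a production chunker would need a fallback for that case.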

What it’s useful for:

  - Embedding-based code search

  - RAG (retrieval-augmented generation) over codebases

  - Long-context analysis of code

  - Preparing repos for fine-tuning or pretraining

Try it out:

  - Open source package: https://docs.chonkie.ai/chunkers/code-chunker

  - Hosted playground (free with account): https://cloud.chonkie.ai

Happy Chonking!
