Show HN: Fast and Quality Code Chunking with Chonkie

1•snyy•11mo ago

Hi HN,

We’re Chonkie (https://github.com/chonkie-inc/chonkie) — we build open source tools that help split documents into meaningful chunks for use with AI models.

When you use LLMs over large documents or codebases, you often need to break them into smaller parts to fit the model’s context window. Our chunkers split along the document’s own structure rather than at arbitrary cut points, so each chunk keeps its meaning and only the most relevant pieces are passed into the model. This helps reduce hallucinations and confusion, and improves performance and accuracy.

Today we’re launching our Code Chunker — a fast, structure-aware way to break down source code into high-quality, token-aware chunks.

How it works:

(See the code: https://github.com/chonkie-inc/chonkie/blob/main/src/chonkie...)

Code Chunker uses tree-sitter (https://tree-sitter.github.io/tree-sitter/) to parse your code into an abstract syntax tree (AST). It then recursively merges and groups nodes in a way that respects both code structure and token limits.

It supports all languages that tree-sitter supports, and is designed to preserve formatting and semantics. Large functions or class definitions won’t be split in the middle of a block — instead, we dive recursively into the AST to produce clean, coherent chunks that fit your configured token budget.
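To make the recursive merge-and-split idea concrete, here is a minimal sketch. It is not Chonkie's implementation: it swaps tree-sitter for Python's built-in `ast` module (so it only handles Python source), and it assumes whitespace-separated words as a stand-in for a real tokenizer. The names `chunk_code` and `count_tokens` are hypothetical.

```python
import ast


def count_tokens(text: str) -> int:
    # Assumption: whitespace-separated words stand in for a real tokenizer.
    return len(text.split())


def chunk_code(source: str, budget: int) -> list[str]:
    """Greedily group sibling statements under `budget` tokens,
    descending into a statement's body only when it is too large."""
    lines = source.splitlines(keepends=True)
    tree = ast.parse(source)
    if not tree.body:
        return [source] if source else []

    def text(lo: int, hi: int) -> str:
        # Source text for the 1-indexed, inclusive line range lo..hi.
        return "".join(lines[lo - 1:hi])

    def split(nodes, lo, hi):
        # Give each sibling every line from the previous sibling's end up
        # to its own end, so comments and blank lines are never dropped.
        spans, start = [], lo
        for i, node in enumerate(nodes):
            end = hi if i == len(nodes) - 1 else node.end_lineno
            spans.append((node, start, end))
            start = end + 1

        out, cur = [], None  # cur is the (start, end) group being built
        for node, s, e in spans:
            if cur and count_tokens(text(cur[0], e)) <= budget:
                cur = (cur[0], e)  # merge this sibling into the group
                continue
            if cur:
                out.append(cur)
                cur = None
            if count_tokens(text(s, e)) <= budget:
                cur = (s, e)
                continue
            # Too large on its own: recurse into its body if it has one.
            body = [c for c in getattr(node, "body", [])
                    if isinstance(c, ast.stmt)]
            out.extend(split(body, s, e) if body else [(s, e)])
        if cur:
            out.append(cur)
        return out

    spans = split(tree.body, 1, len(lines))
    return [text(s, e) for s, e in spans if e >= s]


SAMPLE = '''import os

def small():
    return 1

class Big:
    def a(self):
        x = 1
        return x

    def b(self):
        y = 2
        return y
'''

chunks = chunk_code(SAMPLE, budget=10)
for i, chunk in enumerate(chunks):
    print(f"--- chunk {i} ({count_tokens(chunk)} tokens) ---")
    print(chunk, end="")
```

Because every line is assigned to exactly one span, joining the chunks reproduces the original source. A statement with no body to descend into is emitted as a single oversized chunk in this sketch; a real chunker would need a fallback for that case.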

What it’s useful for:

  - Embedding-based code search

  - RAG (retrieval-augmented generation) over codebases

  - Long-context analysis of code

  - Preparing repos for fine-tuning or pretraining

Try it out:

  - Open source package: https://docs.chonkie.ai/chunkers/code-chunker

  - Hosted playground (free with account): https://cloud.chonkie.ai

Happy Chonking!
