Show HN: Fast and Quality Code Chunking with Chonkie

1•snyy•1y ago
Hi HN,

We’re Chonkie (https://github.com/chonkie-inc/chonkie) — we build open source tools that help split documents into meaningful chunks for use with AI models.

When you use LLMs over large documents or codebases, you often need to break them into smaller parts to fit the model’s context window. Our chunkers split along structural and semantic boundaries, so each chunk stays coherent and only the most relevant pieces need to be passed to the model. That helps reduce hallucinations and improves accuracy.

Today we’re launching our Code Chunker — a fast, structure-aware way to break down source code into high-quality, token-aware chunks.

How it works:

(See the code: https://github.com/chonkie-inc/chonkie/blob/main/src/chonkie...)

Code Chunker uses tree-sitter (https://tree-sitter.github.io/tree-sitter/) to parse your code into an abstract syntax tree (AST). It then recursively merges and groups nodes in a way that respects both code structure and token limits.

It supports all languages that tree-sitter supports, and is designed to preserve formatting and semantics. Large functions or class definitions won’t be split in the middle of a block — instead, we dive recursively into the AST to produce clean, coherent chunks that fit your configured token budget.
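
As a rough illustration of the recursive grouping described above, here is a minimal Python sketch. It is not Chonkie's implementation: it swaps tree-sitter for Python's built-in `ast` module so it runs without extra dependencies, it counts tokens by whitespace splitting instead of using a real tokenizer, and the names `chunk_code`, `count_tokens`, and `node_start` are made up for this example.

```python
import ast


def count_tokens(text: str) -> int:
    """Hypothetical token counter: whitespace-separated words, not a real tokenizer."""
    return len(text.split())


def node_start(node: ast.stmt) -> int:
    """First source line of a statement, including any decorators above it."""
    decorators = getattr(node, "decorator_list", [])
    return min([node.lineno] + [d.lineno for d in decorators])


def chunk_code(source: str, max_tokens: int) -> list[str]:
    """Split Python source into chunks that follow statement boundaries."""
    lines = source.splitlines(keepends=True)
    tree = ast.parse(source)
    if not tree.body:  # nothing but comments or whitespace
        return [source] if source.strip() else []

    def text(start: int, end: int) -> str:
        return "".join(lines[start - 1:end])  # 1-based, inclusive line range

    def split(nodes: list[ast.stmt], start: int, end: int) -> list[str]:
        # Keep one node per start line so the line spans never overlap.
        siblings: list[ast.stmt] = []
        for n in nodes:
            if not siblings or node_start(n) > node_start(siblings[-1]):
                siblings.append(n)

        chunks: list[str] = []
        current = ""
        for i, node in enumerate(siblings):
            # A node's span runs up to the line before the next sibling, so
            # blank lines and comments stay attached and no text is lost.
            s = start if i == 0 else node_start(node)
            e = end if i == len(siblings) - 1 else node_start(siblings[i + 1]) - 1
            piece = text(s, e)

            # Merge step: keep growing the current chunk while it fits.
            if count_tokens(current + piece) <= max_tokens:
                current += piece
                continue
            if current:
                chunks.append(current)
                current = ""

            children = [c for c in ast.iter_child_nodes(node) if isinstance(c, ast.stmt)]
            if count_tokens(piece) <= max_tokens or not children:
                current = piece  # fits alone, or cannot be split any further
            else:
                # Recursive step: emit the header, then chunk the child statements.
                first = node_start(children[0])
                header = text(s, first - 1)
                if header:
                    chunks.append(header)
                chunks.extend(split(children, first, e))
        if current:
            chunks.append(current)
        return chunks

    return split(tree.body, 1, len(lines))


SAMPLE = '''import os

def small(a, b):
    return a + b

class Big:
    def first(self):
        x = 1
        return x

    def second(self):
        y = 2
        return y
'''

chunks = chunk_code(SAMPLE, max_tokens=10)
for number, chunk in enumerate(chunks, 1):
    print(f"--- chunk {number} ({count_tokens(chunk)} tokens) ---")
    print(chunk, end="")
```

With a budget of 10 whitespace tokens, the import and the small function are merged into one chunk, while the class is too large and is split into its header plus one chunk per method. Joining the chunks gives back the original source unchanged.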

What it’s useful for:

  - Embedding-based code search

  - RAG (retrieval-augmented generation) over codebases

  - Long-context analysis of code

  - Preparing repos for fine-tuning or pretraining

Try it out:

  - Open source package: https://docs.chonkie.ai/chunkers/code-chunker

  - Hosted playground (free with account): https://cloud.chonkie.ai

Happy Chonking!
