Show HN: Fast and Quality Code Chunking with Chonkie

1•snyy•9mo ago

Hi HN,

We’re Chonkie (https://github.com/chonkie-inc/chonkie). We build open source tools that split documents into meaningful chunks for use with AI models.

When you run LLMs over large documents or codebases, you often need to break them into smaller parts that fit the model’s context window. Our chunkers split along structural and semantic boundaries, so each chunk stays coherent on its own and only the most relevant pieces are passed to the model. This helps reduce hallucinations and improves accuracy.

Today we’re launching our Code Chunker: a fast, structure-aware way to break source code into high-quality, token-aware chunks.

How it works:

(See the code: https://github.com/chonkie-inc/chonkie/blob/main/src/chonkie...)

Code Chunker uses tree-sitter (https://tree-sitter.github.io/tree-sitter/) to parse your code into an abstract syntax tree (AST). It then recursively merges and groups nodes in a way that respects both code structure and token limits.

It supports all languages that tree-sitter supports, and is designed to preserve formatting and semantics. Large functions or class definitions won’t be split in the middle of a block. Instead, we descend recursively into the AST to produce clean, coherent chunks that fit your configured token budget.
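
To make the merge-and-recurse idea concrete, here is a minimal sketch. It is not Chonkie's implementation: it swaps tree-sitter for Python's built-in ast module (so it only handles Python source), and it counts whitespace-separated words instead of real model tokens. The names count_tokens, chunk_span, and chunk_code are hypothetical and exist only for this illustration.

```python
import ast


def count_tokens(text: str) -> int:
    # Assumption: a crude stand-in for a real tokenizer (whitespace-separated words).
    return len(text.split())


def chunk_span(children, lines, start, end, limit):
    """Split lines[start:end] into chunks, grouping sibling AST nodes greedily."""
    # Give every child a contiguous slice of lines so no text is dropped.
    # Text before a child (blank lines, comments, a parent's header) joins that child.
    pieces, cursor = [], start
    for i, child in enumerate(children):
        stop = end if i == len(children) - 1 else child.end_lineno
        pieces.append((child, cursor, stop))
        cursor = stop

    chunks, current = [], ""
    for child, lo, hi in pieces:
        text = "".join(lines[lo:hi])
        if count_tokens(current + text) <= limit:
            current += text  # merge with the previous siblings
            continue
        if current:
            chunks.append(current)
            current = ""
        if count_tokens(text) <= limit:
            current = text
            continue
        # The node alone is over budget: descend into its body statements.
        inner = getattr(child, "body", [])
        if inner:
            chunks.extend(chunk_span(inner, lines, lo, hi, limit))
        else:
            chunks.append(text)  # nothing left to split, emit as-is
    if current:
        chunks.append(current)
    return chunks


def chunk_code(source: str, limit: int):
    lines = source.splitlines(keepends=True)
    tree = ast.parse(source)
    if not tree.body:
        return [source] if source else []
    return chunk_span(tree.body, lines, 0, len(lines), limit)


SAMPLE = '''import os

def small():
    return 1

def big(x):
    a = x + 1
    b = a * 2
    c = b - 3
    return a + b + c
'''

chunks = chunk_code(SAMPLE, limit=12)
for i, chunk in enumerate(chunks):
    print(f"--- chunk {i} ({count_tokens(chunk)} tokens) ---")
    print(chunk, end="")
```

In this sketch, small neighbours such as the import and small() are merged into one chunk, while big() is over budget and gets split between statements rather than mid-statement. Joining the chunks reproduces the input exactly, which is one simple way to check that formatting is preserved. The sketch only descends into a node's main body, so an oversized else or except branch would come out as a single chunk.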

What it’s useful for:

  - Embedding-based code search

  - RAG (retrieval-augmented generation) over codebases

  - Long-context analysis of code

  - Preparing repos for fine-tuning or pretraining

Try it out:

  - Open source package: https://docs.chonkie.ai/chunkers/code-chunker

  - Hosted playground (free with account): https://cloud.chonkie.ai

Happy Chonking!
