Show HN: Fast and Quality Code Chunking with Chonkie

1•snyy•11mo ago

Hi HN,

We’re Chonkie (https://github.com/chonkie-inc/chonkie). We build open-source tools that split documents into meaningful chunks for use with AI models.

When you use LLMs over large documents or codebases, you often need to break them into smaller parts to fit the model’s context window. Our chunkers split along structural and semantic boundaries, so each chunk stays coherent on its own and only the most relevant pieces need to be passed into the model. This helps reduce hallucinations and confusion, and improves accuracy.

Today we’re launching our Code Chunker — a fast, structure-aware way to break down source code into high-quality, token-aware chunks.

How it works:

(See the code: https://github.com/chonkie-inc/chonkie/blob/main/src/chonkie...)

Code Chunker uses tree-sitter (https://tree-sitter.github.io/tree-sitter/) to parse your code into an abstract syntax tree (AST). It then recursively merges and groups nodes in a way that respects both code structure and token limits.

It supports all languages that tree-sitter supports, and is designed to preserve formatting and semantics. Large functions or class definitions won’t be split in the middle of a block — instead, we dive recursively into the AST to produce clean, coherent chunks that fit your configured token budget.
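To make the recursive idea concrete, here is a minimal sketch of the same shape of algorithm. It is not Chonkie's implementation: it swaps tree-sitter for Python's standard-library `ast` module (so it only handles Python source), and it budgets by character count instead of tokens. The names `chunk_code`, `max_chars`, and `SAMPLE` are made up for illustration.

```python
import ast


def chunk_code(source: str, max_chars: int) -> list[str]:
    """Split Python source into chunks of at most max_chars, on statement boundaries."""
    lines = source.splitlines(keepends=True)

    def size(start: int, end: int) -> int:
        return sum(len(line) for line in lines[start:end])

    def split(stmts: list, start: int, end: int) -> list[tuple[int, int]]:
        # Cover lines[start:end] with spans that each end where a statement ends.
        if not stmts:
            return [(start, end)] if end > start else []
        spans = []
        cursor = start
        for i, node in enumerate(stmts):
            # The last statement absorbs trailing lines (else branches, blank lines).
            node_end = end if i == len(stmts) - 1 else node.end_lineno
            if node_end <= cursor:
                continue  # e.g. several statements sharing one line
            body = getattr(node, "body", None)
            if size(cursor, node_end) > max_chars and isinstance(body, list) and body:
                # Too big: descend into the block instead of cutting it blindly.
                spans.extend(split(body, cursor, node_end))
            else:
                # Fits, or is a leaf that cannot be split further (kept whole).
                spans.append((cursor, node_end))
            cursor = node_end
        return spans

    # Greedily merge neighbouring spans while the result still fits the budget.
    merged: list[tuple[int, int]] = []
    for start, end in split(ast.parse(source).body, 0, len(lines)):
        if merged and size(merged[-1][0], end) <= max_chars:
            merged[-1] = (merged[-1][0], end)
        else:
            merged.append((start, end))
    return ["".join(lines[s:e]) for s, e in merged]


SAMPLE = '''import os

def small():
    return 1

class Big:
    def a(self):
        return "a" * 10

    def b(self):
        return "b" * 10
'''

chunks = chunk_code(SAMPLE, max_chars=60)
for chunk in chunks:
    print(repr(chunk))
```

Because every span starts where the previous one ended, joining the chunks reproduces the input exactly, which is one simple way to preserve formatting. A statement that is larger than the budget and has no block to descend into is emitted whole in this sketch, so a chunk can exceed `max_chars` in that case.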

What it’s useful for:

  - Embedding-based code search
  - RAG (retrieval-augmented generation) over codebases
  - Long-context analysis of code
  - Preparing repos for fine-tuning or pretraining

Try it out:

  - Open source package: https://docs.chonkie.ai/chunkers/code-chunker
  - Hosted playground (free with account): https://cloud.chonkie.ai

Happy Chonking!
