Show HN: Fast and Quality Code Chunking with Chonkie

1•snyy•10mo ago

Hi HN,

We’re Chonkie (https://github.com/chonkie-inc/chonkie) — we build open source tools that help split documents into meaningful chunks for use with AI models.

When you use LLMs over large documents or codebases, you often need to break them into smaller parts to fit the model’s context window. Our chunkers split along the document’s own structure and keep related content together, so retrieval can pass only the most relevant pieces to the model. The aim is fewer hallucinations, less confusion from fragmented context, and better accuracy.

Today we’re launching our Code Chunker — a fast, structure-aware way to break down source code into high-quality, token-aware chunks.

How it works:

(See the code: https://github.com/chonkie-inc/chonkie/blob/main/src/chonkie...)

Code Chunker uses tree-sitter (https://tree-sitter.github.io/tree-sitter/) to parse your code into an abstract syntax tree (AST). It then recursively merges and groups nodes in a way that respects both code structure and token limits.

It supports all languages that tree-sitter supports, and is designed to preserve formatting and semantics. Large functions or class definitions won’t be split in the middle of a block — instead, we dive recursively into the AST to produce clean, coherent chunks that fit your configured token budget.
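To make the recursive idea concrete, here is a minimal sketch of structure-aware chunking. It is not Chonkie's implementation: it swaps tree-sitter for Python's standard-library `ast` module (so it only handles Python source), counts whitespace-separated words instead of real model tokens, and the names `count_tokens`, `leaf_spans`, and `chunk_code` are hypothetical helpers invented for this example.

```python
import ast


def count_tokens(text: str) -> int:
    # Assumption: whitespace-separated words stand in for a real tokenizer.
    return len(text.split())


def leaf_spans(nodes, start, end, lines, budget):
    """Return (start, end) line spans (0-based, end exclusive) covering [start, end)."""
    if not nodes:
        return [(start, end)] if end > start else []
    spans, cursor = [], start
    for i, node in enumerate(nodes):
        # A span runs from the end of the previous node to the end of this one,
        # so blank lines, comments, and decorators travel with the node they precede.
        stop = end if i == len(nodes) - 1 else node.end_lineno
        if stop <= cursor:
            continue  # several statements on one line: already covered
        text = "".join(lines[cursor:stop])
        body = getattr(node, "body", None)
        if count_tokens(text) > budget and isinstance(body, list) and body:
            # Node is too large: descend into its child statements instead.
            spans.extend(leaf_spans(body, cursor, stop, lines, budget))
        else:
            spans.append((cursor, stop))
        cursor = stop
    return spans


def chunk_code(source: str, budget: int) -> list[str]:
    """Greedily merge adjacent spans into chunks that stay within the budget."""
    lines = source.splitlines(keepends=True)
    tree = ast.parse(source)
    chunks, current = [], ""
    for s, e in leaf_spans(tree.body, 0, len(lines), lines, budget):
        piece = "".join(lines[s:e])
        if current and count_tokens(current + piece) > budget:
            chunks.append(current)
            current = ""
        current += piece
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    demo = "import os\n\ndef f():\n    return os.getcwd()\n"
    for n, chunk in enumerate(chunk_code(demo, budget=4)):
        print(f"--- chunk {n} ---")
        print(chunk, end="")
```

Because every line belongs to exactly one span, joining the chunks reproduces the input byte for byte. A single statement with no nested body that exceeds the budget is emitted as one oversized chunk in this sketch.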

What it’s useful for:

  - Embedding-based code search

  - RAG (retrieval-augmented generation) over codebases

  - Long-context analysis of code

  - Preparing repos for fine-tuning or pretraining

Try it out:

  - Open source package: https://docs.chonkie.ai/chunkers/code-chunker

  - Hosted playground (free with account): https://cloud.chonkie.ai

Happy Chonking!
