Show HN: Fast and Quality Code Chunking with Chonkie

1•snyy•10mo ago
Hi HN,

We’re Chonkie (https://github.com/chonkie-inc/chonkie) — we build open source tools that help split documents into meaningful chunks for use with AI models.

When you use LLMs over large documents or codebases, you often need to break them into smaller parts to fit the model’s context window. Our chunkers split along the document's own structure and meaning, so each chunk is coherent and retrieval can pass only the most relevant pieces to the model. This reduces hallucinations, avoids confusion, and improves performance and accuracy.

Today we’re launching our Code Chunker — a fast, structure-aware way to break down source code into high-quality, token-aware chunks.

How it works:

(See the code: https://github.com/chonkie-inc/chonkie/blob/main/src/chonkie...)

Code Chunker uses tree-sitter (https://tree-sitter.github.io/tree-sitter/) to parse your code into an abstract syntax tree (AST). It then recursively merges and groups nodes in a way that respects both code structure and token limits.

It supports all languages that tree-sitter supports, and is designed to preserve formatting and semantics. Large functions or class definitions won’t be split in the middle of a block — instead, we dive recursively into the AST to produce clean, coherent chunks that fit your configured token budget.
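To make the recursive idea concrete, here is a minimal sketch of the approach described above. It is not Chonkie's implementation: it swaps in Python's built-in `ast` module for tree-sitter (so it only handles Python source), counts whitespace-separated words instead of real tokens, and the names `chunk_code`, `_chunk_nodes`, and `count_tokens` are hypothetical. Unlike the real chunker, it drops the header line of any node it has to split, as well as comments and blank lines between nodes.

```python
import ast


def count_tokens(text):
    # Assumption: whitespace-separated words stand in for a real tokenizer.
    return len(text.split())


def chunk_code(source, limit):
    """Split Python source into chunks of at most `limit` tokens where possible."""
    lines = source.splitlines()
    return _chunk_nodes(ast.parse(source).body, lines, limit)


def _chunk_nodes(nodes, lines, limit):
    chunks = []
    current = []  # source text of the sibling nodes grouped so far
    used = 0      # tokens already in `current`

    def flush():
        nonlocal current, used
        if current:
            chunks.append("\n".join(current))
            current, used = [], 0

    for node in nodes:
        # Take the node's full source lines so indentation is preserved.
        text = "\n".join(lines[node.lineno - 1:node.end_lineno])
        size = count_tokens(text)
        body = getattr(node, "body", None)
        if size > limit and isinstance(body, list):
            # Too big for one chunk: descend into its children instead of
            # cutting it at an arbitrary line.
            flush()
            chunks.extend(_chunk_nodes(body, lines, limit))
        elif used + size > limit:
            # Does not fit next to its siblings: start a new chunk.
            # An oversized node with no body ends up alone in its own chunk.
            flush()
            current.append(text)
            used = size
        else:
            current.append(text)
            used += size
    flush()
    return chunks
```

Calling `chunk_code(source, limit=8)` on a file with an import, a small function, and a large class would group the import and the function into one chunk, then emit the class's methods as separate chunks, because the class as a whole exceeds the budget.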

What it’s useful for:

  - Embedding-based code search

  - RAG (retrieval-augmented generation) over codebases

  - Long-context analysis of code

  - Preparing repos for fine-tuning or pretraining

Try it out:

  - Open source package: https://docs.chonkie.ai/chunkers/code-chunker

  - Hosted playground (free with account): https://cloud.chonkie.ai

Happy Chonking!
