Show HN: Fast and Quality Code Chunking with Chonkie

1•snyy•10mo ago
Hi HN,

We’re Chonkie (https://github.com/chonkie-inc/chonkie). We build open-source tools that split documents into meaningful chunks for use with AI models.

When you use LLMs over large documents or codebases, you often need to break them into smaller parts that fit the model’s context window. Our chunkers preserve structure and meaning while doing this, so only the most relevant pieces need to be passed to the model. That helps reduce hallucinations and confusion, and improves performance and accuracy.

Today we’re launching our Code Chunker, a fast, structure-aware way to break source code into high-quality, token-aware chunks.

How it works:

(See the code: https://github.com/chonkie-inc/chonkie/blob/main/src/chonkie...)

Code Chunker uses tree-sitter (https://tree-sitter.github.io/tree-sitter/) to parse your code into an abstract syntax tree (AST). It then recursively merges and groups nodes in a way that respects both code structure and token limits.

It supports all languages that tree-sitter supports, and is designed to preserve formatting and semantics. Large functions or class definitions won’t be split in the middle of a block. Instead, we descend recursively into the AST to produce clean, coherent chunks that fit your configured token budget.
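
To make the recursive merge-and-group idea concrete, here is a minimal sketch. It is not Chonkie's implementation: it assumes Python's standard-library `ast` module in place of tree-sitter (so it only handles Python source), counts whitespace-separated words in place of real tokenizer tokens, and the names `count_tokens`, `first_line`, `chunk_range`, and `chunk_code` are hypothetical.

```python
import ast


def count_tokens(text):
    # Assumption: whitespace-separated words stand in for real tokenizer tokens.
    return len(text.split())


def first_line(node):
    # Decorators sit above the def/class line, so a node's span starts there.
    decorators = getattr(node, "decorator_list", [])
    return min([node.lineno] + [d.lineno for d in decorators])


def chunk_range(nodes, start, end, lines, budget):
    """Chunk lines[start:end] along the boundaries of sibling `nodes`."""
    # Each node owns the lines from its first line up to the next sibling,
    # so blank lines and comments between siblings are never dropped.
    starts = [start] + [first_line(n) - 1 for n in nodes[1:]]
    stops = starts[1:] + [end]
    chunks, current = [], []
    for node, lo, hi in zip(nodes, starts, stops):
        piece = lines[lo:hi]
        if count_tokens("\n".join(current + piece)) <= budget:
            current += piece  # merge with the group being built
            continue
        if current:
            chunks.append("\n".join(current))
            current = []
        body = getattr(node, "body", None)
        if count_tokens("\n".join(piece)) > budget and isinstance(body, list) and body:
            # Too large on its own: descend into the node's children.
            chunks += chunk_range(body, lo, hi, lines, budget)
        else:
            # Fits alone, or is a leaf that cannot be split further
            # (an oversized leaf is emitted as one over-budget chunk).
            current = piece
    if current:
        chunks.append("\n".join(current))
    return chunks


def chunk_code(source, budget):
    lines = source.splitlines()
    tree = ast.parse(source)
    if not tree.body:
        return [source] if source.strip() else []
    return chunk_range(tree.body, 0, len(lines), lines, budget)


SAMPLE = """import os

def small():
    return 1

def big():
    a = 1
    b = 2
    c = 3
    d = 4
    return a + b + c + d
"""

for i, chunk in enumerate(chunk_code(SAMPLE, budget=12)):
    print(f"--- chunk {i} ({count_tokens(chunk)} tokens) ---")
    print(chunk)
```

In this sketch, small sibling nodes are merged greedily until the budget would be exceeded, and only a node that is too large on its own is opened up and chunked by its children. Joining the chunks with newlines gives back the original lines, which is one simple way to check that nothing was lost.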

What it’s useful for:

  - Embedding-based code search

  - RAG (retrieval-augmented generation) over codebases

  - Long-context analysis of code

  - Preparing repos for fine-tuning or pretraining

Try it out:

  - Open source package: https://docs.chonkie.ai/chunkers/code-chunker

  - Hosted playground (free with account): https://cloud.chonkie.ai

Happy Chonking!
