Show HN: Fast and Quality Code Chunking with Chonkie

1•snyy•9mo ago

Hi HN,

We’re Chonkie (https://github.com/chonkie-inc/chonkie) — we build open source tools that help split documents into meaningful chunks for use with AI models.

When you use LLMs over large documents or codebases, you often need to break them into smaller parts to fit the model’s context window. Our chunkers do this by preserving structure and meaning, so only the most relevant pieces are passed into the model. This reduces hallucinations, avoids confusion, and improves performance and accuracy.

Today we’re launching our Code Chunker — a fast, structure-aware way to break down source code into high-quality, token-aware chunks.

How it works:

(See the code: https://github.com/chonkie-inc/chonkie/blob/main/src/chonkie...)

Code Chunker uses tree-sitter (https://tree-sitter.github.io/tree-sitter/) to parse your code into an abstract syntax tree (AST). It then recursively merges and groups nodes in a way that respects both code structure and token limits.

It supports all languages that tree-sitter supports, and is designed to preserve formatting and semantics. Large functions or class definitions won’t be split in the middle of a block — instead, we dive recursively into the AST to produce clean, coherent chunks that fit your configured token budget.
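To make the recursive idea concrete, here is a minimal sketch of split-then-merge chunking. It is not Chonkie's implementation: it assumes Python's standard-library `ast` module as a stand-in for tree-sitter (so it only handles Python source), and it assumes a whitespace word count as a stand-in tokenizer. The names `count_tokens`, `collect_boundaries`, and `chunk_code` are hypothetical and exist only for this illustration.

```python
import ast


def count_tokens(text):
    """Stand-in token counter (assumption): whitespace-separated words."""
    return len(text.split())


def first_line(node):
    """First source line of a statement, including any decorators."""
    decorators = getattr(node, "decorator_list", [])
    return min([node.lineno] + [d.lineno for d in decorators])


def collect_boundaries(statements, lines, budget, boundaries):
    """Record where each statement starts; descend into oversized ones."""
    for stmt in statements:
        start = first_line(stmt)
        boundaries.add(start)
        text = "".join(lines[start - 1:stmt.end_lineno])
        body = getattr(stmt, "body", None)
        # Only look inside a statement when it does not fit the budget.
        if count_tokens(text) > budget and isinstance(body, list):
            collect_boundaries(body, lines, budget, boundaries)


def chunk_code(source, budget):
    """Split source at statement boundaries, then merge greedily."""
    lines = source.splitlines(keepends=True)
    boundaries = {1}
    collect_boundaries(ast.parse(source).body, lines, budget, boundaries)

    # Turn boundary lines into contiguous segments covering every line,
    # so blank lines and comments stay attached to the preceding segment.
    starts = sorted(boundaries)
    ends = starts[1:] + [len(lines) + 1]
    segments = ["".join(lines[s - 1:e - 1]) for s, e in zip(starts, ends)]

    # Merge neighbouring segments while the running chunk fits the budget.
    # A single segment larger than the budget is emitted on its own.
    chunks, current = [], ""
    for segment in segments:
        if current and count_tokens(current + segment) > budget:
            chunks.append(current)
            current = ""
        current += segment
    if current:
        chunks.append(current)
    return chunks


SAMPLE = '''import os

def small():
    return 1

class Big:
    def a(self):
        x = 1
        return x

    def b(self):
        y = 2
        return y
'''

if __name__ == "__main__":
    for i, chunk in enumerate(chunk_code(SAMPLE, budget=10)):
        print(f"--- chunk {i} ({count_tokens(chunk)} tokens) ---")
        print(chunk, end="")
```

Because the segments cover every line of the input, joining the chunks reproduces the original source exactly, which is one way to read "preserve formatting". This sketch only descends into a statement's `body`, so `else` and `except` branches stay attached to the segment before them; a real tree-sitter based chunker would walk every child node and work on byte ranges instead of lines.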

What it’s useful for:

  - Embedding-based code search

  - RAG (retrieval-augmented generation) over codebases

  - Long-context analysis of code

  - Preparing repos for fine-tuning or pretraining

Try it out:

  - Open source package: https://docs.chonkie.ai/chunkers/code-chunker

  - Hosted playground (free with account): https://cloud.chonkie.ai

Happy Chonking!
