Show HN: Fast and Quality Code Chunking with Chonkie

1•snyy•7mo ago

Hi HN,

We’re Chonkie (https://github.com/chonkie-inc/chonkie) — we build open source tools that help split documents into meaningful chunks for use with AI models.

When you use LLMs over large documents or codebases, you often need to break them into smaller parts to fit the model’s context window. Our chunkers split along the document’s own structure, so each chunk stays a coherent unit and only the most relevant pieces are passed into the model. That tends to reduce hallucinations and improve accuracy.

Today we’re launching our Code Chunker — a fast, structure-aware way to break down source code into high-quality, token-aware chunks.

How it works:

(See the code: https://github.com/chonkie-inc/chonkie/blob/main/src/chonkie...)

Code Chunker uses tree-sitter (https://tree-sitter.github.io/tree-sitter/) to parse your code into an abstract syntax tree (AST). It then recursively merges and groups nodes in a way that respects both code structure and token limits.

It supports all languages that tree-sitter supports, and is designed to preserve formatting and semantics. Large functions or class definitions are not cut at arbitrary points. Instead, we descend recursively into the AST and split only at node boundaries, producing clean, coherent chunks that fit your configured token budget.
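To make the recursive merge-and-group idea concrete, here is a minimal sketch. It is not Chonkie's implementation: it assumes Python's standard-library `ast` module as a stand-in for tree-sitter (so it only handles Python source), and a whitespace word count as a stand-in for a real tokenizer. The names `count_tokens`, `first_line`, `leaf_spans`, and `chunk_code` are hypothetical.

```python
import ast


def count_tokens(text):
    # Assumption: whitespace-separated words stand in for a real tokenizer.
    return len(text.split())


def first_line(node):
    # A decorated def/class starts at its first decorator, not at `def`.
    decorators = getattr(node, "decorator_list", [])
    return min([node.lineno] + [d.lineno for d in decorators])


def leaf_spans(nodes, start, end, lines, budget):
    """Yield (first, last) 1-based inclusive line spans covering start..end.

    Each sibling's span runs up to the line before the next sibling, so
    comments and blank lines between nodes are kept, not dropped.
    """
    starts = [start] + [first_line(n) for n in nodes[1:]]
    ends = [s - 1 for s in starts[1:]] + [end]
    for node, lo, hi in zip(nodes, starts, ends):
        if hi < lo:
            continue  # several statements on one line share a span
        text = "\n".join(lines[lo - 1:hi])
        body = getattr(node, "body", None)
        if count_tokens(text) > budget and isinstance(body, list) and body:
            # Too large: emit the header (e.g. the `class` line) on its own
            # and recurse into the statements of the block.
            first = first_line(body[0])
            if first > lo:
                yield (lo, first - 1)
            yield from leaf_spans(body, first, hi, lines, budget)
        else:
            # Fits the budget, or cannot be split further (emitted whole).
            yield (lo, hi)


def chunk_code(source, budget):
    """Greedily merge adjacent spans into chunks of at most `budget` tokens."""
    lines = source.splitlines()
    tree = ast.parse(source)
    chunks, cur = [], None
    for lo, hi in leaf_spans(tree.body, 1, len(lines), lines, budget):
        if cur is not None:
            merged = "\n".join(lines[cur[0] - 1:hi])
            if count_tokens(merged) <= budget:
                cur = (cur[0], hi)  # still fits: extend the current chunk
                continue
            chunks.append("\n".join(lines[cur[0] - 1:cur[1]]))
        cur = (lo, hi)
    if cur is not None:
        chunks.append("\n".join(lines[cur[0] - 1:cur[1]]))
    return chunks


SAMPLE = '''import os

def small(a, b):
    return a + b

class Big:
    def first(self):
        x = 1
        y = 2
        return x + y

    def second(self):
        return "two words here"
'''

if __name__ == "__main__":
    for i, chunk in enumerate(chunk_code(SAMPLE, budget=12)):
        print(f"--- chunk {i} ({count_tokens(chunk)} tokens) ---")
        print(chunk)
```

Because chunks are contiguous line ranges, joining them with newlines reproduces the original text, which is one simple way to preserve formatting. The sketch has known limits: a statement with no nested block that exceeds the budget is emitted as one oversized chunk, and `else`/`except` branches stay attached to the last statement of the preceding block.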

What it’s useful for:

  - Embedding-based code search

  - RAG (retrieval-augmented generation) over codebases

  - Long-context analysis of code

  - Preparing repos for fine-tuning or pretraining

Try it out:

  - Open source package: https://docs.chonkie.ai/chunkers/code-chunker

  - Hosted playground (free with account): https://cloud.chonkie.ai

Happy Chonking!
