Show HN: Fast and Quality Code Chunking with Chonkie

1•snyy•9mo ago
Hi HN,

We’re Chonkie (https://github.com/chonkie-inc/chonkie) — we build open source tools that help split documents into meaningful chunks for use with AI models.

When you use LLMs over large documents or codebases, you often need to break them into smaller parts to fit the model’s context window. Our chunkers do this in a smart way: they preserve structure and meaning, so each chunk stands on its own and a retrieval step can pass only the most relevant pieces into the model. This reduces hallucinations, avoids confusion, and improves performance and accuracy.

Today we’re launching our Code Chunker — a fast, structure-aware way to break down source code into high-quality, token-aware chunks.

How it works:

(See the code: https://github.com/chonkie-inc/chonkie/blob/main/src/chonkie...)

Code Chunker uses tree-sitter (https://tree-sitter.github.io/tree-sitter/) to parse your code into an abstract syntax tree (AST). It then recursively merges and groups nodes in a way that respects both code structure and token limits.

It supports all languages that tree-sitter supports, and is designed to preserve formatting and semantics. Large functions or class definitions won’t be cut at an arbitrary point in the middle of a block. Instead, when a node is too big for the budget, we dive recursively into the AST and split at node boundaries to produce clean, coherent chunks that fit your configured token budget.
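To make the recursive merge-and-group idea concrete, here is a minimal sketch in Python. It is not Chonkie's actual implementation: `Node` is a hypothetical stand-in for a parser node (a character span plus children), `count_tokens` is an assumed whitespace word count rather than a real tokenizer, and `chunk_spans` is a name made up for this illustration. Adjacent sibling nodes are merged while they fit the budget, and only a node that is too large gets opened up and split along its children.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a parser node: a character span plus children."""
    start: int
    end: int
    children: list["Node"] = field(default_factory=list)


def count_tokens(text: str) -> int:
    # Assumption: whitespace-separated words approximate tokens.
    return len(text.split())


def chunk_spans(node: Node, source: str, budget: int) -> list[tuple[int, int]]:
    """Split node's span into consecutive (start, end) spans, each within budget where possible."""
    if not node.children or count_tokens(source[node.start:node.end]) <= budget:
        # Fits, or is a leaf that cannot be divided further: emit as one chunk.
        return [(node.start, node.end)]

    chunks = []
    group = None  # current run of adjacent siblings that fit together
    cursor = node.start
    for i, child in enumerate(node.children):
        # Stretch each child to the next sibling's start so the text between
        # nodes (whitespace, comments, headers) is kept in the output.
        is_last = i + 1 == len(node.children)
        nxt = node.end if is_last else node.children[i + 1].start
        if count_tokens(source[cursor:nxt]) <= budget:
            if group is not None and count_tokens(source[group[0]:nxt]) <= budget:
                group = (group[0], nxt)  # merge with the previous siblings
            else:
                if group is not None:
                    chunks.append(group)
                group = (cursor, nxt)
        else:
            # Too large on its own: flush the current group and recurse into it.
            if group is not None:
                chunks.append(group)
                group = None
            chunks.extend(chunk_spans(Node(cursor, nxt, child.children), source, budget))
        cursor = nxt
    if group is not None:
        chunks.append(group)
    return chunks


# Three one-line statements with a budget of 6 "tokens".
source = "a = 1\nb = 2\nc = 3\n"
root = Node(0, len(source), [Node(0, 5), Node(6, 11), Node(12, 17)])
for start, end in chunk_spans(root, source, budget=6):
    print(repr(source[start:end]))
# prints 'a = 1\nb = 2\n' then 'c = 3\n'
```

Because the spans tile the source, joining the chunks gives back the original text byte for byte. In this sketch a leaf node that is larger than the budget is emitted as a single oversized chunk; a real chunker would need its own policy for that case.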

What it’s useful for:

  - Embedding-based code search
  - RAG (retrieval-augmented generation) over codebases
  - Long-context analysis of code
  - Preparing repos for fine-tuning or pretraining

Try it out:

  - Open source package: https://docs.chonkie.ai/chunkers/code-chunker
  - Hosted playground (free with account): https://cloud.chonkie.ai

Happy Chonking!
