Show HN: Fast and Quality Code Chunking with Chonkie

1•snyy•9mo ago

Hi HN,

We’re Chonkie (https://github.com/chonkie-inc/chonkie). We build open-source tools that split documents into meaningful chunks for use with AI models.

When you use LLMs over large documents or codebases, you often need to break them into smaller parts to fit the model’s context window. Our chunkers split along structural and semantic boundaries, so each chunk stays coherent and only the most relevant pieces are passed to the model. This helps reduce hallucinations and confusion, and improves performance and accuracy.

Today we’re launching our Code Chunker: a fast, structure-aware way to break source code into high-quality, token-aware chunks.

How it works:

(See the code: https://github.com/chonkie-inc/chonkie/blob/main/src/chonkie...)

Code Chunker uses tree-sitter (https://tree-sitter.github.io/tree-sitter/) to parse your code into an abstract syntax tree (AST). It then recursively merges and groups nodes in a way that respects both code structure and token limits.

It supports all languages that tree-sitter supports, and is designed to preserve formatting and semantics. Large functions or class definitions won’t be split in the middle of a block. Instead, we recurse into the AST to produce clean, coherent chunks that fit your configured token budget.
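
To make the recursive merge-and-group idea concrete, here is a minimal sketch. It is not Chonkie's implementation: it swaps tree-sitter for Python's standard-library `ast` module (so it only handles Python source), counts tokens by whitespace splitting rather than with a real tokenizer, and the names `count_tokens`, `split_units`, and `chunk_code` are made up for illustration.

```python
import ast


def count_tokens(text):
    # Stand-in tokenizer (assumption): whitespace-separated words.
    # A real setup would use the target model's tokenizer instead.
    return len(text.split())


def split_units(node, lines, start, end, limit):
    """Split lines start..end (1-based, inclusive) into spans along statement
    boundaries. A span is only descended into when it exceeds the limit."""
    text = "\n".join(lines[start - 1:end])
    children = sorted(
        (c for c in ast.iter_child_nodes(node) if isinstance(c, ast.stmt)),
        key=lambda c: c.lineno,
    )
    if count_tokens(text) <= limit or not children:
        # Fits the budget, or is a leaf statement that cannot be split
        # further (such a leaf may still exceed the limit).
        return [(start, end)]
    units, cursor = [], start
    for i, child in enumerate(children):
        # Each child absorbs the lines before it (headers, comments, blank
        # lines); the last child absorbs the tail, so every line is covered.
        child_end = end if i == len(children) - 1 else child.end_lineno
        if child_end < cursor:
            continue  # shares a line with a statement already emitted
        units.extend(split_units(child, lines, cursor, child_end, limit))
        cursor = child_end + 1
    return units


def chunk_code(source, limit):
    """Greedily merge adjacent spans into chunks that stay within the limit."""
    lines = source.splitlines()
    units = split_units(ast.parse(source), lines, 1, len(lines), limit)
    chunks, current = [], []
    for start, end in units:
        piece = lines[start - 1:end]
        if current and count_tokens("\n".join(current + piece)) > limit:
            chunks.append("\n".join(current))
            current = []
        current = current + piece
    if current:
        chunks.append("\n".join(current))
    return chunks
```

Calling `chunk_code(source, limit)` returns a list of strings. Because every line is assigned to exactly one span, joining the chunks with newlines gives back the original lines, which is one simple way to keep formatting intact.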

What it’s useful for:

  - Embedding-based code search

  - RAG (retrieval-augmented generation) over codebases

  - Long-context analysis of code

  - Preparing repos for fine-tuning or pretraining

Try it out:

  - Open source package: https://docs.chonkie.ai/chunkers/code-chunker

  - Hosted playground (free with account): https://cloud.chonkie.ai

Happy Chonking!
