Show HN: Fast and Quality Code Chunking with Chonkie

1•snyy•9mo ago
Hi HN,

We’re Chonkie (https://github.com/chonkie-inc/chonkie). We build open source tools that split documents into meaningful chunks for use with AI models.

When you use LLMs over large documents or codebases, you often need to break them into smaller parts to fit the model’s context window. Our chunkers split along structural boundaries so that each chunk stays meaningful on its own, which lets retrieval pass only the most relevant pieces to the model. In practice this helps reduce hallucinations and improves accuracy.

Today we’re launching our Code Chunker: a fast, structure-aware way to break source code into high-quality chunks that respect a token budget.

How it works:

(See the code: https://github.com/chonkie-inc/chonkie/blob/main/src/chonkie...)

Code Chunker uses tree-sitter (https://tree-sitter.github.io/tree-sitter/) to parse your code into an abstract syntax tree (AST). It then recursively merges and groups nodes in a way that respects both code structure and token limits.

It supports every language that has a tree-sitter grammar, and is designed to preserve formatting and semantics. Large functions or class definitions won’t be split in the middle of a block. Instead, when a node is too large, we recurse into its children in the AST to produce clean, coherent chunks that fit your configured token budget.
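To make the idea concrete, here is a minimal sketch of recursive, structure-aware chunking. It is not Chonkie's implementation: it swaps tree-sitter for Python's standard-library `ast` module (so it only handles Python source), approximates token counts with whitespace-separated words, and assumes one statement per line. The names `chunk_code` and `count_tokens` are hypothetical and exist only for this illustration.

```python
import ast
from typing import List, Tuple


def count_tokens(text: str) -> int:
    # Assumption: a crude stand-in for a real tokenizer (whitespace-separated words).
    return len(text.split())


def chunk_code(source: str, max_tokens: int) -> List[str]:
    """Split Python source into chunks that follow statement boundaries."""
    lines = source.splitlines(keepends=True)
    tree = ast.parse(source)

    def text(start: int, end: int) -> str:
        # Source text for a 1-based, inclusive line span.
        return "".join(lines[start - 1:end])

    def split(children, start: int, end: int) -> List[Tuple[int, int]]:
        # Give each sibling a span that also covers the lines just before it
        # (blank lines, comments, decorators, the parent's header), so that
        # joining all chunks reproduces the original text exactly.
        spans = []
        cursor = start
        for i, child in enumerate(children):
            last = end if i == len(children) - 1 else child.end_lineno
            spans.append((child, cursor, last))
            cursor = last + 1

        out: List[Tuple[int, int]] = []
        current = None  # line span of the group currently being built
        for child, s, e in spans:
            if count_tokens(text(s, e)) > max_tokens:
                # Too large on its own: flush the current group, then descend.
                if current:
                    out.append(current)
                    current = None
                body = getattr(child, "body", None)
                if isinstance(body, list) and body:
                    out.extend(split(body, s, e))
                else:
                    out.append((s, e))  # indivisible node, emitted oversized
            elif current and count_tokens(text(current[0], e)) <= max_tokens:
                current = (current[0], e)  # merge with the previous siblings
            else:
                if current:
                    out.append(current)
                current = (s, e)
        if current:
            out.append(current)
        return out

    if not tree.body:
        return [source] if source else []
    return [text(s, e) for s, e in split(tree.body, 1, len(lines))]


SAMPLE = '''import os

def small(a, b):
    return a + b

class Big:
    def first(self):
        x = 1
        y = 2
        return x + y

    def second(self):
        return "hello world"
'''

chunks = chunk_code(SAMPLE, max_tokens=16)
for i, chunk in enumerate(chunks):
    print(f"--- chunk {i} ({count_tokens(chunk)} tokens) ---")
    print(chunk, end="")
```

In this sketch, small sibling statements are merged greedily until the budget would be exceeded, and only a node that is too large by itself is opened up and split along its children. A node with no statement body is emitted as a single oversized chunk rather than cut mid-statement.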

What it’s useful for:

  - Embedding-based code search

  - RAG (retrieval-augmented generation) over codebases

  - Long-context analysis of code

  - Preparing repos for fine-tuning or pretraining

Try it out:

  - Open source package: https://docs.chonkie.ai/chunkers/code-chunker

  - Hosted playground (free with account): https://cloud.chonkie.ai

Happy Chonking!
