Show HN: Fast and Quality Code Chunking with Chonkie

1•snyy•1y ago
Hi HN,

We’re Chonkie (https://github.com/chonkie-inc/chonkie) — we build open source tools that help split documents into meaningful chunks for use with AI models.

When you use LLMs over large documents or codebases, you often need to break them into smaller parts to fit the model’s context window. Our chunkers split along structural and semantic boundaries, so each chunk stays coherent and only the most relevant pieces need to be passed to the model. The goal is fewer hallucinations and better accuracy on the task.

Today we’re launching our Code Chunker — a fast, structure-aware way to break down source code into high-quality, token-aware chunks.

How it works:

(See the code: https://github.com/chonkie-inc/chonkie/blob/main/src/chonkie...)

Code Chunker uses tree-sitter (https://tree-sitter.github.io/tree-sitter/) to parse your code into an abstract syntax tree (AST). It then recursively merges and groups nodes in a way that respects both code structure and token limits.

It supports all languages that tree-sitter supports, and is designed to preserve formatting and semantics. Large functions or class definitions won’t be split in the middle of a block — instead, we dive recursively into the AST to produce clean, coherent chunks that fit your configured token budget.
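To make the split-then-merge idea concrete, here is a minimal sketch. It is not Chonkie's implementation: it uses Python's built-in `ast` module as a stand-in for tree-sitter (so it only handles Python source), and `count_tokens` is a crude whitespace counter standing in for a real tokenizer. The names `count_tokens`, `_leaf_spans`, and `chunk_code` are made up for this example.

```python
import ast


def count_tokens(text):
    # Assumption: whitespace-separated words approximate tokens.
    return len(text.split())


def _leaf_spans(nodes, start, end, lines, max_tokens):
    """Return contiguous (start, stop) line spans covering lines[start:end].

    Each span runs from the end of the previous node to the end of the
    current one, so comments and blank lines between nodes are kept.
    """
    spans = []
    cursor = start
    for i, node in enumerate(nodes):
        # The last node absorbs any trailing lines up to `end`.
        stop = end if i == len(nodes) - 1 else min(node.end_lineno, end)
        if stop <= cursor:
            continue  # node shares a line with the previous one
        text = "".join(lines[cursor:stop])
        children = [
            child for child in ast.iter_child_nodes(node)
            if isinstance(child, (ast.stmt, ast.excepthandler))
        ]
        if count_tokens(text) > max_tokens and children:
            # Too big for the budget: descend into the node's statements.
            spans.extend(_leaf_spans(children, cursor, stop, lines, max_tokens))
        else:
            # Fits, or cannot be split further (may exceed the budget).
            spans.append((cursor, stop))
        cursor = stop
    return spans


def chunk_code(source, max_tokens):
    lines = source.splitlines(keepends=True)
    spans = _leaf_spans(ast.parse(source).body, 0, len(lines), lines, max_tokens)
    if not spans and lines:
        spans = [(0, len(lines))]  # e.g. a file with only comments
    chunks = []
    for start, stop in spans:
        piece = "".join(lines[start:stop])
        # Greedily merge neighbours while the budget allows.
        if chunks and count_tokens(chunks[-1] + piece) <= max_tokens:
            chunks[-1] += piece
        else:
            chunks.append(piece)
    return chunks


SAMPLE = '''import os

def small():
    return 1

def big(items):
    total = 0
    for item in items:
        total += item
    if total > 10:
        print("large total", total)
    return total
'''

chunks = chunk_code(SAMPLE, max_tokens=10)
for number, chunk in enumerate(chunks, start=1):
    print(f"--- chunk {number} ({count_tokens(chunk)} tokens) ---")
    print(chunk, end="")
```

Because the sketch slices by line ranges instead of re-printing nodes, joining the chunks gives back the original source byte for byte. A statement that is larger than the budget and has no nested statements is emitted as one oversized chunk.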

What it’s useful for:

  - Embedding-based code search

  - RAG (retrieval-augmented generation) over codebases

  - Long-context analysis of code

  - Preparing repos for fine-tuning or pretraining

Try it out:

  - Open source package: https://docs.chonkie.ai/chunkers/code-chunker

  - Hosted playground (free with account): https://cloud.chonkie.ai

Happy Chonking!
