Show HN: Fast and Quality Code Chunking with Chonkie

1•snyy•1y ago
Hi HN,

We’re Chonkie (https://github.com/chonkie-inc/chonkie). We build open source tools that split documents into meaningful chunks for use with AI models.

When you use LLMs over large documents or codebases, you often need to break them into smaller parts that fit the model’s context window. Our chunkers split along the document’s structure rather than at arbitrary offsets, so each chunk keeps its meaning and only the most relevant pieces need to be passed to the model. This reduces hallucinations, avoids confusion, and improves performance and accuracy.

Today we’re launching our Code Chunker: a fast, structure-aware way to break source code into high-quality, token-aware chunks.

How it works:

(See the code: https://github.com/chonkie-inc/chonkie/blob/main/src/chonkie...)

Code Chunker uses tree-sitter (https://tree-sitter.github.io/tree-sitter/) to parse your code into an abstract syntax tree (AST). It then recursively merges and groups nodes in a way that respects both code structure and token limits.

It supports all languages that tree-sitter supports, and is designed to preserve formatting and semantics. Large functions or class definitions won’t be split in the middle of a block. Instead, we descend recursively into the AST and split at node boundaries to produce clean, coherent chunks that fit your configured token budget.
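
To make the recursive merge-and-group idea concrete, here is a minimal sketch. It is not Chonkie's implementation. It assumes Python's built-in ast module as a stand-in for tree-sitter, counts whitespace-separated words as a stand-in for a real tokenizer, and every name in it (count_tokens, child_statements, chunk_range, chunk_code) is made up for illustration. Sibling statements are merged greedily until the budget would be exceeded, and a statement that is too large on its own is split by recursing into its children.

```python
import ast
from typing import List, Tuple


def count_tokens(text: str) -> int:
    """Stand-in token counter (assumption): whitespace-separated words."""
    return len(text.split())


def child_statements(node: ast.AST) -> List[ast.stmt]:
    """Direct statement children of a node, in source order."""
    stmts = [c for c in ast.iter_child_nodes(node) if isinstance(c, ast.stmt)]
    return sorted(stmts, key=lambda s: (s.lineno, s.col_offset))


def chunk_range(lines: List[str], node: ast.AST, start: int, end: int,
                budget: int) -> List[Tuple[int, int]]:
    """Split lines[start:end], which belong to `node`, into line ranges.

    Each child owns the lines from the previous child's end up to its own
    end, so blank lines, comments, and the parent's header stay attached
    to a chunk and no text is dropped.
    """
    chunks: List[Tuple[int, int]] = []
    group_start, group_tokens = None, 0  # currently open group of siblings
    prev_end = start
    children = child_statements(node)
    for i, child in enumerate(children):
        span_end = end if i == len(children) - 1 else child.end_lineno
        if span_end <= prev_end:
            continue  # statement shares its lines with the previous one
        span_tokens = count_tokens("".join(lines[prev_end:span_end]))
        if span_tokens > budget:
            # Too big on its own: close the open group, then recurse.
            if group_start is not None:
                chunks.append((group_start, prev_end))
                group_start, group_tokens = None, 0
            if child_statements(child):
                chunks.extend(
                    chunk_range(lines, child, prev_end, span_end, budget))
            else:
                chunks.append((prev_end, span_end))  # indivisible, over budget
        elif group_start is not None and group_tokens + span_tokens <= budget:
            group_tokens += span_tokens  # merge into the open group
        else:
            if group_start is not None:
                chunks.append((group_start, prev_end))
            group_start, group_tokens = prev_end, span_tokens
        prev_end = span_end
    if group_start is not None:
        chunks.append((group_start, prev_end))
    if prev_end < end:
        chunks.append((prev_end, end))  # no statements at all in this range
    return chunks


def chunk_code(source: str, budget: int) -> List[str]:
    """Return chunks whose concatenation reproduces `source` exactly."""
    lines = source.splitlines(keepends=True)
    ranges = chunk_range(lines, ast.parse(source), 0, len(lines), budget)
    return ["".join(lines[s:e]) for s, e in ranges]


SAMPLE = '''import os

def small():
    return 1

class Big:
    def a(self):
        x = 1
        return x

    def b(self):
        y = 2
        return y
'''

chunks = chunk_code(SAMPLE, budget=10)
for chunk in chunks:
    print(count_tokens(chunk), repr(chunk))
```

With a budget of 10 words, the import and the small function are merged into one chunk, while the class is too large and is split between its two methods. Joining the chunks gives back the original source unchanged, which is the formatting-preserving property described above.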

What it’s useful for:

  - Embedding-based code search

  - RAG (retrieval-augmented generation) over codebases

  - Long-context analysis of code

  - Preparing repos for fine-tuning or pretraining

Try it out:

  - Open source package: https://docs.chonkie.ai/chunkers/code-chunker

  - Hosted playground (free with account): https://cloud.chonkie.ai

Happy Chonking!
