Show HN: Fast and Quality Code Chunking with Chonkie

1•snyy•10mo ago

Hi HN,

We’re Chonkie (https://github.com/chonkie-inc/chonkie) — we build open source tools that help split documents into meaningful chunks for use with AI models.

When you use LLMs over large documents or codebases, you often need to break them into smaller parts that fit the model’s context window. Our chunkers split along the document’s structure, so each chunk stays meaningful on its own and only the most relevant pieces are passed into the model. This helps reduce hallucinations and improves accuracy.

Today we’re launching our Code Chunker — a fast, structure-aware way to break down source code into high-quality, token-aware chunks.

How it works:

(See the code: https://github.com/chonkie-inc/chonkie/blob/main/src/chonkie...)

Code Chunker uses tree-sitter (https://tree-sitter.github.io/tree-sitter/) to parse your code into an abstract syntax tree (AST). It then recursively merges and groups nodes in a way that respects both code structure and token limits.

It supports all languages that tree-sitter supports, and is designed to preserve formatting and semantics. A large function or class definition won’t be cut at an arbitrary point in the middle of a block. Instead, we recurse into its AST children and split at node boundaries, producing clean, coherent chunks that fit your configured token budget.
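To make the recursive split-and-merge idea concrete, here is a minimal sketch. It is not Chonkie's implementation. It assumes Python's built-in ast module as a stand-in for tree-sitter (so it only handles Python source), counts whitespace-separated words instead of real model tokens, and uses hypothetical names (count_tokens, split_span, merge_spans, chunk_code) chosen for illustration.

```python
import ast


def count_tokens(text):
    # Assumption: whitespace-separated words stand in for a real tokenizer.
    return len(text.split())


def first_line(node):
    # 0-based index of a statement's first source line, decorators included.
    decorators = getattr(node, "decorator_list", [])
    return min([node.lineno] + [d.lineno for d in decorators]) - 1


def split_span(lines, node, start, end, budget):
    """Return contiguous (start, end) line spans that cover [start, end)."""
    body = getattr(node, "body", None)
    fits = count_tokens("".join(lines[start:end])) <= budget
    if fits or not isinstance(body, list) or not body:
        # Either the span fits, or the node has no nested statements to
        # recurse into; in the second case the chunk may exceed the budget.
        return [(start, end)]
    # Cut at the first line of each nested statement. The first child keeps
    # the parent's header lines (for example "class Big:").
    cuts, kids = [start], [body[0]]
    for child in body[1:]:
        line = first_line(child)
        if line > cuts[-1]:
            cuts.append(line)
            kids.append(child)
    cuts.append(end)
    spans = []
    for kid, lo, hi in zip(kids, cuts, cuts[1:]):
        spans.extend(split_span(lines, kid, lo, hi, budget))
    return spans


def merge_spans(lines, spans, budget):
    # Greedily glue adjacent spans together while they still fit the budget.
    merged = []
    for lo, hi in spans:
        if merged and count_tokens("".join(lines[merged[-1][0]:hi])) <= budget:
            merged[-1] = (merged[-1][0], hi)
        else:
            merged.append((lo, hi))
    return merged


def chunk_code(source, budget):
    lines = source.splitlines(keepends=True)
    tree = ast.parse(source)
    spans = split_span(lines, tree, 0, len(lines), budget)
    return ["".join(lines[lo:hi]) for lo, hi in merge_spans(lines, spans, budget)]


SAMPLE = '''import os

def small(a, b):
    return a + b

class Big:
    def first(self):
        x = 1
        y = 2
        return x + y

    def second(self):
        return "hello world"
'''

if __name__ == "__main__":
    for i, chunk in enumerate(chunk_code(SAMPLE, budget=14)):
        print(f"--- chunk {i} ({count_tokens(chunk)} tokens) ---")
        print(chunk, end="")
```

With a budget of 14 on this sample, the sketch yields three chunks: the import together with small, the class header together with first, and second on its own. Because spans are contiguous line ranges, joining the chunks reproduces the source exactly, which is one simple way to preserve formatting.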

What it’s useful for:

  - Embedding-based code search

  - RAG (retrieval-augmented generation) over codebases

  - Long-context analysis of code

  - Preparing repos for fine-tuning or pretraining

Try it out:

  - Open source package: https://docs.chonkie.ai/chunkers/code-chunker

  - Hosted playground (free with account): https://cloud.chonkie.ai

Happy Chonking!
