Show HN: Fast and Quality Code Chunking with Chonkie

1•snyy•9mo ago
Hi HN,

We’re Chonkie (https://github.com/chonkie-inc/chonkie) — we build open source tools that help split documents into meaningful chunks for use with AI models.

When you run LLMs over large documents or codebases, you often need to break them into smaller parts that fit the model’s context window. Our chunkers split along structural boundaries, so each chunk keeps its meaning and only the most relevant pieces are passed to the model. This helps reduce hallucinations and improves accuracy.

Today we’re launching our Code Chunker — a fast, structure-aware way to break down source code into high-quality, token-aware chunks.

How it works:

(See the code: https://github.com/chonkie-inc/chonkie/blob/main/src/chonkie...)

Code Chunker uses tree-sitter (https://tree-sitter.github.io/tree-sitter/) to parse your code into an abstract syntax tree (AST). It then recursively merges and groups nodes in a way that respects both code structure and token limits.

It supports all languages that tree-sitter supports, and is designed to preserve formatting and semantics. Large functions or class definitions won’t be split in the middle of a block — instead, we dive recursively into the AST to produce clean, coherent chunks that fit your configured token budget.
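To make the recursive idea concrete, here is a rough sketch. It is not Chonkie's implementation. It swaps in Python's built-in `ast` module for tree-sitter so it runs without extra dependencies, and it counts whitespace-separated words instead of real tokens. The names `count_tokens`, `first_line`, `split_region`, and `chunk_code` are made up for illustration.

```python
import ast


def count_tokens(text: str) -> int:
    """Stand-in token counter (whitespace words); a real setup would use a tokenizer."""
    return len(text.split())


def first_line(node: ast.stmt) -> int:
    """0-based index of a statement's first source line, decorators included."""
    decorators = getattr(node, "decorator_list", [])
    return min([node.lineno] + [d.lineno for d in decorators]) - 1


def split_region(stmts, start, end, lines, max_tokens):
    """Split lines[start:end], which holds the sibling statements `stmts`,
    into contiguous (start, end) line spans, descending only into nodes
    that are over budget and have a statement body."""
    bounds = [start] + [first_line(s) for s in stmts[1:]] + [end]
    spans = []
    for node, lo, hi in zip(stmts, bounds, bounds[1:]):
        if lo == hi:
            continue  # shares its line with the next statement
        body = getattr(node, "body", None)
        fits = count_tokens("".join(lines[lo:hi])) <= max_tokens
        if fits or not isinstance(body, list):
            # Small enough, or a leaf that cannot be split further
            # (a leaf may still exceed the budget).
            spans.append((lo, hi))
            continue
        body_start = max(lo, first_line(body[0]))
        if body_start > lo:
            spans.append((lo, body_start))  # header, e.g. "class Foo:"
        spans.extend(split_region(body, body_start, hi, lines, max_tokens))
    return spans


def chunk_code(source: str, max_tokens: int) -> list[str]:
    """Parse, split recursively, then greedily merge neighbouring spans."""
    lines = source.splitlines(keepends=True)
    tree = ast.parse(source)
    if not tree.body:
        return [source] if source else []
    chunks = []
    for lo, hi in split_region(tree.body, 0, len(lines), lines, max_tokens):
        piece = "".join(lines[lo:hi])
        if chunks and count_tokens(chunks[-1] + piece) <= max_tokens:
            chunks[-1] += piece  # still within budget: merge with previous chunk
        else:
            chunks.append(piece)
    return chunks


SAMPLE = '''import os

def small(a, b):
    return a + b

class Big:
    def first(self):
        x = 1
        return x

    def second(self):
        y = 2
        return y
'''

if __name__ == "__main__":
    for i, chunk in enumerate(chunk_code(SAMPLE, max_tokens=12)):
        print(f"--- chunk {i} ({count_tokens(chunk)} tokens) ---")
        print(chunk, end="")
```

Because every chunk is a contiguous run of source lines, joining the chunks gives back the original text unchanged. The sketch only descends into the main `body` of a node, so an `else` or `except` branch stays attached to the last statement before it, and a single statement that is larger than the budget is emitted as one oversized chunk.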

What it’s useful for:

  - Embedding-based code search

  - RAG (retrieval-augmented generation) over codebases

  - Long-context analysis of code

  - Preparing repos for fine-tuning or pretraining

Try it out:

  - Open source package: https://docs.chonkie.ai/chunkers/code-chunker

  - Hosted playground (free with account): https://cloud.chonkie.ai

Happy Chonking!
