Show HN: Fast and Quality Code Chunking with Chonkie

1•snyy•1y ago
Hi HN,

We’re Chonkie (https://github.com/chonkie-inc/chonkie) — we build open source tools that help split documents into meaningful chunks for use with AI models.

When you use LLMs over large documents or codebases, you often need to break them into smaller parts to fit the model’s context window. Our chunkers split along structural and semantic boundaries, so each chunk stays coherent and only the most relevant pieces are passed into the model. This helps reduce hallucinations and improves accuracy.

Today we’re launching our Code Chunker — a fast, structure-aware way to break down source code into high-quality, token-aware chunks.

How it works:

(See the code: https://github.com/chonkie-inc/chonkie/blob/main/src/chonkie...)

Code Chunker uses tree-sitter (https://tree-sitter.github.io/tree-sitter/) to parse your code into an abstract syntax tree (AST). It then recursively merges and groups nodes in a way that respects both code structure and token limits.

It supports all languages that tree-sitter supports, and is designed to preserve formatting and semantics. Large functions or class definitions won’t be split in the middle of a block — instead, we dive recursively into the AST to produce clean, coherent chunks that fit your configured token budget.
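To make the recursive merge-and-group idea concrete, here is a minimal sketch of the general technique. It is not Chonkie's implementation: it swaps tree-sitter for Python's standard-library `ast` module so it runs without extra dependencies, it only handles Python source, and it counts whitespace-separated words instead of real model tokens. The names `chunk_code`, `count_tokens`, `_pieces`, and `_stmt_children` are hypothetical.

```python
import ast


def count_tokens(text: str) -> int:
    # Assumption: whitespace-separated words stand in for a real tokenizer.
    return len(text.split())


def _stmt_children(node):
    # Statement-level children only; any other node is kept whole.
    kids = [c for c in ast.iter_child_nodes(node)
            if isinstance(c, (ast.stmt, ast.ExceptHandler))]
    return sorted(kids, key=lambda c: c.lineno)


def _pieces(nodes, start, end, lines, max_tokens):
    """Yield (first_line, last_line) spans, 1-based inclusive, covering start..end."""
    if not nodes:
        if start <= end:
            yield (start, end)
        return
    cursor = start
    for i, node in enumerate(nodes):
        # Each span runs from the end of the previous sibling to the end of
        # this node, so blank lines, comments, and decorators stay attached.
        last = end if i == len(nodes) - 1 else node.end_lineno
        if last < cursor:
            continue  # node shares a line with an earlier sibling
        text = "".join(lines[cursor - 1:last])
        children = _stmt_children(node)
        if count_tokens(text) > max_tokens and children:
            # Too large for the budget: descend into the node's own statements.
            yield from _pieces(children, cursor, last, lines, max_tokens)
        else:
            # Fits, or cannot be split further (may exceed the budget).
            yield (cursor, last)
        cursor = last + 1


def chunk_code(source: str, max_tokens: int) -> list[str]:
    lines = source.splitlines(keepends=True)
    tree = ast.parse(source)
    chunks, current, used = [], [], 0
    for first, last in _pieces(tree.body, 1, len(lines), lines, max_tokens):
        text = "".join(lines[first - 1:last])
        cost = count_tokens(text)
        # Greedily merge adjacent pieces until the budget would be exceeded.
        if current and used + cost > max_tokens:
            chunks.append("".join(current))
            current, used = [], 0
        current.append(text)
        used += cost
    if current:
        chunks.append("".join(current))
    return chunks
```

Because every piece is a contiguous range of source lines, joining the returned chunks reproduces the input exactly, which is one simple way to preserve formatting. A single statement that is larger than the budget and has no nested statements is emitted as one oversized chunk rather than cut mid-statement.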

What it’s useful for:

  - Embedding-based code search

  - RAG (retrieval-augmented generation) over codebases

  - Long-context analysis of code

  - Preparing repos for fine-tuning or pretraining

Try it out:

  - Open source package: https://docs.chonkie.ai/chunkers/code-chunker

  - Hosted playground (free with account): https://cloud.chonkie.ai

Happy Chonking!
