Show HN: Fast and Quality Code Chunking with Chonkie

1•snyy•7mo ago

Hi HN,

We’re Chonkie (https://github.com/chonkie-inc/chonkie) — we build open source tools that help split documents into meaningful chunks for use with AI models.

When you use LLMs over large documents or codebases, you often need to break them into smaller parts to fit the model’s context window. Our chunkers split along structural and semantic boundaries instead of at arbitrary character offsets, so each chunk stays self-contained and only the most relevant pieces need to be passed to the model. In our experience this reduces hallucinations and improves retrieval and answer accuracy.

Today we’re launching our Code Chunker — a fast, structure-aware way to break down source code into high-quality, token-aware chunks.

How it works:

(See the code: https://github.com/chonkie-inc/chonkie/blob/main/src/chonkie...)

Code Chunker uses tree-sitter (https://tree-sitter.github.io/tree-sitter/) to parse your code into an abstract syntax tree (AST). It then recursively merges and groups nodes in a way that respects both code structure and token limits.

It supports all languages that tree-sitter supports, and is designed to preserve formatting and semantics. Large functions or class definitions won’t be split in the middle of a block — instead, we dive recursively into the AST to produce clean, coherent chunks that fit your configured token budget.
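
To make the idea concrete, here's a minimal sketch of the recursive merge-and-split approach described above. It is not the actual Chonkie implementation: it swaps in Python's built-in `ast` module for tree-sitter (so it only handles Python source), and it counts whitespace-separated words as a stand-in for a real tokenizer. The names `count_tokens`, `chunk_lines`, and `chunk_code` are hypothetical and exist only for this illustration.

```python
import ast


def count_tokens(text: str) -> int:
    # Assumption: whitespace-separated words stand in for a real tokenizer.
    return len(text.split())


def chunk_lines(lines, children, start, end, budget):
    """Split lines[start:end] into chunks along the boundaries of `children`.

    `start` and `end` are 0-indexed line offsets, with `end` exclusive.
    """
    # Give every child a span that also absorbs the blank lines, comments,
    # and parent header that precede it, so the spans tile [start, end).
    spans = []
    cursor = start
    for i, node in enumerate(children):
        stop = end if i == len(children) - 1 else node.end_lineno
        spans.append((cursor, stop, node))
        cursor = stop

    chunks, current = [], ""
    for lo, hi, node in spans:
        text = "".join(lines[lo:hi])
        if count_tokens(text) > budget:
            # This node alone is over budget: flush what we have so far,
            # then descend into its body instead of cutting mid-block.
            if current:
                chunks.append(current)
                current = ""
            body = getattr(node, "body", None)
            if isinstance(body, list) and body:
                chunks.extend(chunk_lines(lines, body, lo, hi, budget))
            else:
                # Nothing left to descend into: emit it as an oversized chunk.
                chunks.append(text)
        elif current and count_tokens(current + text) > budget:
            # Adding this node would overflow the current chunk: start a new one.
            chunks.append(current)
            current = text
        else:
            # Merge this node into the current chunk.
            current += text
    if current:
        chunks.append(current)
    return chunks


def chunk_code(source: str, budget: int):
    """Chunk Python source so that joining the chunks reproduces it exactly."""
    lines = source.splitlines(keepends=True)
    tree = ast.parse(source)
    if not tree.body:
        return [source] if source else []
    return chunk_lines(lines, tree.body, 0, len(lines), budget)
```

Two properties of this sketch mirror the description above: concatenating the chunks gives back the original source byte for byte (formatting is preserved), and a chunk only exceeds the budget when a single statement has no body left to descend into.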

What it’s useful for:

  - Embedding-based code search

  - RAG (retrieval-augmented generation) over codebases

  - Long-context analysis of code

  - Preparing repos for fine-tuning or pretraining

Try it out:

  - Open source package: https://docs.chonkie.ai/chunkers/code-chunker

  - Hosted playground (free with account): https://cloud.chonkie.ai

Happy Chonking!
