Show HN: Fast and Quality Code Chunking with Chonkie

1•snyy•9mo ago
Hi HN,

We’re Chonkie (https://github.com/chonkie-inc/chonkie) — we build open source tools that help split documents into meaningful chunks for use with AI models.

When you use LLMs over large documents or codebases, you often need to break them into smaller parts to fit the model’s context window. Our chunkers split along the document’s structure and meaning rather than at arbitrary offsets, so the pieces you retrieve and pass into the model stay coherent and relevant. This reduces hallucinations, avoids confusion, and improves performance and accuracy.

Today we’re launching our Code Chunker — a fast, structure-aware way to break down source code into high-quality, token-aware chunks.

How it works:

(See the code: https://github.com/chonkie-inc/chonkie/blob/main/src/chonkie...)

Code Chunker uses tree-sitter (https://tree-sitter.github.io/tree-sitter/) to parse your code into an abstract syntax tree (AST). It then recursively merges and groups nodes in a way that respects both code structure and token limits.

It supports all languages that tree-sitter supports, and is designed to preserve formatting and semantics. Large functions or class definitions won’t be split in the middle of a block — instead, we dive recursively into the AST to produce clean, coherent chunks that fit your configured token budget.
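To make the recursive idea concrete, here is a rough sketch. It is not Chonkie's implementation: it swaps tree-sitter for Python's built-in `ast` module so it runs with no dependencies, it only handles Python source, and `count_tokens` is a hypothetical whitespace word counter standing in for a real tokenizer. The names `unit_starts`, `chunk_code`, and `SAMPLE` are made up for illustration.

```python
import ast

def count_tokens(text):
    # Hypothetical stand-in for a real tokenizer: whitespace-separated words.
    return len(text.split())

def unit_starts(nodes, lines, budget):
    """Collect the 1-based line numbers where an unsplittable unit begins."""
    starts = []
    for node in nodes:
        # Keep decorators attached to the definition they decorate.
        decorators = getattr(node, "decorator_list", [])
        start = min([node.lineno] + [d.lineno for d in decorators])
        starts.append(start)
        text = "".join(lines[start - 1:node.end_lineno])
        splittable = isinstance(
            node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)
        )
        # Only definitions that exceed the budget are opened up; their body
        # statements become candidate units. Other oversized statements are
        # kept whole in this sketch, so they can exceed the budget.
        if splittable and count_tokens(text) > budget:
            starts.extend(unit_starts(node.body, lines, budget))
    return starts

def chunk_code(source, budget):
    """Split Python source into chunks of roughly at most `budget` tokens."""
    lines = source.splitlines(keepends=True)
    if not lines:
        return []
    tree = ast.parse(source)
    # Units tile the file: each one runs from its start line up to the next
    # start, so blank lines and comments are never dropped.
    starts = sorted(set([1] + unit_starts(tree.body, lines, budget)))
    bounds = starts + [len(lines) + 1]
    units = ["".join(lines[a - 1:b - 1]) for a, b in zip(bounds, bounds[1:])]

    # Greedily merge adjacent units while the merged text fits the budget.
    chunks, current = [], ""
    for unit in units:
        if current and count_tokens(current + unit) > budget:
            chunks.append(current)
            current = unit
        else:
            current += unit
    if current:
        chunks.append(current)
    return chunks

SAMPLE = '''import os

def small(a, b):
    return a + b

class Big:
    def first(self, x):
        y = x * 2
        return y

    def second(self, x):
        z = x + 1
        return z
'''

if __name__ == "__main__":
    for i, chunk in enumerate(chunk_code(SAMPLE, budget=10)):
        print(f"--- chunk {i} ({count_tokens(chunk)} tokens) ---")
        print(chunk, end="")
```

In this sketch, joining the chunks back together reproduces the input exactly, which is one simple way to check that formatting is preserved. A class that is too large for the budget is split at method boundaries rather than in the middle of a block.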

What it’s useful for:

  - Embedding-based code search

  - RAG (retrieval-augmented generation) over codebases

  - Long-context analysis of code

  - Preparing repos for fine-tuning or pretraining

Try it out:

  - Open source package: https://docs.chonkie.ai/chunkers/code-chunker

  - Hosted playground (free with account): https://cloud.chonkie.ai

Happy Chonking!
