Show HN: Fast, High-Quality Code Chunking with Chonkie

1•snyy•1y ago
Hi HN,

We’re Chonkie (https://github.com/chonkie-inc/chonkie) — we build open source tools that help split documents into meaningful chunks for use with AI models.

When you use LLMs over large documents or codebases, you often need to break them into smaller parts to fit the model’s context window. Our chunkers split along the document’s own structure so that each chunk keeps its meaning, and only the most relevant pieces are passed to the model. This helps reduce hallucinations and improves accuracy.

Today we’re launching our Code Chunker — a fast, structure-aware way to break down source code into high-quality, token-aware chunks.

How it works:

(See the code: https://github.com/chonkie-inc/chonkie/blob/main/src/chonkie...)

Code Chunker uses tree-sitter (https://tree-sitter.github.io/tree-sitter/) to parse your code into an abstract syntax tree (AST). It then recursively merges and groups nodes in a way that respects both code structure and token limits.

It supports all languages that tree-sitter supports, and is designed to preserve formatting and semantics. Large functions or class definitions won’t be split in the middle of a block — instead, we dive recursively into the AST to produce clean, coherent chunks that fit your configured token budget.
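
To make the recursive split-and-merge idea concrete, here is a minimal sketch. It is not Chonkie's implementation: it uses Python's standard-library ast module as a stand-in for tree-sitter (so it only handles Python source), it counts whitespace-separated words instead of real model tokens, and every name in it (chunk_code, split, sibling_spans, first_line, count_tokens) is hypothetical. It only descends into function and class definitions; any other oversized statement is kept whole, even if it exceeds the budget.

```python
import ast

def count_tokens(text):
    # Stand-in tokenizer (assumption): whitespace-separated words.
    return len(text.split())

def first_line(node):
    # A decorated def/class starts at its first decorator, not at `def`.
    decorators = getattr(node, "decorator_list", [])
    return min([node.lineno] + [d.lineno for d in decorators])

def sibling_spans(nodes, start, end):
    # Give each sibling a line span so that together they cover start..end.
    # Blank lines and comments between siblings attach to the following one.
    spans, cursor = [], start
    for node in nodes:
        if node.end_lineno < cursor:
            continue  # e.g. `a = 1; b = 2`: already covered by the previous span
        spans.append([cursor, node.end_lineno, node])
        cursor = node.end_lineno + 1
    if spans:
        spans[-1][1] = end  # trailing lines go to the last sibling
    return spans

def split(nodes, start, end, lines, max_tokens):
    # Return (first_line, last_line) pieces; descend into oversized defs/classes.
    pieces = []
    for s, e, node in sibling_spans(nodes, start, end):
        too_big = count_tokens("".join(lines[s - 1:e])) > max_tokens
        is_block = isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
        if too_big and is_block and first_line(node.body[0]) > node.lineno:
            body_start = first_line(node.body[0])
            pieces.append((s, body_start - 1))  # signature / class header
            pieces.extend(split(node.body, body_start, e, lines, max_tokens))
        else:
            pieces.append((s, e))  # fits, or cannot be split further here
    return pieces

def chunk_code(source, max_tokens):
    lines = source.splitlines(keepends=True)
    if not lines:
        return []
    pieces = split(ast.parse(source).body, 1, len(lines), lines, max_tokens)
    if not pieces:  # e.g. a file containing only comments
        pieces = [(1, len(lines))]
    # Greedily merge adjacent pieces while the running chunk stays in budget.
    chunks, current = [], ""
    for s, e in pieces:
        text = "".join(lines[s - 1:e])
        if current and count_tokens(current + text) > max_tokens:
            chunks.append(current)
            current = text
        else:
            current += text
    if current:
        chunks.append(current)
    return chunks

SAMPLE = '''import os

def small():
    return 1

class Big:
    """Docstring."""

    def first(self):
        x = 1
        return x

    def second(self):
        y = 2
        return y
'''

chunks = chunk_code(SAMPLE, max_tokens=12)
for i, chunk in enumerate(chunks):
    print(f"--- chunk {i} ({count_tokens(chunk)} tokens) ---")
    print(chunk, end="")
```

Because every source line lands in exactly one piece, joining the chunks reproduces the input byte for byte, which is one simple way to get the "preserve formatting" property described above. The class in the sample is too large for the budget, so the sketch keeps its header together with the preceding small statements and emits each method as its own chunk.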

What it’s useful for:

  - Embedding-based code search

  - RAG (retrieval-augmented generation) over codebases

  - Long-context analysis of code

  - Preparing repos for fine-tuning or pretraining

Try it out:

  - Open source package: https://docs.chonkie.ai/chunkers/code-chunker

  - Hosted playground (free with account): https://cloud.chonkie.ai

Happy Chonking!
