Show HN: Fast and Quality Code Chunking with Chonkie

1•snyy•1y ago
Hi HN,

We’re Chonkie (https://github.com/chonkie-inc/chonkie) — we build open source tools that help split documents into meaningful chunks for use with AI models.

When you use LLMs over large documents or codebases, you often need to break them into smaller parts to fit the model’s context window. Our chunkers split along the document’s own structure, so each chunk keeps its meaning and only the most relevant pieces are passed into the model. This helps reduce hallucinations and improves accuracy.

Today we’re launching our Code Chunker — a fast, structure-aware way to break down source code into high-quality, token-aware chunks.

How it works:

(See the code: https://github.com/chonkie-inc/chonkie/blob/main/src/chonkie...)

Code Chunker uses tree-sitter (https://tree-sitter.github.io/tree-sitter/) to parse your code into an abstract syntax tree (AST). It then recursively merges and groups nodes in a way that respects both code structure and token limits.

It supports all languages that tree-sitter supports, and is designed to preserve formatting and semantics. Large functions or class definitions won’t be split in the middle of a block — instead, we dive recursively into the AST to produce clean, coherent chunks that fit your configured token budget.
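To make the recursive idea concrete, here is a minimal sketch. It is not Chonkie's implementation: it swaps tree-sitter for Python's built-in `ast` module (so it only handles Python source), and it counts whitespace-separated words instead of using a real tokenizer. The names `count_tokens` and `chunk_code` are hypothetical and chosen for illustration only.

```python
import ast


def count_tokens(text: str) -> int:
    # Assumption: whitespace-separated words stand in for a real tokenizer.
    return len(text.split())


def chunk_code(source: str, max_tokens: int) -> list[str]:
    """Split Python source into chunks that follow statement boundaries.

    Illustrative stand-in for a tree-sitter based chunker, using the
    standard-library ast module instead.
    """
    lines = source.splitlines(keepends=True)
    tree = ast.parse(source)
    if not tree.body:
        # Nothing to parse into statements (empty or comment-only file).
        return [source] if source else []

    def text(lo: int, hi: int) -> str:
        return "".join(lines[lo:hi])

    def leaf_spans(nodes, lo, hi):
        # Give every sibling a contiguous line span, so no text is lost.
        # Blank lines and comments before a node attach to that node.
        spans, start = [], lo
        for i, node in enumerate(nodes):
            end = hi if i == len(nodes) - 1 else node.end_lineno
            if end <= start:
                continue  # several statements shared one line
            children = [c for c in ast.iter_child_nodes(node)
                        if isinstance(c, ast.stmt)]
            if count_tokens(text(start, end)) > max_tokens and children:
                # Too large for the budget: descend into the node's body.
                spans.extend(leaf_spans(children, start, end))
            else:
                # Fits, or cannot be split further (may exceed the budget).
                spans.append((start, end))
            start = end
        return spans

    # Greedily merge neighbouring spans while they fit the token budget.
    chunks, current = [], ""
    for lo, hi in leaf_spans(tree.body, 0, len(lines)):
        piece = text(lo, hi)
        if current and count_tokens(current + piece) > max_tokens:
            chunks.append(current)
            current = ""
        current += piece
    if current:
        chunks.append(current)
    return chunks
```

Calling `chunk_code(source, max_tokens=10)` returns a list of strings that concatenate back to the original source. A single statement that is larger than the budget and has no nested statements is kept whole in this sketch rather than cut mid-statement.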

What it’s useful for:

  - Embedding-based code search

  - RAG (retrieval-augmented generation) over codebases

  - Long-context analysis of code

  - Preparing repos for fine-tuning or pretraining

Try it out:

  - Open source package: https://docs.chonkie.ai/chunkers/code-chunker

  - Hosted playground (free with account): https://cloud.chonkie.ai

Happy Chonking!
