Show HN: Fast and Quality Code Chunking with Chonkie

1•snyy•1y ago
Hi HN,

We’re Chonkie (https://github.com/chonkie-inc/chonkie). We build open source tools that split documents into meaningful chunks for use with AI models.

When you use LLMs over large documents or codebases, you often need to break them into smaller parts to fit the model’s context window. Our chunkers split along structural and semantic boundaries, so each chunk stays coherent and only the most relevant pieces are passed into the model. This can reduce hallucinations and confusion, and improve performance and accuracy.

Today we’re launching our Code Chunker: a fast, structure-aware way to break down source code into high-quality, token-aware chunks.

How it works:

(See the code: https://github.com/chonkie-inc/chonkie/blob/main/src/chonkie...)

Code Chunker uses tree-sitter (https://tree-sitter.github.io/tree-sitter/) to parse your code into an abstract syntax tree (AST). It then recursively merges and groups nodes in a way that respects both code structure and token limits.

It supports all languages that tree-sitter supports, and is designed to preserve formatting and semantics. Large functions or class definitions won’t be split in the middle of a block. Instead, we recurse into the AST to produce clean, coherent chunks that fit your configured token budget.
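
To make the recursive idea concrete, here is a rough sketch. It is not Chonkie's implementation. It swaps tree-sitter for Python's built-in `ast` module so it runs without dependencies, and it counts whitespace-separated words as a stand-in for a real tokenizer. The names `count_tokens`, `child_spans`, `merge_small`, `chunk_code`, and `SAMPLE` are hypothetical and exist only for this illustration.

```python
import ast


def count_tokens(text):
    # Assumption: whitespace-separated words stand in for a real tokenizer.
    return len(text.split())


def child_spans(node, start, end):
    """Cover lines[start:end] with contiguous spans, one per child statement.

    Blank lines, comments, and headers such as `class Foo:` are attached to
    the statement that follows them, so no text is ever dropped.
    """
    children = sorted(
        (c for c in ast.iter_child_nodes(node) if isinstance(c, ast.stmt)),
        key=lambda c: c.lineno,
    )
    spans, cursor = [], start
    for child in children:
        stop = min(child.end_lineno, end)
        if stop > cursor:
            spans.append([child, cursor, stop])
            cursor = stop
    if spans:
        spans[-1][2] = end  # trailing lines go to the last child
    return spans


def merge_small(chunks, budget):
    # Greedily glue neighbouring chunks back together while they still fit.
    merged = []
    for chunk in chunks:
        if merged and count_tokens(merged[-1] + chunk) <= budget:
            merged[-1] += chunk
        else:
            merged.append(chunk)
    return merged


def _chunk(node, start, end, lines, budget):
    text = "".join(lines[start:end])
    if count_tokens(text) <= budget:
        return [text]
    spans = child_spans(node, start, end)
    if not spans:
        # A single oversized statement: emit it whole rather than cut mid-statement.
        return [text]
    pieces = []
    for child, s, e in spans:
        pieces.extend(_chunk(child, s, e, lines, budget))
    return merge_small(pieces, budget)


def chunk_code(source, budget):
    lines = source.splitlines(keepends=True)
    return _chunk(ast.parse(source), 0, len(lines), lines, budget)


SAMPLE = '''import os

def small(a, b):
    return a + b

class Big:
    def first(self):
        x = 1
        y = 2
        return x + y

    def second(self):
        for i in range(3):
            print(i)
        return None
'''

if __name__ == "__main__":
    for i, chunk in enumerate(chunk_code(SAMPLE, budget=16)):
        print(f"--- chunk {i} ({count_tokens(chunk)} tokens) ---")
        print(chunk, end="")
```

In this sketch the class `Big` is too large for the budget, so the chunker descends into it and splits between its two methods instead of cutting inside one. Joining the chunks back together reproduces the original source exactly.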

What it’s useful for:

  - Embedding-based code search

  - RAG (retrieval-augmented generation) over codebases

  - Long-context analysis of code

  - Preparing repos for fine-tuning or pretraining

Try it out:

  - Open source package: https://docs.chonkie.ai/chunkers/code-chunker

  - Hosted playground (free with account): https://cloud.chonkie.ai

Happy Chonking!
