Show HN: Fast and Quality Code Chunking with Chonkie

1•snyy•10mo ago
Hi HN,

We’re Chonkie (https://github.com/chonkie-inc/chonkie). We build open source tools that split documents into meaningful chunks for use with AI models.

When you use LLMs over large documents or codebases, you often need to break them into smaller parts that fit the model’s context window. Our chunkers split along the document’s own structure, so each chunk stays coherent and only the most relevant pieces need to be passed to the model. This helps reduce hallucinations and improves accuracy.

Today we’re launching our Code Chunker: a fast, structure-aware way to break source code into high-quality, token-aware chunks.

How it works:

(See the code: https://github.com/chonkie-inc/chonkie/blob/main/src/chonkie...)

Code Chunker uses tree-sitter (https://tree-sitter.github.io/tree-sitter/) to parse your code into an abstract syntax tree (AST). It then recursively merges and groups nodes in a way that respects both code structure and token limits.

It works with every language that tree-sitter supports and is designed to preserve formatting and semantics. Large functions or class definitions aren’t cut at arbitrary points in the middle of a block. Instead, we descend recursively into the AST to produce clean, coherent chunks that fit your configured token budget.
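To make the idea concrete, here is a minimal sketch of structure-aware chunking. It is not Chonkie's implementation: it swaps tree-sitter for Python's standard-library `ast` module so that it runs without extra dependencies, and it only handles Python source. The names `count_tokens`, `atomic_spans`, and `chunk_code` are made up for illustration, and the whitespace word count is a stand-in for a real tokenizer.

```python
import ast


def count_tokens(text: str) -> int:
    """Stand-in token counter (assumption): whitespace-separated words."""
    return len(text.split())


def atomic_spans(nodes, start, end, lines, limit):
    """Cover lines[start:end] with spans, descending into oversized nodes."""
    spans, cursor = [], start
    for i, node in enumerate(nodes):
        # A span runs from the previous boundary to this node's last line, so
        # comments and blank lines stay attached to the node that follows them.
        # The last node also absorbs any trailing lines up to `end`.
        stop = end if i == len(nodes) - 1 else node.end_lineno
        if stop <= cursor:
            continue  # several statements on one line: already covered
        body = getattr(node, "body", None)
        too_big = count_tokens("".join(lines[cursor:stop])) > limit
        if too_big and isinstance(body, list) and body:
            # Oversized function/class/block: recurse into its statements.
            spans.extend(atomic_spans(body, cursor, stop, lines, limit))
        else:
            # Fits the budget, or cannot be divided further (may exceed it).
            spans.append((cursor, stop))
        cursor = stop
    return spans


def chunk_code(source: str, limit: int) -> list[str]:
    """Greedily merge adjacent spans into chunks of at most `limit` tokens."""
    lines = source.splitlines(keepends=True)
    tree = ast.parse(source)
    spans = atomic_spans(tree.body, 0, len(lines), lines, limit)
    spans = spans or [(0, len(lines))]  # e.g. a file with only comments
    chunks, current = [], ""
    for start, stop in spans:
        piece = "".join(lines[start:stop])
        if current and count_tokens(current + piece) > limit:
            chunks.append(current)
            current = ""
        current += piece
    if current:
        chunks.append(current)
    return chunks


SAMPLE = '''import os

# helper
def small(a):
    return a + 1

class Big:
    def one(self):
        x = 1
        y = 2
        return x + y

    def two(self):
        z = 3
        return z * 2
'''

if __name__ == "__main__":
    for n, chunk in enumerate(chunk_code(SAMPLE, limit=16), start=1):
        print(f"--- chunk {n} ({count_tokens(chunk)} tokens) ---")
        print(chunk, end="")
```

Because every line belongs to exactly one span, joining the chunks reproduces the original source, which is one simple way to read "preserve formatting". A single statement larger than the budget is emitted as an oversized chunk in this sketch; a production chunker would need a policy for that case.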

What it’s useful for:

  - Embedding-based code search

  - RAG (retrieval-augmented generation) over codebases

  - Long-context analysis of code

  - Preparing repos for fine-tuning or pretraining

Try it out:

  - Open source package: https://docs.chonkie.ai/chunkers/code-chunker

  - Hosted playground (free with account): https://cloud.chonkie.ai

Happy Chonking!

Show HN: Skillscape – Engineering skills matrix without the spreadsheet

https://www.skillscape.dev/
1•danielyefet•5m ago•0 comments

SimpleSteps – TypeScript-to-ASL Compiler

https://github.com/DevNamedZed/simplesteps
1•aman96_54_3•8m ago•0 comments

Demonstration of Network Tap and Packet Filter Using a Security Camera

https://privateisland.tech/dev/betsy-demo-tap-w-cam
1•mindchasers•9m ago•0 comments

I thought freelancers hated invoices. They hated the tools

https://www.indiehackers.com/post/i-thought-freelancers-hated-invoices-they-actually-hated-the-to...
1•allinonetools_•15m ago•0 comments

ThePrimeagen goes back to traditional coding

https://twitter.com/theprimeagen/status/2026771192191824108
2•rob•18m ago•0 comments

When "technically true" becomes "misleading"

https://www.theargumentmag.com/p/when-technically-true-becomes-actually
1•bananaflag•24m ago•0 comments

Australia's WiseTech to cut 2k jobs as AI renders manual coding obsolete

https://www.computerworld.com/article/4137200/australias-wisetech-to-cut-2000-jobs-as-ai-renders-...
1•netfortius•24m ago•0 comments

CleverMock – An AI voice interviewer that interrupts you like a real human

https://www.clevermock.com
1•devinda-dilshan•25m ago•1 comments

Show HN: Programmatic (and self-updating) SaaS demo videos

https://www.rundown.video/
1•guico•25m ago•0 comments

Show HN: Bing Webmaster CLI for Agents and LLMs

https://github.com/NmadeleiDev/bing_webmaster_cli
1•Gregoryy•28m ago•0 comments

A White House Staffer Appears to Run Pro-Trump X Account

https://www.wired.com/story/a-white-house-staffer-appears-to-run-massive-pro-trump-meme-page/
2•doener•33m ago•2 comments

Show HN: Onera – Private LLM Inference Inside AMD SEV-SNP Enclaves

https://onera.chat
1•shreyaspapi•34m ago•1 comments

Next-Token Predictor Is an AI's Job, Not Its Species

https://www.astralcodexten.com/p/next-token-predictor-is-an-ais-job
1•bananaflag•34m ago•0 comments

Tests Are the New Moat

https://saewitz.com/tests-are-the-new-moat
1•vinhnx•37m ago•1 comments

'Access to Insight' is shutting down

https://www.accesstoinsight.org/
1•bifftastic•38m ago•0 comments

The next batch of fixed Epstein files links and notes is live

https://xcancel.com/IAmAnonLegion/status/2026853415863615662?s=20
1•doener•38m ago•0 comments

Programming has changed dramatically due to AI in the last 2 months (Karpathy)

https://twitter.com/karpathy/status/2026731645169185220
2•bakigul•41m ago•0 comments

Demo of an indie AI collaboration app – beyond Codex and Claude Code desktop

1•seeksky•43m ago•1 comments

AIQuotaBar – macOS menu bar app that shows Claude and ChatGPT usage limits

https://github.com/yagcioglutoprak/AIQuotaBar
1•toprak123•48m ago•1 comments

Git City – Your GitHub as a 3D City

https://www.thegitcity.com/
1•duck•48m ago•2 comments

Mumsnet campaign demands ban on social media for under-16s

https://www.theguardian.com/society/2026/feb/26/mumsnet-campaign-demands-ban-social-media-under-16s
2•pmg101•50m ago•0 comments

Shipcast – Turn your Git commits into tweets, automatically

https://shipcast.dev/
1•guoyu•50m ago•0 comments

Show HN: LucidExtractor – Extract web data in plain English, no selectors

https://lucidextractor.liceron.in
1•yukendiran_j•55m ago•0 comments

A larger cage: about the ongoing calls for "digital sovereignty"

https://www.structural-integrity.eu/a-larger-cage-about-the-ongoing-calls-for-digital-sovereignty/
1•doener•56m ago•0 comments

Earth's heat to power 10k homes in renewable energy first for UK

https://www.bbc.co.uk/news/articles/cewzg77k721o
2•RobinL•56m ago•0 comments

Show HN: Snaplake – Query past database states without restoring backups

https://snaplake.clroot.io
1•clroot•57m ago•0 comments

Show HN: Context Harness – Local first context engine for AI tools

https://github.com/parallax-labs/context-harness
1•__parallaxis•57m ago•0 comments

Perplexity Computer

https://www.perplexity.ai/hub/blog/introducing-perplexity-computer
1•kamaal•57m ago•1 comments

Show HN: I Made an AI Skill to Help Write Tlaps Proofs

https://github.com/younes-io/agent-skills/blob/main/skills/tlaps-workbench/SKILL.md
1•youio•57m ago•0 comments

Implementing a Clear Room Z80 / ZX Spectrum Emulator with Claude Code

https://antirez.com/news/160
2•boyter•59m ago•0 comments