Show HN: Fast and Quality Code Chunking with Chonkie

1•snyy•1y ago
Hi HN,

We’re Chonkie (https://github.com/chonkie-inc/chonkie) — we build open source tools that help split documents into meaningful chunks for use with AI models.

When you run LLMs over large documents or codebases, you often need to break them into smaller parts that fit the model’s context window. Our chunkers split along the document’s structure so each chunk keeps its meaning, and only the most relevant pieces need to be passed to the model. This reduces hallucinations, avoids confusion, and improves performance and accuracy.

Today we’re launching our Code Chunker — a fast, structure-aware way to break down source code into high-quality, token-aware chunks.

How it works:

(See the code: https://github.com/chonkie-inc/chonkie/blob/main/src/chonkie...)

Code Chunker uses tree-sitter (https://tree-sitter.github.io/tree-sitter/) to parse your code into an abstract syntax tree (AST). It then recursively merges and groups nodes in a way that respects both code structure and token limits.

It supports every language that tree-sitter supports, and is designed to preserve formatting and semantics. Large functions or class definitions won’t be split in the middle of a block. Instead, we descend recursively into the AST to produce clean, coherent chunks that fit your configured token budget.
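
To make the idea concrete, here is a rough sketch of recursive, budget-aware chunking. It is not Chonkie's implementation: it swaps tree-sitter for Python's built-in `ast` module so it runs without dependencies (Python 3.8+, Python source only), and `count_tokens`, `chunk_python`, and the whitespace-based token count are hypothetical stand-ins chosen for illustration.

```python
import ast
from typing import Iterable, List


def count_tokens(text: str) -> int:
    """Hypothetical stand-in tokenizer: counts whitespace-separated words."""
    return len(text.split())


def _start_line(node: ast.AST) -> int:
    """First line of a node, including any decorators above a def/class."""
    decorators = getattr(node, "decorator_list", [])
    return min([node.lineno] + [d.lineno for d in decorators])


def _unit_starts(nodes: Iterable[ast.AST], lines: List[str], budget: int) -> List[int]:
    """Collect 1-indexed start lines of units, descending into oversized nodes."""
    starts: List[int] = []
    for node in nodes:
        start = _start_line(node)
        starts.append(start)
        text = "".join(lines[start - 1 : node.end_lineno])
        if count_tokens(text) > budget:
            # Too big for one chunk: split at the statements nested inside it.
            children = [
                child
                for child in ast.iter_child_nodes(node)
                if isinstance(child, (ast.stmt, ast.excepthandler))
            ]
            starts.extend(_unit_starts(children, lines, budget))
    return starts


def chunk_python(source: str, budget: int) -> List[str]:
    """Split source at statement boundaries, then greedily merge up to budget."""
    lines = source.splitlines(keepends=True)
    tree = ast.parse(source)

    # Each unit runs from one start line up to the next, so blank lines and
    # comments between statements are kept and no text is dropped.
    starts = sorted(set([1] + _unit_starts(tree.body, lines, budget)))
    bounds = starts + [len(lines) + 1]
    units = ["".join(lines[a - 1 : b - 1]) for a, b in zip(bounds, bounds[1:])]

    chunks: List[str] = []
    current = ""
    for unit in units:
        # Flush the running chunk when adding this unit would exceed the budget.
        if current and count_tokens(current + unit) > budget:
            chunks.append(current)
            current = ""
        current += unit
    if current:
        chunks.append(current)
    return chunks
```

Calling `chunk_python(source, budget=512)` returns a list of strings whose concatenation is the original source. A single statement that is larger than the budget and has no nested statements is emitted as one oversized chunk in this sketch.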

What it’s useful for:

  - Embedding-based code search

  - RAG (retrieval-augmented generation) over codebases

  - Long-context analysis of code

  - Preparing repos for fine-tuning or pretraining

Try it out:

  - Open source package: https://docs.chonkie.ai/chunkers/code-chunker

  - Hosted playground (free with account): https://cloud.chonkie.ai

Happy Chonking!
