Show HN: Fast and Quality Code Chunking with Chonkie

1•snyy•1y ago
Hi HN,

We’re Chonkie (https://github.com/chonkie-inc/chonkie) — we build open source tools that help split documents into meaningful chunks for use with AI models.

When you use LLMs over large documents or codebases, you often need to break them into smaller parts to fit the model’s context window. Our chunkers preserve structure and meaning, so each chunk stands on its own and only the most relevant pieces need to be passed to the model. This reduces hallucinations, avoids confusion, and improves performance and accuracy.

Today we’re launching our Code Chunker — a fast, structure-aware way to break down source code into high-quality, token-aware chunks.

How it works:

(See the code: https://github.com/chonkie-inc/chonkie/blob/main/src/chonkie...)

Code Chunker uses tree-sitter (https://tree-sitter.github.io/tree-sitter/) to parse your code into an abstract syntax tree (AST). It then recursively merges and groups nodes in a way that respects both code structure and token limits.

It supports all languages that tree-sitter supports, and is designed to preserve formatting and semantics. Large functions or class definitions won’t be split in the middle of a block — instead, we dive recursively into the AST to produce clean, coherent chunks that fit your configured token budget.
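To make the idea concrete, here is a minimal sketch of budget-aware recursive chunking. It is not Chonkie's implementation: it swaps tree-sitter for Python's built-in `ast` module so it runs without extra dependencies, it counts whitespace-separated words instead of real tokens, and the names `chunk_code`, `count_tokens`, and `_split` are made up for illustration. Sibling nodes are merged greedily while they fit the budget, and a node that is too large is split by descending into its child statements.

```python
import ast


def count_tokens(text):
    # Assumption: whitespace-separated words stand in for a real tokenizer.
    return len(text.split())


def _start_line(node):
    # Decorators sit above the `def`/`class` line, so include them.
    decorators = getattr(node, "decorator_list", [])
    return min([node.lineno] + [d.lineno for d in decorators])


def _split(nodes, lines, start, end, budget):
    """Yield (start, end) line spans that together cover lines[start:end]."""
    # Each node owns the lines from its start up to the next sibling's start,
    # so comments and blank lines between nodes are never dropped.
    starts = [start] + [_start_line(n) - 1 for n in nodes[1:]]
    ends = starts[1:] + [end]
    group_start, group_tokens = None, 0
    for node, s, e in zip(nodes, starts, ends):
        tokens = count_tokens("".join(lines[s:e]))
        if tokens > budget:
            # Too big on its own: flush the pending group, then recurse.
            if group_start is not None:
                yield (group_start, s)
                group_start, group_tokens = None, 0
            children = [c for c in ast.iter_child_nodes(node)
                        if isinstance(c, ast.stmt)]
            if children:
                yield from _split(children, lines, s, e, budget)
            else:
                yield (s, e)  # indivisible statement, emitted over budget
        elif group_start is not None and group_tokens + tokens <= budget:
            group_tokens += tokens  # merge into the current group
        else:
            if group_start is not None:
                yield (group_start, s)
            group_start, group_tokens = s, tokens
    if group_start is not None:
        yield (group_start, end)


def chunk_code(source, budget):
    # Assumes plain "\n" line endings, matching how `ast` numbers lines.
    lines = source.splitlines(keepends=True)
    tree = ast.parse(source)
    if not tree.body:
        return [source] if source else []
    spans = _split(tree.body, lines, 0, len(lines), budget)
    return ["".join(lines[s:e]) for s, e in spans if e > s]


SAMPLE = '''import os

def small(a, b):
    return a + b

class Big:
    def first(self):
        x = 1
        y = 2
        return x + y

    def second(self):
        return "done"
'''

chunks = chunk_code(SAMPLE, budget=12)
for i, chunk in enumerate(chunks):
    print(f"--- chunk {i} ({count_tokens(chunk)} tokens) ---")
    print(chunk, end="")
```

Because every span runs up to the start of the next sibling, joining the chunks reproduces the input exactly, which is one simple way to keep formatting intact. This sketch attaches a `def` or `class` header to the first chunk of its body and does not merge small chunks across nesting levels; a production chunker would likely handle both more carefully.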

What it’s useful for:

  - Embedding-based code search

  - RAG (retrieval-augmented generation) over codebases

  - Long-context analysis of code

  - Preparing repos for fine-tuning or pretraining

Try it out:

  - Open source package: https://docs.chonkie.ai/chunkers/code-chunker

  - Hosted playground (free with account): https://cloud.chonkie.ai

Happy Chonking!
