Show HN: Fast and Quality Code Chunking with Chonkie

1•snyy•10mo ago

Hi HN,

We’re Chonkie (https://github.com/chonkie-inc/chonkie). We build open-source tools that split documents into meaningful chunks for use with AI models.

When you use LLMs over large documents or codebases, you often need to break them into smaller parts to fit the model’s context window. Our chunkers do this in a smart way: they preserve structure and meaning, so only the most relevant pieces are passed into the model. This reduces hallucinations, avoids confusion, and improves performance and accuracy.

Today we’re launching our Code Chunker — a fast, structure-aware way to break down source code into high-quality, token-aware chunks.

How it works:

(See the code: https://github.com/chonkie-inc/chonkie/blob/main/src/chonkie...)

Code Chunker uses tree-sitter (https://tree-sitter.github.io/tree-sitter/) to parse your code into an abstract syntax tree (AST). It then recursively merges and groups nodes in a way that respects both code structure and token limits.

It supports all languages that tree-sitter supports, and is designed to preserve formatting and semantics. Large functions or class definitions won’t be split in the middle of a block — instead, we dive recursively into the AST to produce clean, coherent chunks that fit your configured token budget.
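To make the recursive merge-and-descend idea concrete, here is a minimal sketch. It is not Chonkie's implementation: it swaps in Python's standard-library `ast` module for tree-sitter so it runs without extra dependencies (and therefore handles only Python source), and it counts tokens as whitespace-separated words, which is an assumption rather than a real tokenizer. The names `chunk_code`, `_chunk_region`, and `count_tokens` are hypothetical.

```python
import ast

def count_tokens(text: str) -> int:
    # Stand-in token counter (assumption): whitespace-separated words.
    return len(text.split())

def _chunk_region(lines, start, end, nodes, budget):
    """Chunk lines[start:end], using sibling AST nodes as split points."""
    # One segment per sibling node. Blank lines and comments before a node
    # attach to it; anything after the last node attaches to the last segment.
    segments, cursor = [], start
    for i, node in enumerate(nodes):
        seg_end = end if i == len(nodes) - 1 else node.end_lineno
        if seg_end > cursor:
            segments.append((cursor, seg_end, node))
            cursor = seg_end

    chunks, current = [], ""
    for seg_start, seg_end, node in segments:
        text = "".join(lines[seg_start:seg_end])
        if count_tokens(current + text) <= budget:
            current += text              # merge with neighbouring siblings
            continue
        if current:
            chunks.append(current)       # flush what has been merged so far
            current = ""
        if count_tokens(text) <= budget:
            current = text
        else:
            body = getattr(node, "body", [])
            if body:                     # too big: descend into the block
                chunks.extend(_chunk_region(lines, seg_start, seg_end, body, budget))
            else:                        # oversized leaf: emit it unsplit
                chunks.append(text)
    if current:
        chunks.append(current)
    return chunks

def chunk_code(source: str, budget: int):
    lines = source.splitlines(keepends=True)
    tree = ast.parse(source)
    if not tree.body:                    # empty or comment-only file
        return [source] if source else []
    return _chunk_region(lines, 0, len(lines), tree.body, budget)

SAMPLE = '''\
import os

def small():
    return 1

class Big:
    def a(self):
        x = 1
        return x

    def b(self):
        y = 2
        return y
'''

if __name__ == "__main__":
    for n, chunk in enumerate(chunk_code(SAMPLE, budget=12)):
        print(f"--- chunk {n} ({count_tokens(chunk)} tokens) ---")
        print(chunk, end="")
```

Because the sketch slices the original lines rather than re-rendering the tree, concatenating the chunks reproduces the input exactly, which is one simple way to get the formatting-preserving behaviour described above. A single statement that exceeds the budget on its own is emitted as one oversized chunk instead of being cut mid-statement.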

What it’s useful for:

  - Embedding-based code search

  - RAG (retrieval-augmented generation) over codebases

  - Long-context analysis of code

  - Preparing repos for fine-tuning or pretraining

Try it out:

  - Open source package: https://docs.chonkie.ai/chunkers/code-chunker

  - Hosted playground (free with account): https://cloud.chonkie.ai

Happy Chonking!
