Show HN: Fast and Quality Code Chunking with Chonkie

1•snyy•8mo ago
Hi HN,

We’re Chonkie (https://github.com/chonkie-inc/chonkie) — we build open source tools that help split documents into meaningful chunks for use with AI models.

When you use LLMs over large documents or codebases, you often need to break them into smaller parts to fit the model's context window. Our chunkers split along the document's own structure so that each chunk stays meaningful on its own, which lets you pass only the most relevant pieces to the model. This reduces hallucinations, avoids confusion, and improves performance and accuracy.

Today we’re launching our Code Chunker — a fast, structure-aware way to break down source code into high-quality, token-aware chunks.

How it works:

(See the code: https://github.com/chonkie-inc/chonkie/blob/main/src/chonkie...)

Code Chunker uses tree-sitter (https://tree-sitter.github.io/tree-sitter/) to parse your code into an abstract syntax tree (AST). It then recursively merges and groups nodes in a way that respects both code structure and token limits.

It supports all languages that tree-sitter supports, and is designed to preserve formatting and semantics. Large functions or class definitions won’t be split in the middle of a block — instead, we dive recursively into the AST to produce clean, coherent chunks that fit your configured token budget.
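To make the recursive merge-and-group idea concrete, here is a minimal sketch. It is not Chonkie's implementation: it uses Python's standard-library ast module as a stand-in for tree-sitter (so it only handles Python source), and count_tokens is a hypothetical whitespace word counter standing in for a real tokenizer. The names chunk_code, _split and _first_line are made up for illustration. A single statement that is larger than the budget and cannot be divided further is emitted as its own oversized chunk.

```python
import ast


def count_tokens(text: str) -> int:
    """Stand-in tokenizer (assumption): one token per whitespace-separated word."""
    return len(text.split())


def _first_line(node: ast.AST) -> int:
    """0-based first line of a statement, including any decorators above it."""
    decorators = getattr(node, "decorator_list", [])
    return min([node.lineno] + [d.lineno for d in decorators]) - 1


def _split(lines, node, start, end, limit):
    """Split lines[start:end] into spans, descending into `node` only when needed."""
    children = [
        c for c in ast.iter_child_nodes(node)
        if isinstance(c, (ast.stmt, ast.ExceptHandler))
    ]
    if count_tokens("".join(lines[start:end])) <= limit or not children:
        return [(start, end)]  # fits the budget, or cannot be divided further

    # Cut points: the first line of each child statement inside this span.
    starts = {}
    for child in children:
        line = _first_line(child)
        if start <= line < end:
            starts.setdefault(line, child)
    cuts = sorted(starts)
    if not cuts:
        return [(start, end)]

    pieces = []
    if cuts[0] > start:
        pieces.append((start, cuts[0]))  # header lines such as `def f():`
    for i, cut in enumerate(cuts):
        stop = cuts[i + 1] if i + 1 < len(cuts) else end
        pieces.extend(_split(lines, starts[cut], cut, stop, limit))

    # Greedily merge neighbouring spans while the merged text fits the budget.
    merged = [pieces[0]]
    for s, e in pieces[1:]:
        ms, _ = merged[-1]
        if count_tokens("".join(lines[ms:e])) <= limit:
            merged[-1] = (ms, e)
        else:
            merged.append((s, e))
    return merged


def chunk_code(source: str, limit: int) -> list[str]:
    """Return chunks that concatenate back to `source` exactly."""
    lines = source.splitlines(keepends=True)
    if not lines:
        return []
    spans = _split(lines, ast.parse(source), 0, len(lines), limit)
    return ["".join(lines[s:e]) for s, e in spans]
```

Because the spans are contiguous line ranges, joining the chunks reproduces the input byte for byte, which is one simple way to keep formatting intact. Small neighbouring definitions get grouped into one chunk, and only a definition that exceeds the budget is opened up and split along its child statements.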

What it’s useful for:

  - Embedding-based code search
  - RAG (retrieval-augmented generation) over codebases
  - Long-context analysis of code
  - Preparing repos for fine-tuning or pretraining

Try it out:

  - Open source package: https://docs.chonkie.ai/chunkers/code-chunker
  - Hosted playground (free with account): https://cloud.chonkie.ai

Happy Chonking!
