Show HN: Fast and Quality Code Chunking with Chonkie

1•snyy•7mo ago

Hi HN,

We’re Chonkie (https://github.com/chonkie-inc/chonkie). We build open source tools that split documents into meaningful chunks for use with AI models.

When you use LLMs over large documents or codebases, you often need to break them into smaller parts to fit the model’s context window. Our chunkers split along the document’s structure and meaning, so each chunk stays coherent and only the most relevant pieces are passed to the model. The goal is to reduce hallucinations and confusion and to improve accuracy.

Today we’re launching our Code Chunker: a fast, structure-aware way to break source code into high-quality, token-aware chunks.

How it works:

(See the code: https://github.com/chonkie-inc/chonkie/blob/main/src/chonkie...)

Code Chunker uses tree-sitter (https://tree-sitter.github.io/tree-sitter/) to parse your code into an abstract syntax tree (AST). It then recursively merges and groups nodes in a way that respects both code structure and token limits.

It supports every language that tree-sitter supports and is designed to preserve formatting and semantics. Large functions or class definitions won’t be split in the middle of a block. Instead, we recurse into the AST to produce clean, coherent chunks that fit your configured token budget.

What it’s useful for:

  - Embedding-based code search
  - RAG (retrieval-augmented generation) over codebases
  - Long-context analysis of code
  - Preparing repos for fine-tuning or pretraining

Try it out:

  - Open source package: https://docs.chonkie.ai/chunkers/code-chunker
  - Hosted playground (free with account): https://cloud.chonkie.ai

Happy Chonking!
