Show HN: Advanced Chunking in JavaScript/TypeScript with Chonkie

10•snyy•8mo ago
Hi HN,

We’re Shreyash and Bhavnick. We built Chonkie, an open-source library for advanced chunking and embedding of text and code. It was previously Python-only, but we just released a TypeScript version: https://github.com/chonkie-inc/chonkie-ts

Many AI projects in JS/TS (like those using Vercel's AI SDK or Mastra) rely on basic text splitters. But better chunking = better retrieval = better performance. That’s what Chonkie is built for.

Current native chunkers (in TS):

- Code Chunker – handles Python, TypeScript, etc.

- Recursive Chunker – rule-based, hierarchical splitting

- Token Chunker – split by token count (fully customizable)

- Sentence Chunker – split on sentence boundaries. Delimiters are customizable, so it works for multiple languages.

All chunkers support custom tokenizers, chunk overlap, delimiters, and more.
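To make "custom tokenizers" and "chunk overlap" concrete, here is a minimal sketch of token-based chunking with a sliding overlap. It is not Chonkie's API: `Tokenizer`, `whitespaceTokenizer`, `TokenChunk`, and `chunkByTokens` are hypothetical names chosen for illustration, and the whitespace tokenizer stands in for whatever real tokenizer you would plug in.

```typescript
// Illustrative sketch only. None of these names come from Chonkie.
type Tokenizer = (text: string) => string[];

// Simplest possible tokenizer: split on runs of whitespace.
const whitespaceTokenizer: Tokenizer = (text) =>
  text.split(/\s+/).filter((t) => t.length > 0);

interface TokenChunk {
  text: string;       // tokens of this chunk joined with single spaces
  tokenCount: number; // number of tokens in this chunk
  startToken: number; // index of the chunk's first token in the input
}

function chunkByTokens(
  text: string,
  chunkSize: number,
  chunkOverlap: number,
  tokenize: Tokenizer = whitespaceTokenizer,
): TokenChunk[] {
  if (chunkSize <= 0) throw new Error("chunkSize must be positive");
  if (chunkOverlap < 0 || chunkOverlap >= chunkSize) {
    throw new Error("chunkOverlap must be in [0, chunkSize)");
  }
  const tokens = tokenize(text);
  // Each new chunk starts this many tokens after the previous one,
  // so consecutive chunks share exactly chunkOverlap tokens.
  const step = chunkSize - chunkOverlap;
  const chunks: TokenChunk[] = [];
  for (let start = 0; start < tokens.length; start += step) {
    const piece = tokens.slice(start, start + chunkSize);
    chunks.push({ text: piece.join(" "), tokenCount: piece.length, startToken: start });
    // Stop once a chunk has reached the end of the input, so we do not
    // emit a trailing chunk made only of already-covered tokens.
    if (start + chunkSize >= tokens.length) break;
  }
  return chunks;
}

const demoChunks = chunkByTokens("a b c d e f g", 3, 1);
console.log(demoChunks.map((c) => c.text));
```

Swapping the tokenizer changes how "size" is measured without touching the windowing logic, which is the main reason to keep the two separate.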

Coming soon in native TS (already available via the API client):

- Semantic Chunker – splits texts wherever it detects a shift in meaning.

- SDPM Chunker – merges semantically similar disjoint chunks

- Late Chunker – generates context-aware embeddings for each chunk

- Slumber Chunker – LLM-refined recursive chunks. Significantly reduces token usage (and thus cost) while maximizing chunk quality.

- Embeddings Refinery – embeds chunks with any embedding model

- Overlap Refinery – Create overlaps between consecutive chunks for better context preservation.
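As a rough illustration of the idea behind semantic chunking (splitting where meaning shifts), the sketch below compares each sentence with the one before it and starts a new chunk when their similarity drops below a threshold. This is an assumption about the general technique, not a description of how Chonkie implements it. `embed` is a toy bag-of-words stand-in for a real embedding model, and `semanticChunks`, `cosine`, and the `threshold` parameter are hypothetical names.

```typescript
// Illustrative sketch only. None of these names come from Chonkie.
type WordCounts = Record<string, number>;

// Toy stand-in for an embedding model: a bag-of-words count vector.
function embed(sentence: string): WordCounts {
  // Object.create(null) avoids collisions with inherited keys like "constructor".
  const counts: WordCounts = Object.create(null);
  for (const word of sentence.toLowerCase().match(/[a-z0-9]+/g) ?? []) {
    counts[word] = (counts[word] ?? 0) + 1;
  }
  return counts;
}

// Cosine similarity between two sparse count vectors (0 if either is empty).
function cosine(a: WordCounts, b: WordCounts): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (const word of Object.keys(a)) {
    dot += a[word] * (b[word] ?? 0);
    normA += a[word] * a[word];
  }
  for (const word of Object.keys(b)) {
    normB += b[word] * b[word];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Group consecutive sentences; break where adjacent similarity < threshold.
function semanticChunks(sentences: string[], threshold: number): string[] {
  const chunks: string[] = [];
  let current: string[] = [];
  for (let i = 0; i < sentences.length; i++) {
    const shifted =
      i > 0 && cosine(embed(sentences[i - 1]), embed(sentences[i])) < threshold;
    if (shifted) {
      chunks.push(current.join(" "));
      current = [];
    }
    current.push(sentences[i]);
  }
  if (current.length > 0) chunks.push(current.join(" "));
  return chunks;
}

const demoSentences = [
  "Cats chase mice.",
  "Cats chase birds too.",
  "Stock prices fell today.",
];
console.log(semanticChunks(demoSentences, 0.2));
```

With a real embedding model in place of `embed`, the same loop would pick up topic shifts that share no vocabulary, which word counts alone cannot do.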

Chonkie is free, open-source, and MIT licensed. GitHub: https://github.com/chonkie-inc/chonkie-ts

We’d love your feedback, ideas, or contributions. Thanks!

Comments

skeptrune•8mo ago
I love the TypeScript library. Python has always been a source of friction for me when building AI apps. I'm going to deploy them on the web and, therefore, would prefer to have web-native tooling.
snyy•8mo ago
Glad you like it!
lennertjansen•8mo ago
When do you think the Overlap Refinery will be available in the TS library? And how does it work?
snyy•8mo ago
We're aiming to launch it by Monday
brianykim•8mo ago
Are there other ways we might be able to tune the chunkers or describe the data that we might want chunked to get the best results?

Or perhaps the playground could offer an easy way, given a type of input data, to run different chunkers side by side, or pipe them into each other, to see which gives the best results?

snyy•8mo ago
We don't have this yet but we will soon. Finding the right setup for your data is definitely tougher than it needs to be.