Show HN: I embedded 685M public texts in 32 minutes (on 8x A100, Rust, TensorRT)

https://github.com/Artain-AI/ignite-ms

3•ddayanov•1h ago

Comments

ddayanov•43m ago

A quick note on how IgniteMS, my batch embedding engine, works and how I built it.

The whole thing runs as a single Rust process that reads input, tokenizes, packs batches, and keeps the queue full. TensorRT handles inference. Python is only a thin wrapper.

I built it this way because once you use more than a couple of GPUs, the GPUs stop being the bottleneck: the CPU cannot feed them fast enough. One A100 can go through batches faster than Python can tokenize and feed them, so the GPU just sits idle waiting for work. Most of my time went into optimizing this. At 8 GPUs that was basically the entire challenge.
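To make the feeding problem concrete, here is a minimal sketch of the producer/consumer shape described above, using only the Rust standard library. It is not IgniteMS's code: `Batch`, `tokenize`, `run_pipeline`, and all parameters are hypothetical names, the tokenizer is a whitespace stand-in, and the "GPU workers" just count sequences instead of calling TensorRT. What it shows is a bounded queue: the producer blocks when the queue is full, and workers block when it is empty, which is the condition a real engine tries to avoid.

```rust
use std::sync::mpsc::sync_channel;
use std::sync::{Arc, Mutex};
use std::thread;

/// A packed batch of token-id sequences (hypothetical layout).
pub struct Batch {
    pub sequences: Vec<Vec<u32>>,
}

/// Stand-in tokenizer: one fake id per whitespace-separated word.
fn tokenize(text: &str) -> Vec<u32> {
    text.split_whitespace().map(|w| w.len() as u32).collect()
}

/// Tokenize `texts`, pack them into batches of at most `max_tokens` tokens
/// (a single longer text still gets its own batch), and feed them through a
/// bounded queue to `workers` threads standing in for GPUs.
/// Returns the total number of sequences the workers consumed.
pub fn run_pipeline(
    texts: Vec<String>,
    workers: usize,
    queue_depth: usize,
    max_tokens: usize,
) -> usize {
    assert!(workers > 0, "need at least one worker to drain the queue");

    // Bounded channel: `send` blocks once `queue_depth` batches are waiting.
    let (tx, rx) = sync_channel::<Batch>(queue_depth);
    let rx = Arc::new(Mutex::new(rx));

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let rx = Arc::clone(&rx);
            thread::spawn(move || {
                let mut processed = 0usize;
                loop {
                    // The lock is held while waiting for a batch and released
                    // at the end of this statement, before "inference" runs.
                    let next = rx.lock().unwrap().recv();
                    match next {
                        // A real worker would run inference here.
                        Ok(batch) => processed += batch.sequences.len(),
                        // Channel closed and drained: no more work.
                        Err(_) => break,
                    }
                }
                processed
            })
        })
        .collect();

    // Producer: tokenize and greedily pack on the calling thread.
    let mut current = Batch { sequences: Vec::new() };
    let mut tokens_in_batch = 0usize;
    for text in &texts {
        let ids = tokenize(text);
        if !current.sequences.is_empty() && tokens_in_batch + ids.len() > max_tokens {
            let full = std::mem::replace(&mut current, Batch { sequences: Vec::new() });
            tx.send(full).expect("all workers exited early");
            tokens_in_batch = 0;
        }
        tokens_in_batch += ids.len();
        current.sequences.push(ids);
    }
    if !current.sequences.is_empty() {
        tx.send(current).expect("all workers exited early");
    }
    drop(tx); // Close the channel so workers exit once the queue is empty.

    handles.into_iter().map(|h| h.join().unwrap()).sum()
}
```

Calling `run_pipeline(texts, 8, 16, 4096)` would model eight consumers behind a queue of 16 batches. In this toy version a single producer thread does all the tokenizing, so with fast enough consumers it is the producer that becomes the limit, which is the situation the comment describes.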

On cost: I ran the big 2B-message job on a spot p4d instance (8x A100 40GB). After filtering and deduping I had 685M raw texts. With my new engine the whole production run finishes in about half an hour. Previously I used on-demand instances for these jobs; I have now switched to spot. If AWS reclaims the box, I just rerun the job. A half-hour run costs roughly $7. And at least right now, spot instances are easier to get than on-demand.

Fair warning: it's batch-only and NVIDIA-only. You can run it either as a Docker image or natively. I used some optimizations for my production run. With default settings you can expect ~250K msg/sec if you run the benchmark script on your own p4d box. https://github.com/Artain-AI/ignite-ms/blob/main/BENCHMARKIN...
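For a sense of scale, the figures quoted in this thread can be turned into throughput and cost numbers. The sketch below is plain arithmetic in Rust; the constants come from the post (685M texts, 32 minutes, 8 GPUs, roughly $7 per run, ~250K msg/sec at default settings), while the function names are made up for illustration. Since "roughly $7" and "about half an hour" are approximate, the derived values are ballpark only.

```rust
/// Figures quoted in the post; everything derived from them is approximate.
pub const TEXTS: f64 = 685_000_000.0; // texts embedded in the production run
pub const RUN_MINUTES: f64 = 32.0; // wall-clock time from the title
pub const GPUS: f64 = 8.0; // 8x A100 on one p4d instance
pub const RUN_COST_USD: f64 = 7.0; // "roughly $7" per run on spot
pub const DEFAULT_RATE: f64 = 250_000.0; // msg/sec with default settings

/// Average texts per second over a run of `minutes`.
pub fn throughput_per_sec(texts: f64, minutes: f64) -> f64 {
    texts / (minutes * 60.0)
}

/// Minutes needed to process `texts` at `rate_per_sec`.
pub fn minutes_at_rate(texts: f64, rate_per_sec: f64) -> f64 {
    texts / rate_per_sec / 60.0
}

/// Dollars per one million texts for a run costing `cost_usd`.
pub fn cost_per_million(texts: f64, cost_usd: f64) -> f64 {
    cost_usd / (texts / 1_000_000.0)
}
```

With these inputs, `throughput_per_sec(TEXTS, RUN_MINUTES)` comes to about 357K texts/sec (about 45K per GPU when divided by `GPUS`), `minutes_at_rate(TEXTS, DEFAULT_RATE)` to about 46 minutes for the same corpus at the default-settings rate, and `cost_per_million(TEXTS, RUN_COST_USD)` to about one cent per million texts.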

v1.1.0 added TensorRT 11 support and 60 models, 23 of which are tested on 1x and 4x A100.

Happy to share details.
