We run 20+ ML models in parallel on Ray

https://mixpeek.com/blog/ray-distributed-ml-pipeline-architecture
1•Beefin•1h ago

Comments

Beefin•1h ago
We process video, images, and documents through 20+ ML models simultaneously at Mixpeek. A single 10-minute video triggers transcription, visual embeddings, scene descriptions, face detection, object detection, brand safety classification, and more — all in parallel with different compute requirements.

We wrote up the full Ray architecture we use in production on KubeRay/GKE. Not a tutorial — more of a "here's what we actually run and what bit us."

Some highlights:

- *Custom resource isolation* — We use a synthetic `{"batch": 1}` resource to prevent batch pipeline tasks from starving Ray Serve inference replicas. Same cluster, zero interference, no runtime overhead.

- *Flexible actor pools* — Fixed-size `ActorPoolStrategy(size=8)` deadlocks when concurrent jobs compete for workers. `min_size=1, max_size=N` guarantees every job can make progress.

- *Shared preprocessing*: The naive approach runs the S3 download and format normalization once per extractor. With 10 extractors on 1,000 files, that's 10,000 reads, 9,000 of them redundant. We preprocess once and fan out via Ray Dataset.

- *Distributed Qdrant writes* — Ray Data's `Datasink` API distributes vector DB writes across all workers with backpressure, instead of collecting everything on one node.

- *Fire-and-forget progress tracking* — A Ray actor as a shared counter lets workers report progress without blocking the pipeline.

- *Zero-CPU head node* — Learned this one the hard way when a runaway batch job took down our scheduler.
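
To make the shared-preprocessing arithmetic concrete, here is a rough sketch in plain Python. It is not Ray Data and not our production code: `naive_pipeline`, `shared_pipeline`, and the counting `download` stub are hypothetical names that only illustrate the read counts for 10 extractors over 1,000 files.

```python
from collections import Counter


def naive_pipeline(files, extractors, download):
    """Each extractor downloads every file itself (one read per extractor per file)."""
    return {name: [fn(download(f)) for f in files] for name, fn in extractors.items()}


def shared_pipeline(files, extractors, download):
    """Download each file once, then fan the result out to every extractor."""
    results = {name: [] for name in extractors}
    for f in files:
        data = download(f)  # single read, shared by all extractors
        for name, fn in extractors.items():
            results[name].append(fn(data))
    return results


def make_counting_download():
    """Hypothetical stand-in for an S3 download that records reads per path."""
    reads = Counter()

    def download(path):
        reads[path] += 1
        return f"bytes-of-{path}"

    return download, reads


files = [f"file-{i}" for i in range(1000)]
# Ten dummy extractors; each returns its index and the payload length.
extractors = {f"extractor-{j}": (lambda data, j=j: (j, len(data))) for j in range(10)}

naive_dl, naive_reads = make_counting_download()
shared_dl, shared_reads = make_counting_download()

naive_out = naive_pipeline(files, extractors, naive_dl)
shared_out = shared_pipeline(files, extractors, shared_dl)

naive_total = sum(naive_reads.values())    # 10 extractors * 1,000 files
shared_total = sum(shared_reads.values())  # one read per file
print(naive_total, shared_total)
```

Both pipelines produce the same per-extractor results; only the number of reads differs. In a real cluster the fan-out step would be distributed rather than a local loop.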

The post includes the KubeRay YAML, Ray Serve autoscaling configs, pipeline code, and the LocalStack parquet workaround that saved us hours of debugging silent hangs.

https://mixpeek.com/blog/ray-distributed-ml-pipeline-archite...

Happy to answer questions about any of the patterns or trade-offs.
