
Show HN: Inference API that adapts to your SLA and quality constraints

https://models.exosphere.host/
6•spacemnstr42069•1mo ago
Hi HN, I'm one of the creators of Exosphere. Think of us like a reliability lab for agents.

Today we are launching Exosphere Flex Inference APIs, built on one idea: inference APIs should adapt to your constraints, not the other way around.

Usually, when you need to run inference at scale, you are forced into rigid boxes:

1. "Real-time" APIs (Expensive, optimized for <1s latency, prone to 429s).

2. "Batch" APIs (Cheaper, but often force 24-hour windows and rigid file formats).

3. "Self-hosted" (Total control, but high ops overhead).

We built a flexible inference engine that sits in the middle. You define three constraints: SLA (time), cost, and quality. The system handles the execution.

Here is how it works under the hood:

1. Flexible SLAs (The "Time" Constraint): Instead of just "now" or "tomorrow," you pass an `sla` parameter (e.g., 60 minutes, 4 hours). Our scheduler bins these requests to optimize GPU saturation across our provider mesh. You trade strict immediacy for up to ~70% lower cost.

2. Reliability Layer (The "Ops" Constraint): We abstract away the error handling. If a provider throws a 429 or 503, you shouldn't have to write a retry loop with backoff jitter. Our infrastructure absorbs these failures and retries internally. We guarantee the request eventually succeeds (within your SLA) or we don't charge you.

3. Built-in Quality Gates (The "Accuracy" Constraint): This is the feature I’m most excited about. You can define an "eval" config in the request (using LLM-as-a-Judge or Python scripts). If the output doesn't meet your criteria, our system automatically feeds the failure back into the model and retries it. This moves the "validation loop" from your client code into the infrastructure.
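
For comparison, here is a rough sketch of the client-side loop that points 1 to 3 are meant to replace: retry transient errors with backoff jitter, re-prompt when an eval fails, and give up once the SLA deadline passes. This is not our implementation or our API. Every name in it (`run_with_sla`, `TransientError`, `call_model`, `passes_eval`) is hypothetical and only there for illustration.

```python
import random
import time
from typing import Callable, Optional


class TransientError(Exception):
    """Hypothetical stand-in for a provider 429 or 503 response."""


def run_with_sla(
    call_model: Callable[[str], str],
    passes_eval: Callable[[str], bool],
    prompt: str,
    sla_seconds: float,
    base_delay: float = 0.5,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[str]:
    """Return an output that passes the eval, or None if the SLA expires."""
    deadline = clock() + sla_seconds
    failures = 0
    current_prompt = prompt
    while clock() < deadline:
        try:
            output = call_model(current_prompt)
        except TransientError:
            # Exponential backoff with full jitter, capped by max_delay
            # and by the time left before the deadline.
            failures += 1
            cap = min(max_delay, base_delay * 2 ** failures)
            remaining = max(0.0, deadline - clock())
            sleep(min(random.uniform(0, cap), remaining))
            continue
        if passes_eval(output):
            return output
        # Quality gate failed: feed the rejected output back and retry.
        current_prompt = (
            f"{prompt}\n\nA previous answer was rejected:\n{output}\n"
            "Please correct it."
        )
    return None
```

Here `passes_eval` could wrap a Python check or an LLM-as-a-Judge call, and a `None` result corresponds to a request that missed its SLA.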

I’d love to hear your thoughts on this approach—specifically, does moving the "retry/eval" loop into the API layer simplify your backend, or do you prefer keeping that logic client-side?

Playground: https://models.exosphere.host/

More Details: https://exosphere.host/flex-inference