frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

Open in hackernews

Show HN: Chat with Orion – a visual agent that sees, reasons and acts

https://chat.vlm.run/
22•fzysingularity•2mo ago
Hey HN! We’re excited to share Orion [1] — our new visual agent that sees, reasons, and acts across images, videos, and documents.

Frontier VLMs (GPT, Claude, Gemini) can describe what they see, but they can’t reliably act on visual inputs. Ask them to detect objects, segment images, or chain visual steps — they’ll fail in surprisingly inconsistent ways. High-res images collapse to ~1024px. And the visual AI ecosystem is fragmented across separate APIs for image understanding, OCR, image-gen, video-gen, etc.

We built Orion to fix this.

Orion combines VLM reasoning with reliable computer-vision tools inside a unified chat-completions interface. You can chain visual steps, inspect results, and treat visual tasks the same way you treat text workflows. Here’s a quick demo [2].

What Orion can do today: - Detect objects, faces, people (with precise, visualized boxes) - Segment objects or salient regions interactively - Edit, remix, and re-imagine images/videos from prompts - Summarize visual content (images or videos) - Transform images: crop, rotate, upscale - Transform videos: trim, sample, highlight scenes - Parse and structure documents: pagination, layout, OCR, extraction

One unified “chat-completions”-like interface — no juggling multiple vision APIs. Check out the tours in the chat [3] or read the announcement [4].

API access opens next week. Happy to answer any questions — otherwise, feel free to try the tours and break things!

[1] Learn more about Orion: https://vlm.run/orion

[2] Promo video: https://youtu.be/cPJN4iZz6QQ

[3] Chat: https://chat.vlm.run

[4] LinkedIn announcement: https://www.linkedin.com/posts/sudeeppillai_ai-computervisio...

Comments

aivisionperson•2mo ago
Really crazy results there. would love to test more
SoftwareManHere•2mo ago
It's really cool how good of a job it did!
hackintothings•2mo ago
I just tried out generating and editing this video it performed a pretty good results which is not possible with other chat interfaces. can you tell what is the bottleneck of this agents?
fzysingularity•2mo ago
It's still early days, but we'll expand to more capabilities very quickly given that we're not bottlenecked by training a single large VLM to do these tasks - think video tracking, in-image editing, and 3D.
Lona_Kiragu•2mo ago
The AI world just got better with Orion!
slater•2mo ago
wow, so many astro-turfed responses in this post. it must be a really good app!!

....

orm•2mo ago
The video was interesting. Seems like a nice way to start a shopping search if you have a picture with something you want where the look matters. Eg, cars, furniture. etc.
fzysingularity•2mo ago
Do you mean like creating a personalized item from another product image?
kernel33•2mo ago
I tried object segmentation and it’s really good
fzysingularity•2mo ago
Hey, thanks! Curious what you tried to test it. Segmentation models like SAM2 only gets you so far, but by make this instruction-driven with reasoning in the loop, it's remarkable what you can do these days.

Stay tuned for more updates here, tracking segments is coming soon!

Monzo wrongly denied refunds to fraud and scam victims

https://www.theguardian.com/money/2026/feb/07/monzo-natwest-hsbc-refunds-fraud-scam-fos-ombudsman
1•tablets•4m ago•0 comments

They were drawn to Korea with dreams of K-pop stardom – but then let down

https://www.bbc.com/news/articles/cvgnq9rwyqno
1•breve•6m ago•0 comments

Show HN: AI-Powered Merchant Intelligence

https://nodee.co
1•jjkirsch•9m ago•0 comments

Bash parallel tasks and error handling

https://github.com/themattrix/bash-concurrent
1•pastage•9m ago•0 comments

Let's compile Quake like it's 1997

https://fabiensanglard.net/compile_like_1997/index.html
1•billiob•10m ago•0 comments

Reverse Engineering Medium.com's Editor: How Copy, Paste, and Images Work

https://app.writtte.com/read/gP0H6W5
1•birdculture•15m ago•0 comments

Go 1.22, SQLite, and Next.js: The "Boring" Back End

https://mohammedeabdelaziz.github.io/articles/go-next-pt-2
1•mohammede•21m ago•0 comments

Laibach the Whistleblowers [video]

https://www.youtube.com/watch?v=c6Mx2mxpaCY
1•KnuthIsGod•22m ago•1 comments

Slop News - HN front page right now hallucinated as 100% AI SLOP

https://slop-news.pages.dev/slop-news
1•keepamovin•27m ago•1 comments

Economists vs. Technologists on AI

https://ideasindevelopment.substack.com/p/economists-vs-technologists-on-ai
1•econlmics•29m ago•0 comments

Life at the Edge

https://asadk.com/p/edge
2•tosh•35m ago•0 comments

RISC-V Vector Primer

https://github.com/simplex-micro/riscv-vector-primer/blob/main/index.md
3•oxxoxoxooo•38m ago•1 comments

Show HN: Invoxo – Invoicing with automatic EU VAT for cross-border services

2•InvoxoEU•39m ago•0 comments

A Tale of Two Standards, POSIX and Win32 (2005)

https://www.samba.org/samba/news/articles/low_point/tale_two_stds_os2.html
2•goranmoomin•43m ago•0 comments

Ask HN: Is the Downfall of SaaS Started?

3•throwaw12•44m ago•0 comments

Flirt: The Native Backend

https://blog.buenzli.dev/flirt-native-backend/
2•senekor•45m ago•0 comments

OpenAI's Latest Platform Targets Enterprise Customers

https://aibusiness.com/agentic-ai/openai-s-latest-platform-targets-enterprise-customers
1•myk-e•48m ago•0 comments

Goldman Sachs taps Anthropic's Claude to automate accounting, compliance roles

https://www.cnbc.com/2026/02/06/anthropic-goldman-sachs-ai-model-accounting.html
3•myk-e•50m ago•5 comments

Ai.com bought by Crypto.com founder for $70M in biggest-ever website name deal

https://www.ft.com/content/83488628-8dfd-4060-a7b0-71b1bb012785
1•1vuio0pswjnm7•51m ago•1 comments

Big Tech's AI Push Is Costing More Than the Moon Landing

https://www.wsj.com/tech/ai/ai-spending-tech-companies-compared-02b90046
4•1vuio0pswjnm7•53m ago•0 comments

The AI boom is causing shortages everywhere else

https://www.washingtonpost.com/technology/2026/02/07/ai-spending-economy-shortages/
2•1vuio0pswjnm7•55m ago•0 comments

Suno, AI Music, and the Bad Future [video]

https://www.youtube.com/watch?v=U8dcFhF0Dlk
1•askl•57m ago•2 comments

Ask HN: How are researchers using AlphaFold in 2026?

1•jocho12•1h ago•0 comments

Running the "Reflections on Trusting Trust" Compiler

https://spawn-queue.acm.org/doi/10.1145/3786614
1•devooops•1h ago•0 comments

Watermark API – $0.01/image, 10x cheaper than Cloudinary

https://api-production-caa8.up.railway.app/docs
1•lembergs•1h ago•1 comments

Now send your marketing campaigns directly from ChatGPT

https://www.mail-o-mail.com/
1•avallark•1h ago•1 comments

Queueing Theory v2: DORA metrics, queue-of-queues, chi-alpha-beta-sigma notation

https://github.com/joelparkerhenderson/queueing-theory
1•jph•1h ago•0 comments

Show HN: Hibana – choreography-first protocol safety for Rust

https://hibanaworks.dev/
5•o8vm•1h ago•1 comments

Haniri: A live autonomous world where AI agents survive or collapse

https://www.haniri.com
1•donangrey•1h ago•1 comments

GPT-5.3-Codex System Card [pdf]

https://cdn.openai.com/pdf/23eca107-a9b1-4d2c-b156-7deb4fbc697c/GPT-5-3-Codex-System-Card-02.pdf
1•tosh•1h ago•0 comments