GPT-OSS-20B-Vision: First Community VLM for GPT-OSS, Trained on a DGX Spark

https://huggingface.co/vincentkaufmann/gpt-oss-20b-vision-preview
3•vkaufmann•2h ago

Comments

vkaufmann•2h ago
GPT-OSS-20B-Vision: First community VLM for GPT-OSS, trained on a single DGX Spark

A couple of weeks ago I shipped an MCP server (noapi-google-search-mcp) and people in the community challenged me to do something harder: build a VLM. So I bought a DGX Spark, flew to Dubai, and built the first vision-language model for GPT-OSS from a hotel room. Just a Spark, hotel WiFi and stubbornness.

This is an early proof of concept, 22% of the way through training. I shipped it to show what's possible and to find compute partners to finish the job.

What it does: Adds vision to GPT-OSS-20B. Takes an image + text prompt, generates coherent descriptions. Identifies objects, scenes, spatial relationships. Vision was trained directly into the model through QLoRA adaptation - the LLM learned to see, not just pass through visual tokens. All original text capabilities are fully preserved. Hallucinations present - expected at this training stage.

How it works: A SigLIP vision encoder feeds into the 20B MoE language model through a method I call PseudoDeepStack - extracting visual features from multiple encoder depths instead of just the final layer. Richer visual representations at zero additional inference cost.
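To make the multi-depth idea concrete, here is a rough sketch in plain Python. It is not the model's code: the post doesn't say how the features from different depths are combined, so channel-wise concatenation followed by a single linear projection is an assumption, and the layer indices, sizes, and the fake encoder are all made up for illustration. The point it shows is that the number of visual tokens stays the same however many depths are tapped, which is one way to read "zero additional inference cost" on the LLM side.

```python
import random


def fake_encoder_hidden_states(num_layers, num_tokens, dim, seed=0):
    """Stand-in for a vision encoder: one [num_tokens][dim] matrix per layer."""
    rng = random.Random(seed)
    return [
        [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(num_tokens)]
        for _ in range(num_layers)
    ]


def concat_multi_depth(hidden_states, layer_indices):
    """Concatenate the chosen layers along the feature axis, token by token."""
    picked = [hidden_states[i] for i in layer_indices]
    num_tokens = len(picked[0])
    return [
        [value for layer in picked for value in layer[t]]
        for t in range(num_tokens)
    ]


def project(features, weight):
    """Linear projection: [tokens][in_dim] times [in_dim][out_dim]."""
    out_dim = len(weight[0])
    return [
        [sum(row[k] * weight[k][j] for k in range(len(row))) for j in range(out_dim)]
        for row in features
    ]


# Toy sizes, all hypothetical.
NUM_LAYERS, NUM_TOKENS, ENC_DIM, LLM_DIM = 12, 4, 8, 16
LAYERS = [3, 7, 11]  # hypothetical pick: a shallow, a middle, and the final layer

states = fake_encoder_hidden_states(NUM_LAYERS, NUM_TOKENS, ENC_DIM)
fused = concat_multi_depth(states, LAYERS)  # [4][24]: same tokens, wider features

rng = random.Random(1)
weight = [
    [rng.uniform(-0.1, 0.1) for _ in range(LLM_DIM)]
    for _ in range(ENC_DIM * len(LAYERS))
]
visual_tokens = project(fused, weight)  # [4][16]: what the LLM would consume

print(len(visual_tokens), len(visual_tokens[0]))  # prints "4 16"
```

The encoder computes its intermediate layers anyway on the way to the final one, so under this reading the only extra work is a wider projector input.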

Key finding: Projector-only training (the standard approach for dense VLMs) fails completely on MoE architectures. The expert routing can't handle visual tokens it's never seen. QLoRA adaptation solves this.
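For anyone less familiar with MoE layers, here is a generic top-k routing sketch in plain Python. It is not taken from GPT-OSS or from this project: the router weights, the sizes, and the choice to apply softmax over only the selected experts are illustrative assumptions. It shows the mechanism the finding is about. With projector-only training the router weights stay frozen, so which experts a visual token reaches is decided entirely by what the projector emits.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(token, router_weight, k):
    """Return [(expert_index, gate_weight)] for the k highest-scoring experts."""
    # One logit per expert: dot product of the token with that expert's row.
    logits = [sum(t * w for t, w in zip(token, row)) for row in router_weight]
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    gates = softmax([logits[i] for i in top])
    return list(zip(top, gates))


# Hypothetical frozen router: 4 experts, 3-dimensional hidden states.
ROUTER = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.5, 0.5, 0.0],
]

token = [2.0, 1.0, 0.0]  # hypothetical projected visual token
choice = route_top_k(token, ROUTER, k=2)
print([i for i, _ in choice])  # prints "[0, 3]"
```

Adapting the language model itself (as the QLoRA run does) changes the hidden states that reach each layer's router, which projector-only training can only influence at the input.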

The setup: Single NVIDIA DGX Spark GB10, hotel room in Dubai, Domino's pizza. No cluster, no team. ~3.5 days of training to this checkpoint.

What's next: Finishing training with new hyperparameters based on what we learned from this run, scaling to GPT-OSS-120B (same projector works - shared hidden dimensions), benchmarking. Need compute to get there.

Model + code + full model card: https://huggingface.co/vincentkaufmann/gpt-oss-20b-vision-pr...

Why Most AI Agent Directories Are Basically Useless

https://www.agentrank.tech/blog/why-most-ai-agent-directories-suck
1•hughmcinnis•47s ago•0 comments

A lightweight Windows tool for surfacing unusual system activity

1•EricAUS•2m ago•0 comments

Show HN: Claude-Nonstop – Auto Account Switching and Slack Remote in Claude Code

https://github.com/rchaz/claude-nonstop
1•rchaz•3m ago•1 comment

AMC Theatres Will Refuse to Screen AI Short Film After Online Uproar

https://www.hollywoodreporter.com/movies/movie-news/ai-short-movie-amc-theaters-1236509143/
1•mikhael•3m ago•0 comments

Dishonest People Self-Select into Public Service (In China) [pdf]

https://bfi.uchicago.edu/wp-content/uploads/2026/01/BFI_WP_2026-17.pdf
1•marojejian•4m ago•1 comment

Gemini Pro 3.1's Sage Take on HN and YC

https://gist.github.com/crisdosaygo/1df53af43874192a516a257f5bf06b93
2•keepamovin•7m ago•1 comment

Content Security Policy (CSP)

https://developer.mozilla.org/en-US/docs/Web/HTTP/Guides/CSP
1•doener•12m ago•0 comments

Show HN: Building an e-commerce MVP in 5 prompts using rdd

https://medium.com/@leechanchai/i-built-an-e-commerce-mvp-in-5-prompts-with-rdd-requirement-drive...
1•cclth•13m ago•1 comment

Fighting the Intel Management Engine with a Cheap System76 Laptop

https://matthewsigmond.com/posts/blog/galago/
1•matthew28845•14m ago•0 comments

OWASP Top Ten Web Application Security Risks

https://owasp.org/www-project-top-ten/
1•doener•16m ago•0 comments

Ask HN: Does treating Inflation as a "Quantization Snap" resolve slow-roll?

1•aplowe•19m ago•0 comments

Grocy "Home Management"

https://github.com/grocy/grocy
2•sourcegrift•20m ago•1 comment

Tracekit: Find what your AI coding agent wastes money on and fix it

https://github.com/0xKoda/tracekit
1•handfuloflight•21m ago•0 comments

Pardoned Binance Founder Hobnobs with Trump Sons, at Mar-a-Lago Crypto Fest

https://www.wsj.com/politics/policy/pardoned-binance-founder-hobnobs-with-trump-sons-administrati...
5•Betelbuddy•22m ago•0 comments

Quod 64kb FPS and deep dive video

https://daivuk.itch.io/quod
1•tonym128•27m ago•2 comments

Stripe closed our account over upsell transaction architecture, not fraud

2•JohannesSchip•29m ago•0 comments

PEP 747 – Annotating Type Forms

https://peps.python.org/pep-0747/
2•azhenley•29m ago•0 comments

Ring's Founder Knows You Hated That Super Bowl Ad

https://www.nytimes.com/2026/02/19/business/ring-super-bowl-ad-privacy.html
1•jhonovich•29m ago•0 comments

Ask HN: Why does it feel like qualifications are irrelevant to hirers?

1•hdhdhsjsbdh•30m ago•2 comments

Use digests, not tags, in your Dockerfiles

https://interrupt.sh/blog/dockerfile-tags/
2•arwt•33m ago•0 comments

Show HN: Mercury CLI – CLI to Connect to Mercury Bank

https://www.npmjs.com/package/mercury-cli
1•ex3ndr•33m ago•0 comments

HHS Releases 6 Years of Medicaid Claims Data ($1T)

https://opendata.hhs.gov/
1•dnw•35m ago•0 comments

Brewing possibilities: Using caffeine to edit gene expression

https://phys.org/news/2026-01-brewing-possibilities-caffeine-gene.html
1•PaulHoule•35m ago•0 comments

Optimism as Resilience and Resistance

https://gathernomoss.substack.com/p/optimism-as-resilience-and-resistance
1•insidiouspaul•37m ago•0 comments

Expanding our long-running agents research preview · Cursor

https://cursor.com/blog/long-running-agents
1•zachdotai•37m ago•0 comments

Piracy Is Only Illegal for You – Nvidia Sued for Alleged Theft in AI Training [video]

https://www.youtube.com/watch?v=Sdry-clMeRs
4•givemeethekeys•38m ago•0 comments

GitLab Threat Intelligence Team Reveals North Korean Tradecraft

https://about.gitlab.com/blog/gitlab-threat-intelligence-reveals-north-korean-tradecraft/
2•tachyons•38m ago•0 comments

A $10K+ bounty is waiting for anyone who can unplug Ring doorbells from Amazon

https://www.theverge.com/tech/881678/ring-doorbell-bounty-amazon-servers-fulu
3•october8140•40m ago•0 comments

Flickr's URLs Scheme

https://unsung.aresluna.org/unsung-heroes-flickrs-urls-scheme/
3•colinprince•43m ago•0 comments

AI Agent Harness for ClickHouse

https://clickhouse.com/blog/ai-powered-migraiton-from-postgres-to-clickhouse-with-fiveonefour
1•chriscrane•43m ago•0 comments