Show HN: DeskSlice – controlling a VS Code agent from my phone

3•frudas24•1mo ago

DeskSlice is a small Go tool that lets you remotely view and control a VS Code AI agent from a mobile browser.

The problem I wanted to solve was very practical: I wanted to comfortably interact with a local VS Code agent (read outputs, scroll, and type prompts) from my phone, without reimplementing the UI or relying on editor internals or private APIs.

Instead of building a full remote desktop, DeskSlice streams only a calibrated slice of the desktop where the agent UI lives, and maps touch gestures back to mouse and keyboard input on the host.
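To make the gesture mapping concrete, here is a minimal sketch of how a tap on the streamed image could be translated back to desktop coordinates. The `Rect` type and `MapTouch` function are hypothetical names for illustration, not taken from the DeskSlice repo, and the sketch assumes the browser reports the touch position normalized to [0,1] within the displayed image and that the calibrated rectangle is non-empty.

```go
package main

// Rect is a calibrated region of the host desktop, in pixels.
// (Hypothetical type for illustration.)
type Rect struct {
	X, Y, W, H int
}

// clamp01 limits v to the range [0, 1].
func clamp01(v float64) float64 {
	if v < 0 {
		return 0
	}
	if v > 1 {
		return 1
	}
	return v
}

// MapTouch converts a touch position, normalized to [0,1] within the
// streamed image, into absolute desktop coordinates inside the slice.
// Out-of-range input is clamped so a stray touch never lands outside
// the calibrated area. Assumes slice.W and slice.H are at least 1.
func MapTouch(slice Rect, nx, ny float64) (x, y int) {
	nx = clamp01(nx)
	ny = clamp01(ny)
	// Scale to the last valid pixel index and round to nearest.
	x = slice.X + int(nx*float64(slice.W-1)+0.5)
	y = slice.Y + int(ny*float64(slice.H-1)+0.5)
	return x, y
}
```

With a slice at (100, 200) sized 400x300, a tap in the center of the image maps to desktop position (300, 350), which could then be handed to whatever input-injection layer the host uses.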

I originally implemented this using WebRTC, but after hitting reliability and complexity issues (signaling, renegotiation, RTP quirks), I pivoted to MJPEG over HTTP. For LAN use, MJPEG turned out to be much simpler, easier to debug, and reliable enough for UI-driven workflows.
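For readers who have not used it, MJPEG over HTTP is just a long-lived `multipart/x-mixed-replace` response where each part is one JPEG. The following is a rough sketch of that idea using only the Go standard library; it is not the DeskSlice implementation. The names `framePart`, `mjpegHandler`, and the `frames` channel are assumptions for illustration, and the sketch serves a single viewer (several clients reading from one channel would each see only some of the frames).

```go
package main

import (
	"fmt"
	"net/http"
)

// boundary separates the JPEG parts in the multipart response.
const boundary = "frame"

// framePart wraps one encoded JPEG as a single multipart part.
func framePart(jpeg []byte) []byte {
	header := fmt.Sprintf(
		"--%s\r\nContent-Type: image/jpeg\r\nContent-Length: %d\r\n\r\n",
		boundary, len(jpeg))
	part := append([]byte(header), jpeg...)
	return append(part, '\r', '\n')
}

// mjpegHandler streams frames from the channel until the client
// disconnects or the channel is closed. The capture side (hypothetical
// here) is expected to push already-encoded JPEG frames.
func mjpegHandler(frames <-chan []byte) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type",
			"multipart/x-mixed-replace; boundary="+boundary)
		w.Header().Set("Cache-Control", "no-store")
		flusher, _ := w.(http.Flusher)
		for {
			select {
			case <-r.Context().Done():
				return // client went away
			case jpeg, ok := <-frames:
				if !ok {
					return // capture stopped
				}
				if _, err := w.Write(framePart(jpeg)); err != nil {
					return
				}
				if flusher != nil {
					flusher.Flush() // push the frame out immediately
				}
			}
		}
	}
}
```

On the phone side an `<img>` tag pointed at the endpoint is enough in most browsers, and a restart of the capture only means the next JPEG looks different. There is no decoder state to resynchronize, which fits the "easier to debug" point above.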

Key ideas:

- Manual fullscreen calibration to select the exact agent panel, input area, and scroll area
- Cropped video stream (not the full desktop)
- Touch-first interaction model (tap, drag-scroll, typing)
- No UI scraping, no state persistence: it operates the real VS Code agent UI
- Simple password gate for LAN use
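As an illustration of the last point, a simple password gate can be a small piece of HTTP middleware. This is only a sketch under assumptions: the post does not say how DeskSlice checks the password, so HTTP Basic auth, the realm string, and the names `passwordMatches` and `requirePassword` are all hypothetical choices made here.

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"net/http"
)

// passwordMatches compares two passwords in constant time. Hashing
// first gives both inputs the same length, so the comparison does not
// leak the length of the expected password.
func passwordMatches(got, want string) bool {
	g := sha256.Sum256([]byte(got))
	w := sha256.Sum256([]byte(want))
	return subtle.ConstantTimeCompare(g[:], w[:]) == 1
}

// requirePassword wraps a handler and rejects requests that do not
// carry the expected password via HTTP Basic auth. The username is
// ignored in this sketch.
func requirePassword(want string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		_, pass, ok := r.BasicAuth()
		if !ok || !passwordMatches(pass, want) {
			w.Header().Set("WWW-Authenticate", `Basic realm="deskslice"`)
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}
```

Note that Basic auth over plain HTTP sends the password unencrypted, which is why a gate like this is only reasonable on a trusted LAN.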

This is intentionally not a general-purpose remote desktop. It’s a focused control surface for interacting with a local AI agent through its existing UI.

Repo: https://github.com/frudas24/deskslice/

Comments

Sean-Der•1mo ago

Would you mind explaining the complexity issues around WebRTC more? Why did you need to do renegotiation? What RTP stuff hit you?

thanks

frudas24•1mo ago

The WebRTC complexity came from our pipeline being ffmpeg → H.264 RTP over UDP → pion/webrtc TrackLocalStaticRTP (instead of a “normal” WebRTC source). Any time we changed the monitor/crop or restarted the capture, the RTP stream effectively reset (SSRC/seq/timestamps and sometimes SPS/PPS cadence), and mobile browsers would stall the decoder and just stay black. We added “restart/renegotiation” because recreating the PeerConnection was the most reliable way we found to recover from those discontinuities.

What we still need to debug to make WebRTC solid:

- Capture-side: full ffmpeg stderr logs + exact args when it goes black.
- RTP ingest: log SSRC/PT/seq gaps and verify SPS/PPS are regularly re-sent (e.g., with every keyframe).
- WebRTC states: log signaling/ICE/connection state transitions to catch races and “remote description not set” timing.
- Confirm whether the black screen is a capture issue vs a decode/packetization issue (capture works via MJPEG, so likely the latter).
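The RTP ingest logging could look roughly like the sketch below. It reads the fixed 12-byte RTP header directly with the standard library instead of a specific RTP package, and the names `rtpInfo`, `parseRTP`, and `continuity` are hypothetical. It only flags SSRC changes and sequence gaps; checking SPS/PPS cadence would additionally require inspecting the H.264 payload, which is not shown.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// rtpInfo holds the fixed-header fields that matter for continuity logs.
type rtpInfo struct {
	PT   uint8  // payload type
	Seq  uint16 // sequence number
	TS   uint32 // media timestamp
	SSRC uint32 // synchronization source
}

// parseRTP extracts the fixed-header fields from a raw RTP packet.
func parseRTP(b []byte) (rtpInfo, error) {
	if len(b) < 12 || b[0]>>6 != 2 {
		return rtpInfo{}, errors.New("not an RTP v2 packet")
	}
	return rtpInfo{
		PT:   b[1] & 0x7f,
		Seq:  binary.BigEndian.Uint16(b[2:4]),
		TS:   binary.BigEndian.Uint32(b[4:8]),
		SSRC: binary.BigEndian.Uint32(b[8:12]),
	}, nil
}

// continuity remembers the previous packet of the stream.
type continuity struct {
	started bool
	last    rtpInfo
}

// observe returns a log-ready note when a packet breaks continuity,
// or "" when it follows the previous packet as expected.
func (c *continuity) observe(p rtpInfo) string {
	note := ""
	switch {
	case !c.started:
		// First packet: nothing to compare against.
	case p.SSRC != c.last.SSRC:
		note = fmt.Sprintf("SSRC changed %#x -> %#x", c.last.SSRC, p.SSRC)
	case p.Seq != c.last.Seq+1: // uint16 arithmetic wraps at 65535
		note = fmt.Sprintf("seq gap: expected %d, got %d", c.last.Seq+1, p.Seq)
	}
	c.started, c.last = true, p
	return note
}
```

Feeding every UDP datagram from ffmpeg through `parseRTP` and `observe` before handing it to the track would show whether a black screen lines up with an SSRC change or a sequence jump.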

Sean-Der•1mo ago

Sorry you hit these issues :(

Instead of restart/renegotiation can you re-timestamp the packets? The example swap-tracks[0] shows a good way to do that. The renegotiation (especially multiple times with no real changes) is gonna be a PITA :)

Also, you should share this in https://pion.ly/discord; other people would love to see it. Super cool project.

[0] https://github.com/pion/webrtc/blob/master/examples/swap-tra...

frudas24•1mo ago

Thanks, that’s a great suggestion :)

You’re right that re-timestamping is the proper way to avoid renegotiation, and the swap-tracks example is exactly the direction to take. In our case, monitor/crop changes usually required restarting ffmpeg, which often reset more than just timestamps (SSRC, sequence continuity, SPS/PPS timing), so renegotiation became the brute-force fallback.

That said, I’m definitely going to try your recommendation and experiment with re-timestamping / track swapping to see how far we can get without renegotiation, especially on mobile browsers. Thanks as well for the Discord link, I’ll share the project there. Appreciate the concrete pointers.

frudas24•1mo ago

fixed :)
