The problem I wanted to solve was very practical: I wanted to comfortably interact with a local VS Code agent (read outputs, scroll, and type prompts) from my phone, without reimplementing the UI or relying on editor internals or private APIs.
Instead of building a full remote desktop, DeskSlice streams only a calibrated slice of the desktop where the agent UI lives, and maps touch gestures back to mouse and keyboard input on the host.
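A minimal sketch of the touch-to-desktop mapping described above, assuming the calibrated slice is stored as a rectangle in host pixels and the phone reports touches in viewport pixels. The names `Rect` and `MapTouch` are hypothetical and used only for illustration; they are not taken from DeskSlice.

```go
package main

import "fmt"

// Rect is a calibrated region in host desktop pixels (hypothetical type).
type Rect struct {
	X, Y, W, H int
}

// clamp limits v to the range [lo, hi].
func clamp(v, lo, hi float64) float64 {
	if v < lo {
		return lo
	}
	if v > hi {
		return hi
	}
	return v
}

// MapTouch converts a touch at (tx, ty) inside a viewW x viewH viewport on
// the phone into absolute desktop coordinates inside the crop rectangle.
// viewW and viewH are assumed to be positive.
func MapTouch(crop Rect, viewW, viewH int, tx, ty float64) (int, int) {
	// Normalize to [0, 1], clamping touches that land slightly outside.
	nx := clamp(tx/float64(viewW), 0, 1)
	ny := clamp(ty/float64(viewH), 0, 1)
	// Scale into the crop (rounding to the nearest pixel) and offset by
	// the crop's origin on the desktop.
	x := crop.X + int(nx*float64(crop.W-1)+0.5)
	y := crop.Y + int(ny*float64(crop.H-1)+0.5)
	return x, y
}

func main() {
	crop := Rect{X: 1200, Y: 100, W: 600, H: 800}
	x, y := MapTouch(crop, 300, 400, 150, 200)
	fmt.Println(x, y) // prints "1500 500"
}
```

Because the mapping works on normalized coordinates, it is independent of how the browser scales the stream on screen, as long as the viewport keeps the crop's aspect ratio.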
I originally implemented this using WebRTC, but after hitting reliability and complexity issues (signaling, renegotiation, RTP quirks), I pivoted to MJPEG over HTTP. For LAN use, MJPEG turned out to be much simpler, easier to debug, and reliable enough for UI-driven workflows.
Key ideas:
- Manual fullscreen calibration to select the exact agent panel, input area, and scroll area
- Cropped video stream (not the full desktop)
- Touch-first interaction model (tap, drag-scroll, typing)
- No UI scraping, no state persistence; it operates the real VS Code agent UI
- Simple password gate for LAN use
This is intentionally not a general-purpose remote desktop. It’s a focused control surface for interacting with a local AI agent through its existing UI.
Sean-Der•1d ago
thanks
frudas24•1d ago
What we still need to debug to make WebRTC solid:
- Capture side: full ffmpeg stderr logs + exact args when it goes black.
- RTP ingest: log SSRC/PT/seq gaps and verify SPS/PPS are regularly re-sent (e.g., with every keyframe).
- WebRTC states: log signaling/ICE/connection state transitions to catch races and “remote description not set” timing.
- Confirm whether the black screen is a capture issue vs a decode/packetization issue (capture works via MJPEG, so likely the latter).
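For the RTP ingest logging, a rough sketch of what the SSRC/PT/seq check could look like, assuming raw RTP packets are read from ffmpeg over UDP. It parses only the fixed 12-byte header defined in RFC 3550; `rtpHeader`, `parseRTPHeader`, and `gapTracker` are hypothetical helpers written for this example.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// rtpHeader holds the fixed RTP header fields relevant for debugging.
type rtpHeader struct {
	PayloadType uint8
	Marker      bool
	Seq         uint16
	Timestamp   uint32
	SSRC        uint32
}

// parseRTPHeader decodes the fixed 12-byte RTP header (RFC 3550).
func parseRTPHeader(b []byte) (rtpHeader, error) {
	if len(b) < 12 {
		return rtpHeader{}, errors.New("short RTP packet")
	}
	if b[0]>>6 != 2 {
		return rtpHeader{}, errors.New("not RTP version 2")
	}
	return rtpHeader{
		PayloadType: b[1] & 0x7f,
		Marker:      b[1]&0x80 != 0,
		Seq:         binary.BigEndian.Uint16(b[2:4]),
		Timestamp:   binary.BigEndian.Uint32(b[4:8]),
		SSRC:        binary.BigEndian.Uint32(b[8:12]),
	}, nil
}

// gapTracker reports missing packets and SSRC changes on one stream.
type gapTracker struct {
	started bool
	ssrc    uint32
	lastSeq uint16
}

// observe returns how many packets were skipped since the previous one and
// whether the SSRC changed (for example after an ffmpeg restart).
func (g *gapTracker) observe(h rtpHeader) (missing int, ssrcChanged bool) {
	if !g.started || h.SSRC != g.ssrc {
		ssrcChanged = g.started
		g.started, g.ssrc, g.lastSeq = true, h.SSRC, h.Seq
		return 0, ssrcChanged
	}
	// uint16 subtraction wraps, so 65535 -> 0 counts as a step of 1.
	delta := h.Seq - g.lastSeq
	if delta == 0 || delta >= 0x8000 {
		return 0, false // duplicate or late packet; keep lastSeq unchanged
	}
	g.lastSeq = h.Seq
	return int(delta) - 1, false
}

func main() {
	var g gapTracker
	for _, h := range []rtpHeader{
		{Seq: 65534, SSRC: 1}, {Seq: 65535, SSRC: 1}, {Seq: 2, SSRC: 1}, {Seq: 0, SSRC: 2},
	} {
		missing, changed := g.observe(h)
		fmt.Printf("ssrc=%d seq=%d missing=%d ssrcChanged=%v\n", h.SSRC, h.Seq, missing, changed)
	}
}
```

Logging one such line per anomaly, next to the ffmpeg stderr output, should make it easier to line up a black screen with either packet loss or a source restart.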
Sean-Der•1d ago
Instead of restart/renegotiation can you re-timestamp the packets? The example swap-tracks[0] shows a good way to do that. The renegotiation (especially multiple times with no real changes) is gonna be a PITA :)
Also, you should share this in https://pion.ly/discord; other people would love to see it. Super cool project.
[0] https://github.com/pion/webrtc/blob/master/examples/swap-tra...
frudas24•1d ago
You’re right that re-timestamping is the proper way to avoid renegotiation, and the swap-tracks example is exactly the direction to take. In our case, monitor/crop changes usually required restarting ffmpeg, which often reset more than just timestamps (SSRC, sequence continuity, SPS/PPS timing), so renegotiation became the brute-force fallback.
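As a sketch of the re-timestamping idea (in the spirit of the swap-tracks example, not a copy of it), the rewriter below keeps one continuous outgoing sequence/timestamp/SSRC line even when the input restarts. It assumes a restarted ffmpeg shows up as a new input SSRC and that advancing by one frame interval on restart is acceptable; `header`, `rewriter`, and `restartStep` are hypothetical names, and payloads are omitted for brevity.

```go
package main

import "fmt"

// header carries only the RTP fields that need rewriting.
type header struct {
	Seq       uint16
	Timestamp uint32
	SSRC      uint32
}

// rewriter maps packets from a restartable source onto one continuous
// outgoing stream, so the receiver never sees a discontinuity.
type rewriter struct {
	outSSRC     uint32 // fixed SSRC presented to the receiver
	restartStep uint32 // timestamp ticks to advance when the source restarts

	started   bool
	inSSRC    uint32
	lastInSeq uint16
	lastInTS  uint32
	outSeq    uint16
	outTS     uint32
}

// rewrite returns the header to send for the incoming header p.
func (r *rewriter) rewrite(p header) header {
	switch {
	case !r.started:
		// First packet: start the outgoing stream at its initial values.
		r.started = true
	case p.SSRC != r.inSSRC:
		// Source restarted: continue numbering and jump one frame interval.
		r.outSeq++
		r.outTS += r.restartStep
	default:
		// Same source: carry over the deltas. Wrapping arithmetic keeps
		// real gaps visible, so packet loss is still detectable downstream.
		r.outSeq += p.Seq - r.lastInSeq
		r.outTS += p.Timestamp - r.lastInTS
	}
	r.inSSRC, r.lastInSeq, r.lastInTS = p.SSRC, p.Seq, p.Timestamp
	return header{Seq: r.outSeq, Timestamp: r.outTS, SSRC: r.outSSRC}
}

func main() {
	// 3000 ticks is one frame at 30 fps on a 90 kHz video clock.
	r := rewriter{outSSRC: 0x1234, restartStep: 3000}
	in := []header{
		{Seq: 100, Timestamp: 9000, SSRC: 1},
		{Seq: 101, Timestamp: 12000, SSRC: 1},
		{Seq: 7, Timestamp: 500, SSRC: 2}, // ffmpeg restarted
		{Seq: 8, Timestamp: 3500, SSRC: 2},
	}
	for _, p := range in {
		out := r.rewrite(p)
		fmt.Printf("in seq=%d ts=%d -> out seq=%d ts=%d\n", p.Seq, p.Timestamp, out.Seq, out.Timestamp)
	}
}
```

This only smooths over the header fields. After a restart the decoder would still need a fresh keyframe with SPS/PPS before it can show a picture, so the capture side has to emit one promptly.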
That said, I’m definitely going to try your recommendation and experiment with re-timestamping / track swapping to see how far we can get without renegotiation, especially on mobile browsers. Thanks as well for the Discord link, I’ll share the project there. Appreciate the concrete pointers.
frudas24•18h ago