gallia-screenshot
name: gallia-screenshot
description: How to capture screenshots at three levels — gvShot (GV panel), tabShot (full browser tab), deskShot (full desktop). Enables AI visual feedback loops for self-verification and iterative correction.
Screenshot & Visual Feedback
Three capture layers let the AI see what it produces and self-correct without asking the user to screenshot manually. This is the core enabler for autonomous visual feedback loops.
Quick Reference
| Need to see... | Layer | Tool | Setup |
|---|---|---|---|
| Rendered content (3D, SVG, widget) | gvShot | gv_capture | None |
| Toolbar, buttons, multi-pane layouts, nested iframes | tabShot | gv_tab_capture | One-time user permission |
| KiCad, Fusion 360, desktop apps | deskShot | desktop_screenshot_screen/window | Conduit app running |
Visual Feedback Loop (CRITICAL)
This is the pattern that makes AI autonomous for visual work:
1. Make change (edit HTML, generate SVG, load 3D model, add toolbar button)
2. Push to viewer (gv_display, gv_3d_display, restart server + reload)
3. Capture screenshot (choose the right layer — see table above)
4. Analyze the screenshot — does it look right?
5. If not → identify what's wrong → fix → go to step 1
Always capture after visual changes. Don't commit UI changes without verifying them visually first.
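The five steps above can be sketched as a bounded retry loop. This is a minimal illustration, not Gallia code: the four callbacks (applyChange, pushToViewer, capture, analyze) are hypothetical and supplied by the caller, and the maxAttempts cap is an added assumption so the loop cannot run forever.

```javascript
// Sketch of the capture-analyze-fix loop. All four callbacks are hypothetical
// and supplied by the caller; none of them is a real Gallia API.
async function visualFeedbackLoop({ applyChange, pushToViewer, capture, analyze, maxAttempts = 5 }) {
  let feedback = null; // what the previous screenshot got wrong, if anything
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    await applyChange(feedback);               // step 1: make (or fix) the change
    await pushToViewer();                      // step 2: push to the viewer
    const screenshot = await capture();        // step 3: gvShot, tabShot, or deskShot
    const verdict = await analyze(screenshot); // step 4: does it look right?
    if (verdict.ok) return { ok: true, attempts: attempt };
    feedback = verdict.problem;                // step 5: carry the problem into the next fix
  }
  return { ok: false, attempts: maxAttempts, lastProblem: feedback };
}
```

Passing the previous problem back into applyChange keeps each iteration focused on the defect the last screenshot revealed.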
When to use which layer
- Added a toolbar button → tabShot (gvShot can't see HTML overlays)
- Loaded a 3D model → gvShot (fast, no setup, captures the canvas)
- Changed a symbol SVG → gvShot (captures rendered SVG content)
- Updated a multi-pane layout (LibView, split-pane) → tabShot (crosses iframe boundaries)
- Sent a file to KiCad → deskShot with desktop_screenshot_window (captures desktop app)
- Need to see the full IDE layout → tabShot (captures entire browser tab)
- Need to see multiple desktop windows → deskShot with desktop_screenshot_screen
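The rules above amount to a lookup from the kind of change to a capture tool. A small sketch follows; the tool names come from this document, while the change-kind labels (such as 'toolbar-button') are illustrative assumptions.

```javascript
// Maps the kind of change just made to the capture layer suggested above.
// The key strings are illustrative labels, not part of any Gallia API.
const LAYER_FOR_CHANGE = new Map([
  ['toolbar-button', { layer: 'tabShot', tool: 'gv_tab_capture' }],
  ['3d-model', { layer: 'gvShot', tool: 'gv_capture' }],
  ['symbol-svg', { layer: 'gvShot', tool: 'gv_capture' }],
  ['multi-pane-layout', { layer: 'tabShot', tool: 'gv_tab_capture' }],
  ['kicad-file', { layer: 'deskShot', tool: 'desktop_screenshot_window' }],
  ['ide-layout', { layer: 'tabShot', tool: 'gv_tab_capture' }],
  ['multiple-desktop-windows', { layer: 'deskShot', tool: 'desktop_screenshot_screen' }],
]);

function chooseLayer(kind) {
  const choice = LAYER_FOR_CHANGE.get(kind);
  if (!choice) throw new Error(`Unknown change kind: ${kind}`);
  return choice;
}
```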
Layer 1: gvShot (gv_capture)
Scope: GV panel canvas content only (the rendered output inside iframes)
gv_capture() → PNG of whatever the GV panel is showing
- MCP tool: gv_capture (no parameters)
- Saves to: project-content/screenshots/gv/mgmt-screenshot-{timestamp}.png
- Setup: None; works automatically when a viewer is connected
- Timeout: 10 seconds
How it works
MCP tool → mgmt relay (port 8772) → WebSocket broadcast to browser
→ browser tries 3 capture strategies in order:
1. postMessage to iframe → iframe serializes its own content → parent renders PNG
2. Direct SVG serialization (for SVG in main content area)
3. Direct <img> capture (fallback for image elements)
→ sends PNG back via WebSocket → saved to disk → returned to MCP
The key technique is cooperative iframe capture: the parent asks the iframe to export its content via postMessage, and the iframe voluntarily serializes its own canvas/SVG. This avoids cross-origin tainted canvas errors.
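The "try 3 capture strategies in order" step can be pictured as a simple fallback chain. The sketch below is an assumption about the shape of that logic, not the code in index.html: the strategy functions are hypothetical stand-ins, and they are synchronous here for brevity even though the real postMessage round trip is asynchronous.

```javascript
// Tries each capture strategy in order and returns the first usable result.
// Strategy functions are hypothetical stand-ins for the three browser-side steps.
function captureWithFallback(strategies) {
  const failures = [];
  for (const { name, run } of strategies) {
    try {
      const png = run();
      if (png) return { strategy: name, png };
      failures.push(`${name}: nothing to capture`);
    } catch (err) {
      failures.push(`${name}: ${err.message}`);
    }
  }
  // Report every failure so the caller can see why each strategy was skipped.
  throw new Error(`All capture strategies failed (${failures.join('; ')})`);
}
```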
What it CAN capture
- 3D model renders (Babylon.js canvas via canvas.toDataURL())
- Symbol/footprint SVGs (serialized to XML, re-rendered at 8x scale)
- Image content displayed in the viewer
- Any iframe that implements the mgmt_capture_request protocol
What it CANNOT capture
- HTML overlays (toolbar buttons, tooltips, info bars) — these are DOM elements above the canvas
- Nested iframes (LibView 3-pane layout) — cross-iframe postMessage doesn't chain
- Content that hasn't implemented the capture protocol
Layer 2: tabShot (gv_tab_capture)
Scope: Full browser tab — IDE, editor, terminal, GV panel, overlays, nested iframes, everything visible in the tab
gv_tab_capture() → PNG of the entire browser tab
- MCP tool: gv_tab_capture (no parameters)
- Saves to: project-content/screenshots/gv/tab-capture-{timestamp}.png
- Timeout: 10 seconds
How it works
A companion tab holds a persistent getDisplayMedia video stream of the shared browser tab. When gv_tab_capture is called, the server asks the companion tab to grab a single frame from the video stream.
MCP tool → main viewer API (port 8771) action: 'tab_capture'
→ server sends capture_request to companion tab via WebSocket
→ companion tab grabs frame: canvas.drawImage(video, 0, 0) → toDataURL()
→ sends frame back → server saves to disk → returns to MCP
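The frame-grab step in the flow above can be sketched as follows. This is an illustration under assumptions rather than the contents of capture.html: the function name grabFrame is hypothetical, and the video and canvas are passed in so the logic can be exercised with stand-in objects outside a browser.

```javascript
// Copies the current frame of a playing <video> onto a canvas and returns it
// as a PNG data URL.
function grabFrame(video, canvas) {
  if (!video.videoWidth || !video.videoHeight) {
    throw new Error('Video stream has no frames yet');
  }
  // Match the canvas to the stream's native size so the frame is not scaled.
  canvas.width = video.videoWidth;
  canvas.height = video.videoHeight;
  const ctx = canvas.getContext('2d');
  ctx.drawImage(video, 0, 0, canvas.width, canvas.height);
  return canvas.toDataURL('image/png');
}

// Browser wiring (assumed, not taken from capture.html):
//   const stream = await navigator.mediaDevices.getDisplayMedia({ video: true });
//   const video = document.createElement('video');
//   video.srcObject = stream;
//   await video.play();
//   const dataUrl = grabFrame(video, document.createElement('canvas'));
```

Because getDisplayMedia requires a user gesture and a permission prompt, the stream is opened once and reused; only grabFrame runs per capture request.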
Setup (one-time per session)
If gv_tab_capture returns an error about "no capture tab" or "not sharing", prompt the user:
- Click the camera button in GV (or press Alt+S) — opens the capture companion tab
- Click "Share This Tab" in the popup
- Select the IDE tab in the browser's sharing dialog
- Done — the stream persists even across GV reloads
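A caller can turn that error check into a reusable helper. The sketch below assumes the error arrives as a string containing one of the two phrases quoted above; the function name and the exact matching rule are hypothetical.

```javascript
// Returns the setup steps to show the user when a tabShot error indicates that
// no tab is being shared, or null for unrelated errors. Matching on these two
// phrases is an assumption based on the wording quoted above.
function tabShotSetupPrompt(errorMessage) {
  const text = String(errorMessage || '').toLowerCase();
  if (!text.includes('no capture tab') && !text.includes('not sharing')) return null;
  return [
    'Click the camera button in GV (or press Alt+S).',
    'Click "Share This Tab" in the popup.',
    "Select the IDE tab in the browser's sharing dialog.",
  ];
}
```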
Key properties
- NOT affected by foreground window changes — captures the shared tab regardless of which window is in front
- Captures everything in the tab — toolbar icons, button active/inactive states, multi-pane layouts, nested iframes
- Near-zero CPU cost when idle — the video stream only decodes frames on demand
- Survives GV restarts — the companion tab connects to the mgmt relay (port 8772), which is independent of the main GV server
When to prefer tabShot over gvShot
- You need to verify toolbar button icons, active states, or hover effects
- You're working with multi-pane layouts (LibView, split-pane views)
- gvShot returned a blank or incorrect capture (some content types aren't capturable by gvShot)
- You need to see the full context of what the user sees
Layer 3: deskShot (desktop_screenshot_screen / desktop_screenshot_window)
Scope: Full desktop or specific application window — native OS-level capture via Conduit
Tools
| Tool | What it captures | Parameters |
|---|---|---|
| desktop_list_windows | List all visible windows | None → returns [{ hwnd, title, className, rect }] |
| desktop_screenshot_window | Specific window by handle | hwnd (required), savePath (optional) |
| desktop_screenshot_screen | Entire desktop (all monitors) | savePath (optional) |
How it works
MCP tool → Conduit relay (port 8766) → WebSocket → Desktop Conduit app
→ native OS capture API (Windows: BitBlt, Mac: CoreGraphics)
→ base64 PNG → returned to MCP
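Since the capture comes back as base64 text, it can be worth confirming that the payload really is a PNG before analyzing or saving it. The following Node.js sketch checks the standard 8-byte PNG signature; the function name decodePng is hypothetical, and accepting an optional data-URL prefix is an added convenience rather than something the tools are known to return.

```javascript
// The fixed 8-byte signature that starts every PNG file.
const PNG_SIGNATURE = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);

// Decodes a base64 string (or a data:image/png;base64,... URL) into bytes and
// throws if the result does not start with the PNG signature.
function decodePng(base64OrDataUrl) {
  const base64 = base64OrDataUrl.replace(/^data:image\/png;base64,/, '');
  const bytes = Buffer.from(base64, 'base64');
  const header = bytes.subarray(0, PNG_SIGNATURE.length);
  if (!header.equals(PNG_SIGNATURE)) {
    throw new Error('Payload is not a PNG');
  }
  return bytes;
}
```

The returned Buffer can be written to disk with fs.writeFileSync if a local copy is needed.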
Setup
- Adom Desktop Conduit app must be running on the user's machine
- Desktop client must be connected (check with conduit_status)
Key properties
- desktop_screenshot_screen IS affected by the foreground window: it captures whatever is on top
- desktop_screenshot_window captures a specific window by HWND; use desktop_list_windows first to find the handle
- 15-second timeout per capture
- Cross-application — can see KiCad, Fusion 360, browser, terminal, anything on the desktop
Typical workflow: verify KiCad delivery
1. Send file to KiCad: kicad_open_board({ path: '/tmp/board.kicad_pcb' })
2. List windows: desktop_list_windows() → find KiCad's HWND
3. Capture: desktop_screenshot_window({ hwnd: 12345 })
4. Analyze → verify the board loaded correctly
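Step 2 of the workflow above, finding KiCad's HWND, is a filter over the desktop_list_windows result. A minimal sketch, assuming the [{ hwnd, title, className, rect }] shape listed in the Tools table; the helper name and the case-insensitive substring match are illustrative choices.

```javascript
// Picks the first window whose title contains the given text (case-insensitive).
// `windows` is assumed to have the shape returned by desktop_list_windows.
function findWindowByTitle(windows, titlePart) {
  const needle = titlePart.toLowerCase();
  const match = windows.find((w) => (w.title || '').toLowerCase().includes(needle));
  if (!match) throw new Error(`No window with "${titlePart}" in its title`);
  return match;
}
```

The hwnd of the returned entry is what desktop_screenshot_window expects. Throwing when nothing matches makes a failed KiCad launch visible before any screenshot is attempted.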
Adding Capture Support to a New Iframe Widget
When creating a new iframe-based viewer widget that should be capturable by gv_capture (Layer 1):
- Add a message event listener for mgmt_capture_request:
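window.addEventListener('message', (e) => {
  // Messages may arrive as JSON strings or as structured objects.
  let msg = e.data;
  if (typeof msg === 'string') {
    try { msg = JSON.parse(msg); } catch { return; } // ignore non-JSON strings
  }
  if (msg && msg.type === 'mgmt_capture_request') {
    // Capture your content...
  }
});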
- For SVG content: Clone, inline dynamic styles, serialize to XML:
const xml = new XMLSerializer().serializeToString(svgClone);
window.parent.postMessage({
  type: 'mgmt_canvas_capture',
  _reqId: msg._reqId,
  svgXml: xml,
  svgWidth: width,
  svgHeight: height,
  bgColor: '#0d1117'
}, '*');
- For canvas content (WebGL, 2D): call toDataURL() directly:
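// Render immediately before reading so the WebGL drawing buffer is populated.
scene.render();
const dataUrl = canvas.toDataURL('image/png');
window.parent.postMessage({
  type: 'mgmt_canvas_capture',
  _reqId: msg._reqId, // echo the request ID so the parent can match the reply
  data: dataUrl
}, '*');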
The parent receives the data and saves it — the iframe never needs to worry about cross-origin restrictions because it's exporting its own content cooperatively.
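On the parent side, replies have to be matched to the request that triggered them, which is what the echoed _reqId is for. The sketch below shows one way to do that with a pending-request map and a timeout. It is an assumption about the approach, not the code in index.html: createCaptureRequests, the post callback, and the numeric ID scheme are all hypothetical, and the 10000 ms default mirrors the gvShot timeout stated earlier.

```javascript
// Tracks in-flight capture requests so replies can be matched by _reqId.
// `post` is a caller-supplied function that delivers a message to the iframe
// (for example, a wrapper around iframe.contentWindow.postMessage).
function createCaptureRequests(post, timeoutMs = 10000) {
  const pending = new Map();
  let nextId = 1;

  function request() {
    const _reqId = nextId++;
    return new Promise((resolve, reject) => {
      const timer = setTimeout(() => {
        pending.delete(_reqId);
        reject(new Error(`Capture ${_reqId} timed out`));
      }, timeoutMs);
      pending.set(_reqId, { resolve, timer });
      post({ type: 'mgmt_capture_request', _reqId });
    });
  }

  // Call this from the parent's 'message' listener. Returns true if the
  // message answered a pending request.
  function handleReply(msg) {
    if (!msg || msg.type !== 'mgmt_canvas_capture') return false;
    const entry = pending.get(msg._reqId);
    if (!entry) return false;
    clearTimeout(entry.timer);
    pending.delete(msg._reqId);
    entry.resolve(msg);
    return true;
  }

  return { request, handleReply };
}
```

Replies with an unknown or already-settled _reqId are ignored, so a late answer from a slow iframe cannot resolve a newer request.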
Key Files
| File | Role |
|---|---|
| ~/gallia/viewer/mcp/server.js | MCP tool definitions (gv_capture, gv_tab_capture) |
| ~/gallia/viewer/mgmt-server.js | Management relay (port 8772); routes gvShot requests |
| ~/gallia/viewer/server.js | Main viewer server; handles tabShot API |
| ~/gallia/viewer/viewer/index.html | Browser-side gvShot orchestration |
| ~/gallia/viewer/viewer/capture.html | Screen Capture API companion tab for tabShot |
| ~/gallia/server/mcp/server.js | Conduit MCP tools (desktop_screenshot_*) |