
Not installable via adompkg

This skill has no published release. adompkg install kyle/gallia-screenshot will not work until a maintainer publishes a tarball with install.sh and uninstall.sh.

See the publishing docs for the package.json schema and tarball layout required to ship this skill.


name: gallia-screenshot
description: How to capture screenshots at three levels — gvShot (GV panel), tabShot (full browser tab), deskShot (full desktop). Enables AI visual feedback loops for self-verification and iterative correction.

Screenshot & Visual Feedback

Three capture layers let the AI see what it produces and self-correct without asking the user to screenshot manually. This is the core enabler for autonomous visual feedback loops.

Quick Reference

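Need to see...                                       | Layer    | Tool                             | Setup
-----------------------------------------------------|----------|----------------------------------|-------------------------
Rendered content (3D, SVG, widget)                   | gvShot   | gv_capture                       | None
Toolbar, buttons, multi-pane layouts, nested iframes | tabShot  | gv_tab_capture                   | One-time user permission
KiCad, Fusion 360, desktop apps                      | deskShot | desktop_screenshot_screen/window | Conduit app running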

Visual Feedback Loop (CRITICAL)

This is the pattern that makes AI autonomous for visual work:

1. Make change (edit HTML, generate SVG, load 3D model, add toolbar button)
2. Push to viewer (gv_display, gv_3d_display, restart server + reload)
3. Capture screenshot (choose the right layer — see table above)
4. Analyze the screenshot — does it look right?
5. If not → identify what's wrong → fix → go to step 1

Always capture after visual changes. Don't commit UI changes without verifying them visually first.
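
The loop above can be sketched as a small driver function. This is an illustration only, not part of the skill: makeChange, pushToViewer, capture, and analyze are hypothetical callbacks supplied by the caller (standing in for the edit step, a gv_display-style push, one of the capture tools, and the inspection of the image), and maxAttempts is an assumed safety limit that the skill itself does not specify.

```javascript
// Hypothetical driver for the change -> push -> capture -> analyze cycle.
// None of the callbacks is a real GV or MCP API; the caller provides them.
async function visualFeedbackLoop({ makeChange, pushToViewer, capture, analyze, maxAttempts = 5 }) {
  let problem = null; // what the last analysis found wrong, if anything
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    await makeChange(problem);                 // step 1: apply (or fix) the change
    await pushToViewer();                      // step 2: push to the viewer
    const screenshot = await capture();        // step 3: capture with the chosen layer
    const result = await analyze(screenshot);  // step 4: does it look right?
    if (result.ok) return { ok: true, attempts: attempt };
    problem = result.problem;                  // step 5: feed the problem into the next pass
  }
  return { ok: false, attempts: maxAttempts, lastProblem: problem };
}
```

Capping the number of attempts keeps the loop from running indefinitely when the analysis never passes; the caller can then report lastProblem to the user.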

When to use which layer

  • Added a toolbar button → tabShot (gvShot can't see HTML overlays)
  • Loaded a 3D model → gvShot (fast, no setup, captures the canvas)
  • Changed a symbol SVG → gvShot (captures rendered SVG content)
  • Updated a multi-pane layout (LibView, split-pane) → tabShot (crosses iframe boundaries)
  • Sent a file to KiCad → deskShot with desktop_screenshot_window (captures desktop app)
  • Need to see the full IDE layout → tabShot (captures entire browser tab)
  • Need to see multiple desktop windows → deskShot with desktop_screenshot_screen
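
One way to encode the choices above is a lookup table. The sketch below is an assumption-laden illustration: the target labels (such as 'toolbar-button') are hypothetical names invented here, while the layer and tool names come from the list above.

```javascript
// Maps "what changed" to the capture layer and tool recommended above.
// The keys are hypothetical labels made up for this sketch.
const LAYER_BY_TARGET = new Map([
  ['toolbar-button',           { layer: 'tabShot',  tool: 'gv_tab_capture' }],
  ['3d-model',                 { layer: 'gvShot',   tool: 'gv_capture' }],
  ['symbol-svg',               { layer: 'gvShot',   tool: 'gv_capture' }],
  ['multi-pane-layout',        { layer: 'tabShot',  tool: 'gv_tab_capture' }],
  ['desktop-app-window',       { layer: 'deskShot', tool: 'desktop_screenshot_window' }],
  ['full-ide-layout',          { layer: 'tabShot',  tool: 'gv_tab_capture' }],
  ['multiple-desktop-windows', { layer: 'deskShot', tool: 'desktop_screenshot_screen' }],
]);

function chooseCaptureLayer(target) {
  const choice = LAYER_BY_TARGET.get(target);
  if (!choice) throw new Error(`Unknown capture target: ${target}`);
  return choice;
}
```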

Layer 1: gvShot (gv_capture)

Scope: GV panel canvas content only (the rendered output inside iframes)

gv_capture()  →  PNG of whatever the GV panel is showing
  • MCP tool: gv_capture — no parameters
  • Saves to: project-content/screenshots/gv/mgmt-screenshot-{timestamp}.png
  • Setup: None — works automatically when a viewer is connected
  • Timeout: 10 seconds

How it works

MCP tool → mgmt relay (port 8772) → WebSocket broadcast to browser
→ browser tries 3 capture strategies in order:
  1. postMessage to iframe → iframe serializes its own content → parent renders PNG
  2. Direct SVG serialization (for SVG in main content area)
  3. Direct <img> capture (fallback for image elements)
→ sends PNG back via WebSocket → saved to disk → returned to MCP

The key technique is cooperative iframe capture: the parent asks the iframe to export its content via postMessage, and the iframe voluntarily serializes its own canvas/SVG. This avoids cross-origin tainted canvas errors.
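
The "try three strategies in order" step can be pictured as a simple fallback chain. This is a minimal sketch under stated assumptions, not the code in viewer/index.html: each strategy is modeled as a hypothetical async function that resolves to a PNG data URL, or to null when it does not apply.

```javascript
// Runs capture strategies in order and returns the first PNG produced.
// `strategies` is an array of { name, run }, where run() is a hypothetical
// async function resolving to a PNG data URL or null.
async function captureWithFallback(strategies) {
  for (const { name, run } of strategies) {
    try {
      const png = await run();
      if (png) return { strategy: name, png };
    } catch (err) {
      // A failing strategy should not block the ones after it.
    }
  }
  return null; // nothing was capturable
}
```

In this model the iframe postMessage strategy would be listed first, followed by direct SVG serialization and then direct image capture, mirroring the order described above.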

What it CAN capture

  • 3D model renders (Babylon.js canvas via canvas.toDataURL())
  • Symbol/footprint SVGs (serialized to XML, re-rendered at 8x scale)
  • Image content displayed in the viewer
  • Any iframe that implements the mgmt_capture_request protocol

What it CANNOT capture

  • HTML overlays (toolbar buttons, tooltips, info bars) — these are DOM elements above the canvas
  • Nested iframes (LibView 3-pane layout): the postMessage capture request is not forwarded from an iframe to the iframes nested inside it
  • Content that hasn't implemented the capture protocol

Layer 2: tabShot (gv_tab_capture)

Scope: Full browser tab — IDE, editor, terminal, GV panel, overlays, nested iframes, everything visible in the tab

gv_tab_capture()  →  PNG of the entire browser tab
  • MCP tool: gv_tab_capture — no parameters
  • Saves to: project-content/screenshots/gv/tab-capture-{timestamp}.png
  • Timeout: 10 seconds

How it works

A companion tab holds a persistent getDisplayMedia video stream of the shared browser tab. When gv_tab_capture is called, the server asks the companion tab to grab a single frame from the video stream.

MCP tool → main viewer API (port 8771) action: 'tab_capture'
→ server sends capture_request to companion tab via WebSocket
→ companion tab grabs frame: canvas.drawImage(video, 0, 0) → toDataURL()
→ sends frame back → server saves to disk → returns to MCP
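
The frame-grab step can be sketched as follows. This is not the code in capture.html; it assumes the companion tab already has a <video> element playing the stream obtained from navigator.mediaDevices.getDisplayMedia({ video: true }). The doc parameter (normally document) is a hypothetical injection point so the function can also run against stand-in objects outside a browser.

```javascript
// Draws the current video frame onto a canvas and returns it as a PNG data URL.
// `video` is the <video> element playing the shared-tab stream.
// `doc` is the document used to create the canvas (normally `document`).
function grabFrame(video, doc) {
  const canvas = doc.createElement('canvas');
  canvas.width = video.videoWidth;   // native resolution of the stream
  canvas.height = video.videoHeight;
  canvas.getContext('2d').drawImage(video, 0, 0);
  return canvas.toDataURL('image/png');
}
```

Sizing the canvas from videoWidth and videoHeight captures the frame at the stream's native resolution rather than at the element's on-page size.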

Setup (one-time per session)

If gv_tab_capture returns an error about "no capture tab" or "not sharing", prompt the user:

  1. Click the camera button in GV (or press Alt+S) — opens the capture companion tab
  2. Click "Share This Tab" in the popup
  3. Select the IDE tab in the browser's sharing dialog
  4. Done — the stream persists even across GV reloads
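
A caller might detect the setup case and produce the prompt like this. The sketch is hypothetical: only the two quoted phrases come from the text above, and the exact wording of the real error message is an assumption.

```javascript
// Phrases that, per the text above, indicate the tab is not being shared yet.
const SETUP_HINTS = ['no capture tab', 'not sharing'];

// Returns true when a gv_tab_capture error looks like a missing-setup error.
function needsTabShareSetup(errorMessage) {
  const text = String(errorMessage || '').toLowerCase();
  return SETUP_HINTS.some((hint) => text.includes(hint));
}

// Instructions to show the user, taken from the setup steps above.
const SETUP_PROMPT = [
  '1. Click the camera button in GV (or press Alt+S) to open the capture companion tab.',
  '2. Click "Share This Tab" in the popup.',
  "3. Select the IDE tab in the browser's sharing dialog.",
].join('\n');
```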

Key properties

  • NOT affected by foreground window changes — captures the shared tab regardless of which window is in front
  • Captures everything in the tab — toolbar icons, button active/inactive states, multi-pane layouts, nested iframes
  • Near-zero CPU cost when idle — the video stream only decodes frames on demand
  • Survives GV restarts — the companion tab connects to the mgmt relay (port 8772), which is independent of the main GV server

When to prefer tabShot over gvShot

  • You need to verify toolbar button icons, active states, or hover effects
  • You're working with multi-pane layouts (LibView, split-pane views)
  • gvShot returned a blank or incorrect capture (some content types aren't capturable by gvShot)
  • You need to see the full context of what the user sees

Layer 3: deskShot (desktop_screenshot_screen / desktop_screenshot_window)

Scope: Full desktop or specific application window — native OS-level capture via Conduit

Tools

Tool                      | What it captures              | Parameters
--------------------------|-------------------------------|--------------------------------------------------
desktop_list_windows      | List all visible windows      | None → returns [{ hwnd, title, className, rect }]
desktop_screenshot_window | Specific window by handle     | hwnd (required), savePath (optional)
desktop_screenshot_screen | Entire desktop (all monitors) | savePath (optional)

How it works

MCP tool → Conduit relay (port 8766) → WebSocket → Desktop Conduit app
→ native OS capture API (Windows: BitBlt, Mac: CoreGraphics)
→ base64 PNG → returned to MCP
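
Since the capture comes back as a base64 PNG, a consumer may want to decode and sanity-check it before use. The following Node.js sketch is an assumption about how one could do that; whether the payload is raw base64 or a data URL is not stated above, so both are accepted.

```javascript
// The 8-byte signature that starts every PNG file.
const PNG_SIGNATURE = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);

// Decodes a base64 PNG payload into bytes and verifies the PNG signature.
// Accepts raw base64 or a "data:image/png;base64,..." URL (an assumption).
function decodePngPayload(payload) {
  const base64 = payload.replace(/^data:image\/png;base64,/, '');
  const bytes = Buffer.from(base64, 'base64');
  if (bytes.length < 8 || !bytes.subarray(0, 8).equals(PNG_SIGNATURE)) {
    throw new Error('Payload is not a PNG image');
  }
  return bytes;
}
```

The returned Buffer can then be written to disk with fs.writeFileSync if a local copy is needed.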

Setup

  • Adom Desktop Conduit app must be running on the user's machine
  • Desktop client must be connected (check with conduit_status)

Key properties

  • desktop_screenshot_screen IS affected by foreground window — captures whatever is on top
  • desktop_screenshot_window captures a specific window by HWND — use desktop_list_windows first to find the handle
  • 15-second timeout per capture
  • Cross-application — can see KiCad, Fusion 360, browser, terminal, anything on the desktop

Typical workflow: verify KiCad delivery

1. Send file to KiCad: kicad_open_board({ path: '/tmp/board.kicad_pcb' })
2. List windows: desktop_list_windows() → find KiCad's HWND
3. Capture: desktop_screenshot_window({ hwnd: 12345 })
4. Analyze → verify the board loaded correctly
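
Step 2 of the workflow (finding KiCad's HWND) amounts to filtering the window list by title. A minimal sketch, assuming the list has the { hwnd, title, className, rect } shape shown in the tools table; the helper name findWindowByTitle and the sample titles used to exercise it are hypothetical.

```javascript
// Returns the first window whose title matches `titlePattern`, or null.
// `windows` is the array returned by desktop_list_windows.
// Pass a RegExp without the `g` flag, since `test` is stateful with it.
function findWindowByTitle(windows, titlePattern) {
  const match = windows.find((w) => titlePattern.test(w.title || ''));
  return match || null;
}
```

The hwnd of the returned entry is what desktop_screenshot_window expects; a null result means the application window was not found and the capture should not be attempted.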

Adding Capture Support to a New Iframe Widget

When creating a new iframe-based viewer widget that should be capturable by gv_capture (Layer 1):

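  1. Add a message event listener for mgmt_capture_request:
window.addEventListener('message', (e) => {
  // The request may arrive as a JSON string or as a structured object.
  let msg = e.data;
  if (typeof msg === 'string') {
    try { msg = JSON.parse(msg); } catch { return; } // ignore non-JSON messages
  }
  if (msg && msg.type === 'mgmt_capture_request') {
    // Capture your content...
  }
});
  2. For SVG content: clone the SVG, inline dynamic styles, and serialize to XML:
// svgClone, width, and height come from your widget; msg is the request received in step 1.
const xml = new XMLSerializer().serializeToString(svgClone);
window.parent.postMessage({
  type: 'mgmt_canvas_capture',
  _reqId: msg._reqId, // echo the request ID so the parent can match the reply
  svgXml: xml,
  svgWidth: width,
  svgHeight: height,
  bgColor: '#0d1117'
}, '*');
  3. For canvas content (WebGL, 2D): render a frame, then call toDataURL() directly:
scene.render(); // render immediately before reading so the WebGL drawing buffer is not empty
const dataUrl = canvas.toDataURL('image/png');
window.parent.postMessage({
  type: 'mgmt_canvas_capture',
  _reqId: msg._reqId,
  data: dataUrl
}, '*');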

The parent receives the data and saves it — the iframe never needs to worry about cross-origin restrictions because it's exporting its own content cooperatively.
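
For context, the parent side of this exchange could look roughly like the sketch below. It is not the orchestration code in viewer/index.html: createCaptureRequester, the send callback, and the numeric ID scheme are hypothetical, and the 10-second default simply mirrors the gv_capture timeout. Only the message types and field names are taken from the protocol above.

```javascript
// Tracks outstanding capture requests by _reqId and settles them on reply.
// `send` is a hypothetical function that delivers a message to the iframe,
// for example (m) => iframe.contentWindow.postMessage(m, '*').
function createCaptureRequester(send, timeoutMs = 10000) {
  const pending = new Map();
  let nextId = 1;

  function request() {
    const _reqId = nextId++;
    return new Promise((resolve, reject) => {
      const timer = setTimeout(() => {
        pending.delete(_reqId);
        reject(new Error(`Capture ${_reqId} timed out`));
      }, timeoutMs);
      pending.set(_reqId, { resolve, timer });
      send({ type: 'mgmt_capture_request', _reqId });
    });
  }

  // Call from the parent's 'message' listener with the parsed message.
  // Returns true when the message settled a pending request.
  function handleReply(msg) {
    if (!msg || msg.type !== 'mgmt_canvas_capture') return false;
    const entry = pending.get(msg._reqId);
    if (!entry) return false; // unknown or already timed-out request
    clearTimeout(entry.timer);
    pending.delete(msg._reqId);
    entry.resolve(msg.svgXml
      ? { kind: 'svg', xml: msg.svgXml, width: msg.svgWidth, height: msg.svgHeight, bgColor: msg.bgColor }
      : { kind: 'png', dataUrl: msg.data });
    return true;
  }

  return { request, handleReply };
}
```

Matching replies by _reqId lets several captures be in flight at once and lets late replies to timed-out requests be ignored safely.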

Key Files

File                                | Role
------------------------------------|---------------------------------------------------
~/gallia/viewer/mcp/server.js       | MCP tool definitions (gv_capture, gv_tab_capture)
~/gallia/viewer/mgmt-server.js      | Management relay (port 8772); routes gvShot requests
~/gallia/viewer/server.js           | Main viewer server; handles the tabShot API
~/gallia/viewer/viewer/index.html   | Browser-side gvShot orchestration
~/gallia/viewer/viewer/capture.html | Screen Capture API companion tab for tabShot
~/gallia/server/mcp/server.js       | Conduit MCP tools (desktop_screenshot_*)