Not installable via adompkg

This skill has no published release. adompkg install kyle/adom-screenshot will not work until a maintainer publishes a tarball with install.sh and uninstall.sh.

See the publishing docs for the package.json schema and tarball layout required to ship this skill.


name: adom-screenshot
description: How to capture screenshots -- Hydrogen panels/workspace/screen (primary), Adom Desktop windows (native apps), and legacy AV panel tools. Enables AI visual feedback loops. EVERY screenshot MUST be logged to shotlog immediately after capture (see "Always inject to shotlog" below). This is non-negotiable for ralph loops, walkthroughs, debug sessions, and one-off verifications alike. Trigger words -- screenshot, screencap, capture panel, hydrogen screenshot, browser screenshot, ralph loop, visual verification, debug loop, walkthrough proof, visual feedback.

Screenshot & Visual Feedback

⚡ Always inject to shotlog — non-negotiable

After EVERY screenshot, before moving on, run shotlog inject -c <channel> -d "<specific description>" -s <source> <file>. No exceptions — not for "quick tests," not for "one-off verifications," not for ralph-loops that are mostly for chat preview. The shotlog is the durable artifact; ephemeral chat scrollback is not.

Why:

  • The shotlog webview is how humans review visual history across a session. Screenshots not in shotlog are effectively lost the moment the chat scrolls.
  • Slugified descriptions in filenames become the searchable index for "what did we see at step 3 of the walkthrough glow proof?"
  • This rule is what distinguishes a repeatable visual record from a one-off preview.

How to apply:

  • Every single adom-cli hydrogen screenshot panel|workspace|screen call is immediately followed by a shotlog inject call.
  • Every browser_screenshot (pup) call is immediately followed by a shotlog inject call.
  • Every av_capture / av_tab_capture call (legacy; avoid) is immediately followed by a shotlog inject call.
  • Pick a channel name tied to the task (bme680-walkthrough-glows, rp2040-usbc-rotate). Reuse across related shots in the same session.
  • Description is SPECIFIC (step6-testpoints-halo-bloom, not shot6). It becomes the slugified filename; future-you reads these to reconstruct what happened.
  • -s <source> = origin tag (adom-cli, pup_screenshot, av_capture, manual).
  • In a ralph loop, inject every frame of the loop, not just the final one.

Failure mode to avoid: Running a 6-step ralph loop with 6 hydrogen screenshot calls and 0 shotlog inject calls. This has happened multiple times; each time the user had to call it out. Don't repeat.
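One way to make the rule hard to skip is to build the inject command in a single helper that refuses vague descriptions. The sketch below is illustrative only: buildInjectArgs and the GENERIC pattern are hypothetical names, and only the -c, -d, and -s flags come from the command shown above.

```javascript
// Hypothetical helper: assembles the argv for `shotlog inject` from the flags
// documented above (-c channel, -d description, -s source, then the file).
// The "too generic" heuristic is an assumption, not shotlog behavior.
const GENERIC = /^(shot|screenshot|image|img|test|capture)[-_ ]?\d*$/i;

function buildInjectArgs({ channel, description, source, file }) {
  for (const [name, value] of Object.entries({ channel, description, source, file })) {
    if (typeof value !== 'string' || value.trim() === '') {
      throw new Error(`missing ${name}`);
    }
  }
  // Reject descriptions like "shot6" that say nothing about the frame.
  if (GENERIC.test(description.trim())) {
    throw new Error(`description "${description}" is too generic`);
  }
  return ['inject', '-c', channel, '-d', description, '-s', source, file];
}
```

The returned array can be passed straight to child_process.execFile('shotlog', args), which avoids shell quoting problems in descriptions.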

DEFAULT: Use Hydrogen Screenshots

Your primary screenshot tool is adom-cli hydrogen screenshot. It uses the browser's element-capture API for pixel-perfect screenshots of any hydrogen panel, the full workspace, or the entire screen. It captures everything: WebGL, canvas, CSS filters, nested iframes, video elements.

Priority order:

  1. hydrogen screenshot panel -- single panel (fastest, most efficient)
  2. hydrogen screenshot workspace -- all panels side by side
  3. hydrogen screenshot screen -- entire display
  4. Adom Desktop desktop_screenshot_window -- native apps, background windows
  5. Adom Desktop desktop_screenshot_screen -- full desktop with all apps

av_tab_capture is DEPRECATED — do not use it. It requires manual setup (camera button + sharing dialog) and is unreliable. Use hydrogen screenshot instead for all full-tab captures.

av_capture is still valid for AV viewer panel canvas content specifically (3D renders, SVG symbols). It is not the default.

Quick Reference

Need to see... | Tool | Setup
A single hydrogen panel | hydrogen screenshot panel --panel-id <id> | Screen sharing (monitor icon)
All hydrogen panels | hydrogen screenshot workspace | Screen sharing
Full display | hydrogen screenshot screen | Screen sharing ("Share entire screen")
KiCad, Fusion 360, native apps | desktop_screenshot_window via adom-desktop CLI | Adom Desktop app
Background window | desktop_screenshot_window (captures by HWND) | Adom Desktop app
Full desktop with all apps | desktop_screenshot_screen via adom-desktop CLI | Adom Desktop app
AV viewer panel content | av_capture (legacy) | AV connected

Hydrogen Editor Screenshots

Screen Sharing Setup (one-time per session)

Required for hydrogen screenshots to work.

  1. Click the monitor icon in the hydrogen nav bar (top right)
  2. A browser dialog appears:
    • "Share this tab" -- enables panel + workspace scopes (recommended)
    • "Share entire screen" -- enables all scopes including screen
  3. Persists for the session -- only needs to be done once

If 504 timeout: Browser is connected but the sharing approval dialog wasn't addressed within ~90s (user missed it, dismissed it, or is AFK). Retry with a clearer --reason so the user can identify this session, or tell them to click the monitor icon in the nav bar to pre-approve.

If 409 "No browser session is connected": Editor tab is closed or SSE is severed. Ask the user to surface the Hydrogen tab — E3 auto-reconnect re-registers within seconds on visibilitychange. Verify with adom-cli hydrogen probe, then retry.
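The two failure cases above can be captured in a small lookup so a loop reacts the same way every time. This is a sketch: adviseOnScreenshotError is a hypothetical name, and how you obtain the status code from adom-cli output is not specified here and is left to the caller.

```javascript
// Hypothetical helper: maps the HTTP-style status codes described above to
// the remediation this document recommends. Anything else is not retried.
function adviseOnScreenshotError(status) {
  switch (status) {
    case 504:
      // Browser connected, sharing dialog not answered within ~90s.
      return {
        retry: true,
        action: 'Retry with a clearer --reason, or ask the user to click the monitor icon to pre-approve sharing.',
      };
    case 409:
      // Editor tab closed or SSE severed.
      return {
        retry: true,
        action: 'Ask the user to surface the Hydrogen tab, verify with adom-cli hydrogen probe, then retry.',
      };
    default:
      return {
        retry: false,
        action: `Unexpected status ${status}; report it instead of retrying blindly.`,
      };
  }
}
```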

Commands

# Get panel IDs
adom-cli hydrogen workspace get

# Single panel (fastest -- use this by default)
adom-cli hydrogen screenshot panel --panel-id <leaf-id>

# Specific tab (auto-discovers pane, switches to it)
adom-cli hydrogen screenshot panel --tab-id <tab-id>

# All panels side by side
adom-cli hydrogen screenshot workspace

# Entire display
adom-cli hydrogen screenshot screen

# Check sharing status
adom-cli hydrogen screenshot status

# Request sharing approval
adom-cli hydrogen sharing request --share screen --audio --reason "Need to capture demo"

Files are saved automatically to ~/project/screenshots/, and the command prints the file path on success. Each leaf.id from workspace get is a panelId; each leaf.tabs[i].id is a tabId.
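If you parse the workspace get output programmatically, a recursive walk is enough to list every panelId and tabId. The sketch below assumes the output is JSON shaped as a tree in which split nodes carry a children array and leaves carry a tabs array. That tree shape and the collectIds name are assumptions; only leaf.id and leaf.tabs[i].id come from this document.

```javascript
// Hypothetical walker over an ASSUMED layout tree:
//   split node: { children: [...] }
//   leaf node:  { id, tabs: [{ id }, ...] }
function collectIds(node, out = { panelIds: [], tabIds: [] }) {
  if (!node || typeof node !== 'object') return out;
  if (Array.isArray(node.tabs)) {
    out.panelIds.push(node.id);                        // leaf.id is a panelId
    for (const tab of node.tabs) out.tabIds.push(tab.id); // leaf.tabs[i].id is a tabId
  }
  const children = Array.isArray(node.children) ? node.children : [];
  for (const child of children) collectIds(child, out);
  return out;
}
```

Check the real output of adom-cli hydrogen workspace get before relying on the children key; adjust the walk if the tree uses a different property.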


Adom Desktop Screenshots (native apps, background windows)

For native desktop applications and windows that hydrogen can't see. Requires the Adom Desktop app running on the user's machine.

# Check desktop is connected
adom-desktop ping

# List all windows
adom-desktop desktop_list_windows

# Screenshot a specific window (even background windows)
adom-desktop desktop_screenshot_window \
  '{"hwnd":<hwnd>,"savePath":"project-content/screenshots/desktop-debug.png"}'

# Screenshot entire desktop
adom-desktop desktop_screenshot_screen \
  '{"savePath":"project-content/screenshots/desktop-full.png"}'

# Bring a window to foreground first
adom-desktop browser_focus_window '{"sessionId":"debug"}'

Key properties

  • desktop_screenshot_window captures by HWND -- works even if the window is behind other apps
  • desktop_screenshot_screen captures whatever is visible on screen
  • 15-second timeout per capture
  • Cross-application: KiCad, Fusion 360, Chrome, terminal, anything on the desktop

Typical workflow: verify KiCad delivery

1. Send file to KiCad: kicad_open_board(...)
2. List windows: desktop_list_windows → find KiCad's HWND
3. Capture: desktop_screenshot_window({ hwnd: 12345, savePath: "..." })
4. Analyze → verify the board loaded correctly
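Steps 2 and 3 of this workflow can be sketched as two small helpers. The window list shape ({ hwnd, title } objects) is an assumption about desktop_list_windows output, and findWindow and windowShotPayload are hypothetical names; the hwnd and savePath payload keys come from the command shown earlier.

```javascript
// Hypothetical: pick a window by case-insensitive title fragment from an
// ASSUMED list of { hwnd, title } objects.
function findWindow(windows, titleFragment) {
  const needle = titleFragment.toLowerCase();
  return windows.find((w) => String(w.title ?? '').toLowerCase().includes(needle));
}

// Build the JSON argument for desktop_screenshot_window. JSON.stringify
// handles escaping, so paths with spaces or quotes stay valid.
function windowShotPayload(hwnd, savePath) {
  return JSON.stringify({ hwnd, savePath });
}
```

For example, findWindow(list, 'kicad') followed by windowShotPayload(win.hwnd, 'project-content/screenshots/desktop-debug.png') yields the string to pass as the single argument to adom-desktop desktop_screenshot_window.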

Visual Feedback Loop

This is the pattern that makes AI autonomous for visual work. Steps 4 and 5 are a pair — never do 4 without 5. A screenshot that isn't in shotlog isn't really a captured artifact; it's a chat-message flash that scrolls out of reach in minutes.

1. Edit code
2. Refresh the debug surface (webview refresh, pup reload, server restart)
3. Interact -- send commands to exercise the change (rotate camera, click buttons, toggle states)
4. Screenshot (choose the right method from the table above)
5. shotlog inject with a SPECIFIC description   ← always, no exceptions
6. Analyze the screenshot -- does it look right?
7. If not right --> identify what's wrong --> fix --> go to step 1

Always capture after visual changes, always inject into shotlog. Don't commit UI changes without verifying them visually first. Don't run a ralph-loop that takes N screenshots and injects zero of them — that's a loop that leaves no trace.
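The loop can be written so that injection is structurally impossible to skip. The following is a minimal sketch, not an existing tool: visualLoop and its callback names are hypothetical, and each callback stands in for whatever you actually run (adom-cli hydrogen screenshot, shotlog inject, your own analysis).

```javascript
// Hypothetical driver for the loop above. Every captured frame is injected
// before it is analyzed, so a loop can never end with zero shotlog entries.
async function visualLoop({ prepare, capture, inject, analyze, fix, maxIterations = 5 }) {
  const frames = [];
  for (let i = 1; i <= maxIterations; i++) {
    await prepare(i);                    // steps 2-3: refresh, then interact
    const file = await capture(i);       // step 4: screenshot
    await inject(file, i);               // step 5: shotlog inject, every frame
    frames.push(file);
    const verdict = await analyze(file); // step 6: does it look right?
    if (verdict.ok) return { ok: true, frames };
    await fix(verdict.problem);          // step 7: fix, then back to step 1
  }
  return { ok: false, frames };
}
```

Keeping capture and inject adjacent inside one function is the design point: a caller cannot get a frame back without it having been logged first.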

Finding exact pixel positions of UI elements

When you need to annotate screenshots or overlay mockups on real UI, DON'T guess pixel positions. Two techniques:

1. Query the DOM for exact coordinates:

// Use hydrogen sandbox eval or browser_eval to get element position
document.querySelector('.screen-share-button').getBoundingClientRect()
// -> {x: 1342, y: 58, width: 32, height: 32}

Exact pixel coordinates, no guessing. Works for any HTML element.

2. Inject highlight borders then screenshot:

# Screenshot 1: base (no highlights)
adom-cli hydrogen screenshot screen -o /tmp/base.png

# Inject CSS borders on target elements
adom-desktop browser_eval '{"sessionId":"...","expr":"document.querySelector(\".target\").style.border=\"3px solid red\""}'

# Screenshot 2: with highlights
adom-cli hydrogen screenshot screen -o /tmp/highlighted.png

# Compare the two to see exactly where elements are

Never guess pixel positions and iterate manually. Query the DOM first. If you can't query (e.g., browser chrome elements), use the inject+compare technique.
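Hand-escaping the nested quotes in the browser_eval argument is error-prone. A small builder, sketched below, lets JSON.stringify do the escaping. highlightPayload is a hypothetical name; the sessionId and expr keys are taken from the command above.

```javascript
// Hypothetical builder for the browser_eval JSON argument. JSON.stringify is
// applied twice: once to quote the selector and border inside the expression,
// and once to serialize the whole payload.
function highlightPayload(sessionId, selector, border = '3px solid red') {
  const expr =
    `document.querySelector(${JSON.stringify(selector)})` +
    `.style.border=${JSON.stringify(border)}`;
  return JSON.stringify({ sessionId, expr });
}
```

Pass the result as one argument via child_process.execFile('adom-desktop', ['browser_eval', payload]) so no shell quoting layer is involved.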

When to use which method

  • Content in a webview panel --> hydrogen screenshot panel
  • Multiple panels, layout verification --> hydrogen screenshot workspace
  • Full workspace with nav bar --> hydrogen screenshot screen
  • Pup browser session --> browser_screenshot via adom-desktop CLI
  • KiCad, Fusion 360, native apps --> desktop_screenshot_window via adom-desktop CLI
  • Background window --> desktop_screenshot_window (captures by HWND)
  • Full desktop --> desktop_screenshot_screen via adom-desktop CLI

AV Panel: avShot (av_capture)

Scope: AV viewer panel canvas content only (the rendered output inside iframes). Use only for content displayed in the Adom Viewer panel specifically.

av_capture()  -->  PNG of whatever the AV panel is showing
  • MCP tool: av_capture -- no parameters
  • Saves to: project-content/screenshots/av/mgmt-screenshot-{timestamp}.png
  • Setup: None -- works automatically when a viewer is connected
  • Timeout: 10 seconds

How it works

MCP tool --> mgmt relay (port 8772) --> WebSocket broadcast to browser
--> browser runs captureOVScreenshot() with fallback strategies:
  1.  postMessage to 3D/canvas iframe --> iframe renders canvas --> PNG
  2.  Direct SVG serialization (for SVG content)
  3.  Direct <img> capture (for image elements)
--> sends PNG back via WebSocket --> saved to disk --> returned to MCP

For 3D/canvas iframes, cooperative iframe capture is used: the parent asks the iframe to export its canvas via postMessage, and the iframe serializes its own content.
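The fallback ordering described above follows a common pattern: try each strategy in turn and return the first non-empty result. The sketch below illustrates that pattern only. It is not the source of captureOVScreenshot(), and captureWithFallbacks and the strategy objects are hypothetical.

```javascript
// Hypothetical fallback chain: strategies run in order, and the first one
// that returns a truthy PNG wins. Failures are collected for the final error.
async function captureWithFallbacks(strategies) {
  const errors = [];
  for (const { name, run } of strategies) {
    try {
      const png = await run();
      if (png) return { strategy: name, png };
      errors.push(`${name}: empty result`);
    } catch (err) {
      errors.push(`${name}: ${err.message}`);
    }
  }
  throw new Error(`all capture strategies failed (${errors.join('; ')})`);
}
```

In this picture the three strategies would be the iframe postMessage export, SVG serialization, and direct <img> capture, in that order.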

What it CAN capture

  • 3D model renders (Babylon.js canvas via canvas.toDataURL())
  • Symbol/footprint SVGs (serialized to XML, re-rendered at 8x scale)
  • Image content displayed in the viewer
  • Parent-rendered DOM content

What it CANNOT capture

  • HTML overlays (toolbar buttons, tooltips) -- use hydrogen screenshot instead
  • Nested iframes (LibView 3-pane layout) -- use hydrogen screenshot instead
  • <canvas> inside widgets without a mgmt_capture_request handler

AV Panel: tabShot (av_tab_capture)

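Scope: Full browser tab -- everything visible in the tab including AV panel, overlays, nested iframes. This tool is deprecated: do not use it for new captures; use hydrogen screenshot instead. The details below are kept for reference.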

av_tab_capture()  -->  PNG of the entire browser tab
  • MCP tool: av_tab_capture -- no parameters
  • Saves to: project-content/screenshots/av/tab-capture-{timestamp}.png
  • Timeout: 10 seconds

Setup (one-time per session)

  1. Click the camera button in AV (or press Alt+S)
  2. Click "Share This Tab" in the popup
  3. Select the IDE tab in the browser's sharing dialog

Key properties

  • Captures everything in the tab -- toolbar icons, button states, multi-pane layouts, nested iframes
  • NOT affected by foreground window changes
  • Near-zero CPU cost when idle
  • Survives AV restarts

Adding Capture Support to AV Widgets

Canvas-based iframe viewers (3D, WebGL, charts)

If your AV viewer renders to <canvas>, implement the mgmt_capture_request protocol:

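window.addEventListener('message', (e) => {
  // Messages may arrive as JSON strings or objects; ignore anything unparseable.
  let msg = e.data;
  if (typeof msg === 'string') {
    try { msg = JSON.parse(msg); } catch { return; }
  }
  if (!msg || msg.type !== 'mgmt_capture_request') return;

  scene.render(); // force a render so the canvas holds a current frame
  const dataUrl = canvas.toDataURL('image/png');
  // Echo _reqId so the parent can match this reply to its request.
  window.parent.postMessage({
    type: 'mgmt_canvas_capture',
    _reqId: msg._reqId,
    data: dataUrl
  }, '*');
});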

For SVG content, serialize to XML:

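// Runs inside the same mgmt_capture_request handler, so msg is in scope.
// svgClone, width, and height come from your widget.
const xml = new XMLSerializer().serializeToString(svgClone);
window.parent.postMessage({
  type: 'mgmt_canvas_capture',
  _reqId: msg._reqId,
  svgXml: xml, svgWidth: width, svgHeight: height, bgColor: '#0d1117'
}, '*');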

Note: Do not use html2canvas for capture wiring. It is deprecated -- it doesn't work with WebGL, 3D content, or cross-iframe scenarios. Use the cooperative mgmt_capture_request protocol instead.
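For testing a widget's handler, it helps to see what the requesting side of the protocol might look like. The sketch below is an assumption about the parent's behavior, not the viewer's actual code: requestCanvasCapture is a hypothetical name, and only the message types, the _reqId field, and the 10-second timeout come from this document.

```javascript
// Hypothetical parent-side request: posts mgmt_capture_request to the iframe
// and resolves with the matching mgmt_canvas_capture reply, or rejects on
// timeout. `host` is whatever receives message events (window in a browser).
function requestCanvasCapture(iframeWindow, host, timeoutMs = 10000) {
  const reqId = `cap-${Date.now()}-${Math.random().toString(36).slice(2)}`;
  return new Promise((resolve, reject) => {
    const onMessage = (e) => {
      const msg = e.data;
      // Ignore unrelated messages and replies to other requests.
      if (!msg || msg.type !== 'mgmt_canvas_capture' || msg._reqId !== reqId) return;
      cleanup();
      resolve(msg);
    };
    const timer = setTimeout(() => {
      cleanup();
      reject(new Error('capture timed out'));
    }, timeoutMs);
    function cleanup() {
      clearTimeout(timer);
      host.removeEventListener('message', onMessage);
    }
    host.addEventListener('message', onMessage);
    iframeWindow.postMessage({ type: 'mgmt_capture_request', _reqId: reqId }, '*');
  });
}
```

In a browser you would call requestCanvasCapture(iframe.contentWindow, window). Matching on _reqId is what lets several captures be in flight without replies being mixed up.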


Image Resizing for Claude

All screenshot methods auto-resize images to <=1568px on the longest edge before returning them. Larger images provide zero quality benefit.

Source | Where resize happens
Hydrogen screenshots | Server-side in the hydrogen API
Pup browser_screenshot | adom-desktop CLI, defaults to maxWidth: 1568
Adom Desktop screenshots | MCP server via sharp
AV avShot/tabShot | MCP server via sharp
  • Full-resolution images are still saved to disk when savePath is specified
  • Resize preserves aspect ratio using Lanczos resampling
  • If resize fails, the original image is returned as fallback
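The size rule is simple arithmetic: scale both edges by the same factor so the longest edge lands on the limit. The sketch below shows that calculation only. fitWithin is a hypothetical name, and the assumption that smaller images are left untouched (no upscaling) is mine, not stated above.

```javascript
// Hypothetical: compute output dimensions for a longest-edge limit while
// preserving aspect ratio. Images already within the limit are unchanged
// (assumed; upscaling behavior is not documented here).
function fitWithin(width, height, maxEdge = 1568) {
  const longest = Math.max(width, height);
  if (longest <= maxEdge) return { width, height };
  const scale = maxEdge / longest;
  return {
    width: Math.round(width * scale),
    height: Math.round(height * scale),
  };
}
```

A 3840x2160 capture comes out as 1568x882 under this rule, which is worth remembering when mapping DOM coordinates onto a returned image.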

For manual resizing:

shotlog resize large-image.png              # Resize in-place to max 1400px
shotlog resize large-image.png -o small.png # Resize to new file

When the user pastes a screenshot directly into Claude chat

You can SEE pasted images via the multimodal channel, but you CANNOT extract
them to a file path on disk. There is no /tmp/... or other location where
the bytes land. So if you need that exact image as a file -- to upload to a
GitHub issue, attach to a wiki page, commit to a repo, share in Slack -- you
have to bridge it through shotlog.

HARD RULE: never substitute a fresh adom-cli hydrogen screenshot ...
capture for the user's pasted image. It loses the essence of what they
showed you (the specific moment, state, scroll position, error popup, tab
overflow, mouse cursor, highlight, etc.) and the user will (rightly) push
back. A "similar" screenshot is not the screenshot.

The bridge: shotlog paste flow

When you need the user's pasted image as a real file:

  1. Make sure shotlog serve is running (shotlog health)
  2. Open the shotlog webview on a non-VS-Code pane (see
    adom-workspace-control -- never park new tabs on the VS Code leaf):
    shotlog open --channel <task-channel>
    
  3. Tell the user explicitly: "Paste your screenshot (Ctrl+V) into the
    Shotlog tab I just opened." Don't assume they know -- name the tab,
    name the keystroke.
  4. Wait for paste. The shotlog webview auto-injects with a clipboard
    source, resizes to <=1400px, and writes a file under
    ~/project/screenshots/shotlog/<channel>/.
  5. Find the new file -- newest entry in the channel folder, or check
    shotlog server logs.
  6. Use that file path however you need (gh issue asset upload, wiki
    submission, embed, etc.).
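Step 5 can be done without guessing at filenames by picking the most recently modified file in the channel folder. This is a sketch using Node's fs module; newestFile is a hypothetical name, and it assumes the pasted image is the latest file written to that folder.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical: return the most recently modified regular file in `dir`,
// or null when the folder has no files yet.
function newestFile(dir) {
  let newest = null;
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    if (!entry.isFile()) continue; // skip subfolders
    const full = path.join(dir, entry.name);
    const { mtimeMs } = fs.statSync(full);
    if (!newest || mtimeMs > newest.mtimeMs) newest = { path: full, mtimeMs };
  }
  return newest ? newest.path : null;
}
```

Point it at the channel folder under ~/project/screenshots/shotlog/ (expand the home directory with os.homedir() first). If it returns null, the paste has not landed yet.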

Why not just ask the user to save the file themselves?

You can, but it's worse UX. They'd need to find the clipboard image, save
it somewhere, tell you the path. Shotlog is one Ctrl+V in a tab you've
already opened for them, and the file lands in a known location with a
known channel.

When the workspace API is in 409 "Editor is not open" state

Sometimes adom-cli hydrogen workspace get returns 409. You can still
call shotlog open -- it has its own panel-targeting logic -- but you
must verify afterward (adom-cli hydrogen workspace tabs once the editor
session reconnects, or visually with the user) that the Shotlog tab did
not land on the VS Code pane. If it did, move it.


Key Files

File | Role
~/gallia/viewer/mcp/server.js | MCP tool definitions (av_capture, av_tab_capture) + image resize
~/gallia/viewer/mgmt-server.js | Management relay (port 8772) -- routes avShot requests
~/gallia/viewer/server.js | Main viewer server -- handles tabShot API
~/gallia/viewer/viewer/index.html | Browser-side avShot orchestration
~/gallia/viewer/viewer/capture.html | Screen Capture API companion tab for tabShot
~/gallia/server/mcp/server.js | Conduit MCP tools (desktop_screenshot_*) + image resize