name: demo-recording
description: >
Use when the user wants to record a demo, make a video, create a walkthrough,
record their screen, or capture a demo of a feature. Handles the full
pre-production and recording workflow: writing the demo script, generating
captions, verifying the setup, choreographing panels, and driving the
recording. Hands off to video-post for post-production (speedup, voiceover,
publish). Trigger words: record a video, record a demo, make a movie, create
a walkthrough, record my screen, demo video, screen recording, narrate a
walkthrough, capture a demo, make a demo.

Demo Recording

Record polished demo videos of Adom features. This skill handles everything
before and during the recording. For post-production (speedup, voiceover,
publish to wiki), hand off to the video-post skill.

What "a real demo" means at Adom — non-negotiable shape

When the user says "make a real demo," "make me a demo," "publish a demo,"
"polished tour," or anything similar that is NOT explicitly the quick path
in Step 1, every one of these steps has to ship — none of them is optional:

1. Sectioned script in the repo at <repo>/demo/<name>-demo-script.md
   with a row-per-scene table containing caption + narration + target
   length + driver actions. The narration column is phoneticized for TTS
   per tts-pronunciation.
2. One short clip per section, recorded separately, never a single long
   take covering every feature. Per-clip lengths track narration ±15%.
3. Per-clip TTS via adom-tts say (uses Adom's pronunciation cache + house
   voice en-US-AndrewMultilingualNeural). Falls back to edge-tts only
   when adom-tts isn't installed.
4. Per-clip mux: TTS audio combined with the silent recording into one
   <id>-narrated.webm (ffmpeg -i clip.webm -i tts.mp3 ... libopus).
5. video-post manifest add for every clip with a --description that
   states intent (not just feature name); see video-post skill rules.
6. video-post storyboard <manifest> review with the user. MANDATORY
   gate before concat. Per-clip review catches "clip 5 measured the wrong
   thing" in seconds; reviewing a 3-min concat blind takes 20× as long.
   Skip this step ONLY if the user has explicitly told you they're
   unavailable (e.g. "I'm at the gym, just ship it"), and call it out
   prominently in the final report so they can re-review later.
7. ffmpeg -f concat -safe 0 -c copy to the final, because video-post concat
   currently drops audio (known bug, documented in video-post skill).
8. Hero image: ffmpeg -ss <t> -i final.webm -frames:v 1 hero.png
   from a representative frame.
9. Upload BOTH to the wiki: adom-wiki asset upload <page> --asset-type video --file final.webm AND --asset-type hero-image --file hero.png. Re-publish the page so pub_version bumps.
10. Demo script published to the wiki too so other Adom users can
    reproduce: adom-wiki page edit adds the script content under the
    "Demo" section of the page body.

If any of those steps gets dropped, the output is not a "real demo" — it's a
screen capture. Tell the user explicitly which steps you skipped and why.

🚨 Wiki webm: keyframes every 1 s — NEVER upload raw recordings

Every webm uploaded to the wiki must be re-encoded with a keyframe
interval of ~1 s (-g 30 -keyint_min 30 for 30 fps). Without that,
the wiki server's video element can't seek — the user can play the
clip start-to-end but can't scrub the timeline, jump to a section, or
share a deep-linked timestamp. <video> seeks land on the nearest
keyframe; with only one keyframe at frame 0, the whole clip is a
single seek target.

Raw adom-cli hydrogen recording stop output and raw CDP-frames mux
output default to one keyframe at frame 0 only — every subsequent
frame is a P-frame, no scrub possible. This bites every time someone
uploads a single-shot recording straight to adom-wiki asset upload
without going through the video-post pipeline. (Bit me on
molecules/brianna-led-nameplate's brianna-demo.mp4; bit Kyle the
same day.)

The rule: the LAST ffmpeg pass on any clip destined for the wiki
must include -g 30 -keyint_min 30. Both flags count frames, not
seconds, so set them to the output frame rate: for 60 fps recordings
use -g 60 -keyint_min 60. The exact incantation depends on codec:

| Codec | Keyframe flags |
|---|---|
| `libvpx-vp9` (webm) | `-g 30 -keyint_min 30` |
| `libx264` (mp4) | `-g 30 -keyint_min 30 -force_key_frames "expr:gte(t,n_forced*1)"` |
| `libvpx` (legacy webm) | `-g 30 -keyint_min 30` |

The -force_key_frames expression on x264 is belt-and-suspenders —
some x264 builds honor -g as a soft hint and let scene-detection
override it; the expression forces a keyframe at integer-second
boundaries regardless.
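
Because the flags differ only by frame rate and codec, they can be derived instead of typed by hand. A minimal sketch, assuming a hypothetical helper named keyframe_flags (it is not part of video-post or adom-cli) that prints the flags from the table above for a roughly one-second keyframe interval:

```shell
# keyframe_flags FPS CODEC
# Hypothetical helper: prints the ffmpeg keyframe flags from the table above.
# -g and -keyint_min are counted in frames, so both are set to the frame rate.
keyframe_flags() {
  fps=$1
  codec=$2
  case "$codec" in
    libx264)
      # x264 also gets a forced keyframe on every integer second.
      printf '%s\n' "-g $fps -keyint_min $fps -force_key_frames expr:gte(t,n_forced*1)"
      ;;
    libvpx-vp9|libvpx)
      printf '%s\n' "-g $fps -keyint_min $fps"
      ;;
    *)
      echo "keyframe_flags: unknown codec: $codec" >&2
      return 1
      ;;
  esac
}

keyframe_flags 30 libvpx-vp9   # prints: -g 30 -keyint_min 30
keyframe_flags 60 libx264
```

The output is meant for display or copy-paste. The x264 expression contains `*` and parentheses, so quote it when passing it to ffmpeg rather than relying on unquoted command substitution.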

Verify after encode:

ffprobe -v error -select_streams v:0 -show_entries packet=pts_time,flags \
  -of csv=p=0 final.webm | grep ',K' | head -10
# Should show keyframes (",K_") at ~0.0s, ~1.0s, ~2.0s, ...
# If you only see one ",K_" line at 0.0s, the encode dropped the rule.
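
Reading the first ten keyframes by eye only covers the start of the clip. A small sketch that computes the largest keyframe gap over the whole file, assuming its input is the `pts_time,flags` CSV produced by the ffprobe command above; the function name max_keyframe_gap is hypothetical:

```shell
# max_keyframe_gap
# Reads "pts_time,flags" lines on stdin (the ffprobe CSV above) and prints the
# largest gap in seconds between consecutive keyframes, including the tail
# between the final keyframe and the last packet.
max_keyframe_gap() {
  awk -F, '
    BEGIN { max = 0 }
    $1 !~ /^[0-9.]+$/ { next }          # skip packets without a timestamp
    $2 ~ /K/ {
      if (seen && $1 - last > max) max = $1 - last
      last = $1; seen = 1
    }
    { end = $1 }
    END {
      if (!seen) exit 1                 # no keyframes at all
      if (end - last > max) max = end - last
      printf "%.2f\n", max
    }'
}

# Typical use (not run here):
#   ffprobe -v error -select_streams v:0 -show_entries packet=pts_time,flags \
#     -of csv=p=0 final.webm | max_keyframe_gap
```

A result near 1.00 means the rule held. A result close to the clip length means there is only the frame-0 keyframe.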

If you find yourself running adom-wiki asset upload --asset-type video --file <raw>.webm without a re-encode pass first, stop.
Pipe through ffmpeg -i raw.webm -c:v libvpx-vp9 -b:v 2M -g 30 -keyint_min 30 -c:a copy out.webm first, even on a single-shot
recording you weren't going to edit. The re-encode adds ~10 s for a
30 s clip and saves the user from a non-scrubbable wiki video.

The video-post concat path SHOULD apply this on the final mux — if
your demo went snippet → manifest → concat, you're covered. The
single-shot path (just recording stop → upload) is where this gets
silently skipped, so the rule above applies especially there.

Recording surface — Hydrogen by default, pup when Hydrogen is busy

The default recording surface is adom-cli hydrogen recording start — the
Hydrogen webview captures the full editor (panels + workspace) into a clean
webm, with caption show integrated.

When Hydrogen is unavailable — another Adom tool is occupying the webview
panels (e.g. aci running a build, adom-tsci showing a 3D preview, an
existing demo storyboard up for review) — switch to pup (a separate
Chrome window via Puppeteer). Pup recordings happen on the user's Windows
desktop, fully orthogonal to anything happening inside Hydrogen, so the two
surfaces never compete.

| Surface | When | Capture mechanism | Caption tooling | Notes |
|---|---|---|---|---|
| Hydrogen | Default. Demoing a Hydrogen-hosted feature. | `adom-cli hydrogen recording start --reason ...` → MediaRecorder on the workspace iframe. | `adom-cli hydrogen caption show "<text>"` overlays cleanly. | Best resolution + cleanest output. Don't use if another tab in the same workspace is occupying webviews you'd want to recapture later. |
| pup | "Aci is using webviews", "don't touch my Hydrogen layout", "the tool I'm demoing has its own UI in a Chrome tab anyway." | `adom-desktop browser_open_window` + `browser_focus_window` + `browser_record_start` (CDP screencast on the pup tab). | No `caption show`: render the caption inside the page itself with a temporary `<div>` injected via `browser_eval`, OR rely entirely on TTS narration. | Frame rate is paint-throttled by Chrome unless the window is OS-foregrounded (see "🔥 The pup foreground-rebump trick" below). Tar pulls of large recordings may exceed the 60-second WebSocket timeout; pull frames in batches of ~40 if `pull_file` returns null. |

🔥 The pup foreground-rebump trick — required for high frame rate

Chrome paint-throttles any tab whose window is not the OS foreground. CDP Page.startScreencast only emits a frame when the page repaints, so a backgrounded pup tab silently drops to ~5 fps even when you asked for 15 fps. Symptoms: 275 captured frames in 46 s of wall-clock = ~6 fps real, video plays back ~2.5× faster than reality (15 ÷ 6), narration desyncs from action.

The trick: browser_focus_window calls Win32 SetForegroundWindow to raise the pup window to the OS foreground. Calling it ONCE before browser_record_start is not enough — anything that grabs OS focus during the recording (Hydrogen tab switch, a notification, the user clicking elsewhere) re-throttles the pup tab. The fix is to re-bump foreground before every action, not just at the start.

SESS="demo-snippet"

# Open + initial focus
adom-desktop browser_open_window "$(jq -nc --arg s "$SESS" --arg u "$URL" \
  '{sessionId:$s, profile:$s, url:$u}')"
sleep 4
adom-desktop browser_focus_window "$(jq -nc --arg s "$SESS" '{sessionId:$s}')"
sleep 1

# Re-bump RIGHT BEFORE record_start — this is the critical one
adom-desktop browser_focus_window "$(jq -nc --arg s "$SESS" '{sessionId:$s}')"
RID=$(adom-desktop browser_record_start "$(jq -nc --arg s "$SESS" \
  '{sessionId:$s, fps:15, quality:75}')" | jq -r '.recordingId')

# Wrap every action so foreground gets re-bumped before each one
focused_eval() {
  adom-desktop browser_focus_window "$(jq -nc --arg s "$SESS" '{sessionId:$s}')" >/dev/null
  adom-desktop browser_eval "$(jq -nc --arg s "$SESS" --arg e "$1" \
    '{sessionId:$s, expr:$e}')" >/dev/null
}

focused_eval 'document.getElementById("btn-foo").click()'
sleep 3                        # short sleeps are fine — focus persists across them
focused_eval 'document.getElementById("btn-bar").click()'
sleep 3
adom-desktop browser_record_stop "$(jq -nc --arg s "$SESS" --arg r "$RID" \
  '{sessionId:$s, recordingId:$r}')"

Verification — sniff-test the captured frame count vs wall-clock duration:

# After every recording, check this BEFORE you commit to mux + TTS:
STOP=$(adom-desktop browser_record_stop ...)
FRAMES=$(echo "$STOP" | jq -r '.frameCount')
DUR_MS=$(echo "$STOP" | jq -r '.durationMs')
ACTUAL_FPS=$(python3 -c "print(${FRAMES}/(${DUR_MS}/1000))")
echo "captured ${ACTUAL_FPS} fps (asked for 15)"
# Below 12 fps → the window dropped foreground at some point. Re-record with the rebump pattern, don't try to mask with playback speedup.
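
The same sniff test can be written without python3 and with a guard for a zero-length recording. A minimal sketch, assuming the `frameCount` and `durationMs` values have already been extracted as in the snippet above; the function names actual_fps and fps_ok are hypothetical:

```shell
# actual_fps FRAMES DURATION_MS
# Prints the captured frame rate with one decimal place.
actual_fps() {
  awk -v f="$1" -v ms="$2" 'BEGIN {
    if (ms <= 0) exit 1               # guard against a zero-length recording
    printf "%.1f\n", f / (ms / 1000)
  }'
}

# fps_ok FRAMES DURATION_MS MIN_FPS
# Exit status 0 when the captured rate is at least MIN_FPS.
fps_ok() {
  fps=$(actual_fps "$1" "$2") || return 1
  awk -v a="$fps" -v min="$3" 'BEGIN { exit !(a >= min) }'
}

actual_fps 275 46000                  # prints 6.0
if ! fps_ok 275 46000 12; then
  echo "below 12 fps: re-record with the rebump pattern"
fi
```

With the numbers from the symptom description (275 frames in 46 s), the check fails against the 12 fps floor, which matches the advice to re-record instead of masking the problem with a playback speedup.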

Don't substitute browser_alert_window — it flashes the taskbar but explicitly does NOT steal foreground (so Chrome stays paint-throttled). Don't substitute desktop_bring_to_front either — it works but per the project memory ("use browser_focus_window for pup, not desktop_bring_to_front"), the browser-typed call is the canonical one.

Side-effect to know about: because SetForegroundWindow is OS-level, the user briefly sees the pup window raise to front during recording. Tell them in advance ("you'll see a Chrome window pop forward for ~30 s") so they don't think your code is fighting their workspace.

Foreground alone isn't enough — also keep something painting

Foreground-rebump fixes Chrome's window-level paint throttle, but there's a second throttle: CDP Page.startScreencast is event-driven. It only emits a frame when the page actually repaints. A static, idle page (e.g. "user looking at the player view, video paused, no animation") produces zero frames per second even if the window is fully foregrounded.

Symptom: 7 seconds of "let the player view sit" captures 1 frame. The whole demo plays back too fast.

Two fixes, pick one:

Keep a small animation always running on the page. A 1-second CSS pulse on a non-distracting element does it — e.g. on the .hud .dot:
```
.hud .dot { animation: hud-pulse 1s infinite ease-in-out; }
@keyframes hud-pulse { 0%,100%{opacity:.6} 50%{opacity:1} }
```
This keeps the compositor running, which keeps frames flowing through CDP. Cost: a barely-visible animated dot. Benefit: full requested frame rate during static sections.

Force micro-motion via browser_eval between actions when you can't add CSS to the target page. Inject a 1-pixel scroll back-and-forth every 200 ms, or update a CSS variable that affects a transform:

# Background loop nudging the page during the recording
(
  while [ -f /tmp/avdemo/.recording ]; do
    focused_eval 'document.documentElement.style.setProperty("--nudge", Math.random())'
    sleep 0.2
  done
) &

Sniff-test with the same actual_fps formula above: if it's still <12 after the foreground rebump, the issue is paint-throttling on a static page, not OS focus.

Before you re-record — try Hydrogen instead

If you've fought pup paint-throttling for two attempts and the demo still isn't smooth, switch to Hydrogen recording (adom-cli hydrogen recording start). Hydrogen captures the workspace via getDisplayMedia + MediaRecorder, which polls the framebuffer at the requested rate regardless of whether the page is repainting. Cost: you need the Hydrogen workspace free of competing tabs. Benefit: full 30 fps with no rebump tricks.

Pup is the right answer when Hydrogen is genuinely occupied. When Hydrogen is free, prefer it.

Pup-recording per-section pattern (substitute for the Hydrogen recording
block in Step 5 below when the user's Hydrogen is busy):

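# PORT must be set by the caller; fail early instead of recording against a bad URL.
: "${PORT:?set PORT to the port of the tool being demoed}"
URL="https://<container>.adom.cloud/proxy/$PORT/"   # the proxied URL of the tool being demoed
mkdir -p /tmp/demo/tar /tmp/demo/raw /tmp/demo/tts /tmp/demo/narrated   # ffmpeg does not create output dirs

# Per section (ID is this section's clip id):
: "${ID:?set ID to this section's clip id}"
SESS="demo-$ID"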
adom-desktop browser_open_window "$(jq -nc --arg s "$SESS" --arg u "$URL" \
  '{sessionId:$s, profile:$s, url:$u}')"
sleep 4
adom-desktop browser_focus_window "$(jq -nc --arg s "$SESS" '{sessionId:$s}')"
sleep 1

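# Re-bump foreground right before record_start (see the rebump trick above)
adom-desktop browser_focus_window "$(jq -nc --arg s "$SESS" '{sessionId:$s}')"
RID=$(adom-desktop browser_record_start "$(jq -nc --arg s "$SESS" \
  '{sessionId:$s, fps:15, quality:75}')" | jq -r '.recordingId')

# ... drive the section via browser_eval calls, re-bumping focus before each one ...
adom-desktop browser_focus_window "$(jq -nc --arg s "$SESS" '{sessionId:$s}')" >/dev/null
adom-desktop browser_eval "$(jq -nc --arg s "$SESS" --arg e "<JS>" \
  '{sessionId:$s, expr:$e}')"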

STOP=$(adom-desktop browser_record_stop "$(jq -nc --arg s "$SESS" --arg r "$RID" \
  '{sessionId:$s, recordingId:$r}')")
TAR=$(echo "$STOP" | jq -r '.tarPath')

adom-desktop pull_file "$(jq -nc --arg t "$TAR" '{filePaths:[$t], saveTo:"/tmp/demo/tar"}')"
adom-desktop browser_close_window "$(jq -nc --arg s "$SESS" '{sessionId:$s}')"

# Mux the CDP frames → webm
EXTRACT=/tmp/demo/tar/${ID}-extract && mkdir -p "$EXTRACT"
tar -xf "/tmp/demo/tar/$(basename "$TAR")" -C "$EXTRACT"
FRAME_DIR=$(find "$EXTRACT" -name ffmpeg-concat.txt -printf '%h\n' | head -1)
( cd "$FRAME_DIR" && ffmpeg -y -loglevel error -f concat -safe 0 \
    -i ffmpeg-concat.txt -c:v libvpx-vp9 -b:v 2M -deadline realtime \
    -cpu-used 8 -row-mt 1 -g 30 -keyint_min 30 \
    -an "/tmp/demo/raw/${ID}.webm" )

# Then standard TTS + mux + manifest add (same as Hydrogen path)
adom-tts say "<narration>" --out /tmp/demo/tts/${ID}.mp3 --voice en-US-AndrewMultilingualNeural
ffmpeg -y -loglevel error -i /tmp/demo/raw/${ID}.webm -i /tmp/demo/tts/${ID}.mp3 \
  -filter_complex "[0:v]tpad=stop_mode=clone:stop_duration=10[v]" \
  -map "[v]" -map 1:a -c:v libvpx-vp9 -b:v 2M -deadline realtime -cpu-used 8 \
  -g 30 -keyint_min 30 \
  -c:a libopus -b:a 96k -shortest /tmp/demo/narrated/${ID}.webm
video-post manifest add /tmp/demo/manifest.json --id "$ID" --title "..." \
  --description "..." --raw /tmp/demo/narrated/${ID}.webm /tmp/demo/narrated/${ID}.webm

The tpad=stop_mode=clone:stop_duration=10 filter holds the last frame so
short recordings don't cut the narration off — -shortest then trims the
output to whichever stream ends first (almost always the audio).
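
The interaction between the pad and -shortest is simple arithmetic: the output lasts roughly as long as the shorter of (video + pad) and the audio. A small sketch of that calculation, assuming durations in seconds; the helper name muxed_seconds is hypothetical:

```shell
# muxed_seconds VIDEO_S AUDIO_S [PAD_S]
# Approximate duration of the narrated clip: the padded video or the audio,
# whichever ends first. PAD_S defaults to the 10 s stop_duration used above.
muxed_seconds() {
  awk -v v="$1" -v a="$2" -v pad="${3:-10}" 'BEGIN {
    padded = v + pad
    printf "%.1f\n", (a < padded) ? a : padded
  }'
}

muxed_seconds 6 9     # prints 9.0  (narration fits inside the 10 s hold)
muxed_seconds 5 20    # prints 15.0 (narration is cut off 5 s early)
```

The second case is the one to watch for: if the narration outruns the video by more than the pad, the padded video ends first and the narration is truncated, so raise stop_duration or lengthen the recording.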

Step 1: Ask how they want to record

Present the user with a choice using AskUserQuestion:

How would you like to record?

Quick recording — I'll start recording your screen with your mic on.
You narrate live, stop when done. Ready to upload immediately.

Polished recording — I'll write a sectioned script first, then record
each section as its own clip, generate TTS per section, and video-post
concats them into one final video. This is the canonical path for
feature-tour demos.

Just post-process — I already have a recording. (→ hand off to
video-post skill)

Quick recording

adom-cli hydrogen recording start --reason "Descriptive reason here"
# User narrates and demonstrates live
adom-cli hydrogen recording stop   # saves to ~/project/recordings/

Done. Upload to wiki via adom-wiki asset upload.

🎬 Polished recording — record sections INDIVIDUALLY (non-negotiable)

The rule: a polished demo is a sequence of separately-recorded
clips, one per section / feature, each with its own TTS narration.
video-post concat stitches them together at the end. Do NOT record
one long take covering every feature. Reasons:

- A single long recording turns every mistake into a re-shoot of
  everything. One-take demos always have at least one mumbled line
  or a mis-click, and then you re-record 4 minutes to fix a 5-second
  flub.
- Per-clip TTS means Andrew-neural (or whatever voice) narrates each
  section cleanly. A single long TTS over a mixed recording loses
  the intentional pauses between features.
- Hero image selection is trivial when each section is its own file:
  just pick the best frame from clip #1 or #2.
- Re-shoots are surgical: if the Measure tool scene is bad, re-record
  measure.webm, update the manifest entry, video-post concat again.
  The other 12 clips are untouched.

The flow (detailed in Steps 2-8 below):

1. Write a sectioned script (one section per feature / example).
2. Set up a manifest: video-post manifest init --output /tmp/<name>.json.
3. For each section:
   - Display a caption in the workspace (adom-cli hydrogen caption show "<section title>").
   - Start a recording (adom-cli hydrogen recording start --reason "<section name>").
   - Drive the UI silently for that section's beats.
   - Stop the recording.
   - Generate TTS: adom-tts say "<narration for this section>" --out /tmp/clips/<section>.mp3 (shares the Adom pronunciation-override table + source-hash cache; see adom-tts skill). Falls back to edge-tts --voice en-US-AndrewMultilingualNeural --text "..." --write-media ... only if adom-tts isn't installed.
   - Mux TTS onto the clip with ffmpeg (audio+video → one .webm).
   - Register: video-post manifest add --id <id> --title "<title>" --description "<full INTENT for this clip>" <clip.webm>.
4. 🛑 MANDATORY: open video-post storyboard <manifest> and tell the
   user to review each clip one-by-one. Wait for their feedback. Fix
   each bad clip, re-run storyboard. Do NOT skip to concat / upload.
5. Concat the reviewed manifest into one video (use the ffmpeg concat
   demuxer; video-post concat currently drops audio, a known bug).
6. Verify audio is present (ffprobe | grep codec_type should list
   both video and audio).
7. Extract a hero image (ffmpeg -ss <t> -i final.webm -frames:v 1 hero.png).
8. Publish: adom-wiki asset upload <page> --asset-type video --file final.webm and another upload for the hero image.

Follow Steps 2-8 below for the detail on each phase.
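
The ffmpeg concat demuxer reads a list file with one `file '<path>'` line per clip. A minimal sketch of building that list in clip order, assuming paths without single quotes; the helper name concat_list and the /tmp/demo paths are hypothetical:

```shell
# concat_list CLIP...
# Prints one "file '<path>'" line per clip, in argument order, in the format
# the ffmpeg concat demuxer expects. Assumes paths contain no single quotes.
concat_list() {
  for clip in "$@"; do
    printf "file '%s'\n" "$clip"
  done
}

concat_list /tmp/demo/narrated/intro.webm /tmp/demo/narrated/measure.webm

# Typical use (not run here):
#   concat_list clip1.webm clip2.webm > /tmp/demo/clips.txt
#   ffmpeg -f concat -safe 0 -i /tmp/demo/clips.txt -c copy final.webm
#   ffprobe -v error -show_entries stream=codec_type -of csv=p=0 final.webm
```

Stream copy (-c copy) only works when every clip shares the same codecs and encoding parameters, which holds if all clips went through the same mux command. The final ffprobe line should print both `video` and `audio`.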

Why the storyboard gate exists (read this before skipping it)

AI-made demos fail in specific, localized ways: "clip 5 measured the
wrong pads," "clip 8 never clicked anything," "clip 11 walkthrough
didn't advance past step 1." The human reviewer catches these
instantly when clips are numbered + described + individually
playable. On a 3–5 minute uninterrupted video they can't — they'd
have to scrub to the minute:second timestamp and guess which segment
you meant. Skipping storyboard forces the user to do something that's
genuinely hard, and turns the feedback loop from "clip 5 is broken"
(precise) into "something's wrong somewhere" (useless).

Every manifest add carries a --description so the user can read
your INTENT per clip before hitting play. If your intent is wrong
(e.g. you thought "measure tool = some clicks" instead of "measure
tool = commit three pad-to-pad distances showing HUD readouts"), they
can catch the intent bug without even watching the clip.

Step 2: Write the demo script — self-contained, reproducible, checked into the repo

The demo script is a first-class artifact of the repo, not scratch work. It must be checked into the project git repo at <repo>/demo/<name>-demo-script.md (not just /tmp/) and — at the end of Step 7 — published to the wiki so other Adom users can run, review, or tweak the demo without digging through chat history. Treat it the way you'd treat a Makefile: if another person (or another Claude) can't reproduce the demo from this file alone, the file is incomplete.

Repo location. <repo>/demo/<name>-demo-script.md is the canonical home. Put companion files (record.py, mux.py, generated manifest, any custom TTS overrides) in the same demo/ directory. /tmp/ paths are fine during execution but the source of truth lives in the repo so git history tracks every demo revision.

The script MUST be self-sufficient — every piece, one file

The script is a runnable spec. A fresh Claude thread (or a teammate) reading it should be able to re-render the entire demo without asking any questions. That means every setting that affects the output goes in the file, not in the assistant's head:

- Metadata header: version being demoed, target duration, recording date, hero frame timestamp, final output path
- Tooling deps: adom-tts (or the edge-tts fallback) / adom-cli / video-post / ffmpeg versions / required env vars
- Voice + rate: en-US-AndrewMultilingualNeural / +5% / any per-scene overrides
- Workspace setup: canonical split ratio (default 0.30 AI / 0.70 feature), which panels, which apps pre-opened
- Scene list: numbered, with captions, interactions, commands, panel choreography, speedup markers, and narration text (see table below)
- Manifest path: canonical /tmp/<name>-manifest.json location + <repo>/demo/manifest.json checked-in copy
- Reproduction block: the exact commands another user runs to regenerate the demo end-to-end (see template below)
- Wiki page target: where this demo publishes (apps/<page>)

Omitting any of the above means the next person who tries to regenerate the demo has to reverse-engineer your choices from the muxed output. Don't do that to them.

Template — start every demo script from this shape

# <Product Name> <version> — Feature Tour

**Target duration:** 4:30–5:00
**Recorded:** 2026-04-23
**Final output:** `~/project/recordings/<name>-<version>-tour.webm`
**Wiki target:** `apps/<page>`
**Hero frame:** `00:12` of scene 1 (BME680 first iso shot)

## Dependencies

- `edge-tts` — `pip install edge-tts` (tested with 6.1.x)
- `adom-cli` — container-provided
- `video-post` — built from `/home/adom/project/video-post/` (`cargo build --release`)
- `ffmpeg 6+` — container-provided
- Env vars: none required beyond the container defaults

## Recording config

- **Voice:** `en-US-AndrewMultilingualNeural`
- **Rate:** `+5%`
- **Workspace split:** 0.30 (VS Code / AI left, feature pane right)
- **Captions:** on, via `adom-cli hydrogen caption show`
- **Resolution:** native (1280×…)

## Scenes

| # | id | Caption (on-screen) | Narration (TTS, phonetics applied) | Target length | Commands + choreography |
|---|---|---|---|---|---|
| 1 | intro | `adom-tsci 1.3.7` | `This is adom t s c i version one point three point seven — the interactive t s circuit preview.` | 8s | `adom-tsci view isometric --port 8853` — cinematic slow orbit |
| 2 | examples | `8 example molecules` | `Adom ships eight example molecules — let's open a BME680 breakout.` | 10s | switch to example folder, `adom-tsci start <dir>` |
| … | … | … | … | … | … |

(13 scenes for the adom-tsci 1.3.7 tour; ~5:00 total)

## Reproduction

Another user regenerates this demo end-to-end with:

\`\`\`bash
cd <repo>
python3 demo/gen_tts.py           # renders TTS from this script's narration column
python3 demo/record.py            # drives scene-by-scene recording
python3 demo/mux.py               # muxes TTS onto clips, concatenates final
video-post storyboard /tmp/<name>-manifest.json   # review
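# after review:
# NOTE: video-post concat currently drops audio (known bug). Until it is fixed,
# build the final with the ffmpeg concat demuxer instead of the line below.
video-post concat /tmp/<name>-manifest.json --kind raw --output ~/project/recordings/<name>-<version>-tour.webm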
adom-wiki asset upload apps/<page> --asset-type video --file ~/project/recordings/<name>-<version>-tour.webm
\`\`\`

## Known issues from this rev

- Clip 02 originally recorded as one 2:17 mega-clip; split into 8 × 12s clips in revision 2 for surgical re-shoot support.
- TTS in rev 1 mispronounced `adom-tsci` as one word; fixed in rev 2 by spelling as `adom t s c i` in narration column.
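
A script that is missing one of the template's sections is not self-sufficient, and that is cheap to check mechanically before asking for approval. A minimal sketch, assuming the script keeps the template's `##` headings verbatim; the function name missing_sections is hypothetical:

```shell
# missing_sections SCRIPT_FILE
# Prints each required template heading that does not appear as a whole line
# in the demo script. Prints nothing when all four are present.
missing_sections() {
  for heading in "## Dependencies" "## Recording config" "## Scenes" "## Reproduction"; do
    grep -qxF -e "$heading" "$1" || printf '%s\n' "$heading"
  done
}

# Typical use (not run here):
#   missing_sections <repo>/demo/<name>-demo-script.md
```

This only checks that the headings exist, not that their content is complete; the metadata header and scene table still need a human read.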

Why the repo + wiki double-home

In the repo: git history tracks every demo revision. If someone files an issue "the demo is out of date after v1.4.0 ships" you can diff the script.
On the wiki: other Adom users find it when browsing apps/<page>. They can click "reproduce" and a Claude thread says "I see a demo-script for this app — want me to re-record against the current version?"
Both: git is the source of truth (atomic with code changes); wiki is the discoverable surface.

Copy this script file during recording, don't write straight to `/tmp/`

mkdir -p <repo>/demo
cp /tmp/<name>-demo-script.md <repo>/demo/<name>-demo-script.md    # if you drafted in /tmp
# ...or write directly to the repo path in the first place (preferred).

Before touching the record button, show the script to the user for approval. Save to <repo>/demo/<name>-demo-script.md.

🗣️ TTS pronunciation — consult the `tts-pronunciation` skill

Neural TTS engines mispronounce many Adom product names, acronyms, and hyphenated terms. The canonical table lives in the tts-pronunciation skill (data file: pronunciations.json). For every narration scene, look up each proper-noun / acronym / version-number in that table and substitute its narration spelling into the narration column. Captions and on-screen text keep the real spelling; only the narration-column text gets phoneticized.

Most common fixes (the full table is in the tts-pronunciation skill):

adom-tsci → adom t s c i
JLCPCB / FPGA / PCB / SPI → spell letters with spaces
1.3.7 → one point three point seven
KiCad → kai cad
BOM → bill of materials (TTS engines read "BOM" as the word "bomb")

If a term you need isn't in the table, add it. The contribution flow is in tts-pronunciation/SKILL.md — verify the mispronunciation with a one-shot edge-tts, find a spelling that fixes it, append to pronunciations.json, commit. Every user benefits.
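
To make the substitution step concrete, here is a rough sketch that hard-codes only the fixes listed above. It does not read pronunciations.json (its schema is not shown here), and the function name phoneticize is hypothetical:

```shell
# phoneticize TEXT
# Illustration only: applies the "most common fixes" listed above with sed.
# Limitations: no word boundaries (so "SPICE" would also be rewritten) and
# version numbers are not handled. The real table is pronunciations.json.
phoneticize() {
  printf '%s\n' "$1" | sed \
    -e 's/adom-tsci/adom t s c i/g' \
    -e 's/JLCPCB/J L C P C B/g' \
    -e 's/FPGA/F P G A/g' \
    -e 's/PCB/P C B/g' \
    -e 's/SPI/S P I/g' \
    -e 's/KiCad/kai cad/g' \
    -e 's/BOM/bill of materials/g'
}

phoneticize "Export the BOM from KiCad, then preview the PCB in adom-tsci."
# prints: Export the bill of materials from kai cad, then preview the P C B in adom t s c i.
```

JLCPCB is rewritten before PCB so the longer acronym is not split by the shorter rule. Only the narration column goes through this; captions keep the real spelling.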

Verify before rendering the full tour:

edge-tts --voice en-US-AndrewMultilingualNeural \
  --text "adom t s c i version one point three point seven" \
  --write-media /tmp/_test.mp3
# Listen with ffplay /tmp/_test.mp3 — it should spell out T S C I, not say "sci" as a word.

Fixing narrations at script time costs seconds. Fixing them after you've muxed 13 clips means re-running TTS, re-muxing, re-concat, re-storyboarding.

📏 Clip duration ≈ narration duration, ±15%

Don't let a clip's video length drift too far from its narration length. A 2:17 video with 0:30 of narration is broken either way:

Either the narration is too thin — tighten the video ("cycle through 8 example molecules" is one scene if narrated in 30s, but only if you can render all 8 in 30s).
Or the video is genuinely 2:17 — then split it into multiple clips, each with its own matched-length narration.
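
The ±15% rule is easy to check per clip once both durations are known. A minimal sketch, assuming durations in seconds and the tolerance measured against the narration length; the function name within_tolerance is hypothetical:

```shell
# within_tolerance CLIP_S NARRATION_S [PCT]
# Exit status 0 when the clip length is within PCT percent (default 15)
# of the narration length.
within_tolerance() {
  awk -v c="$1" -v n="$2" -v pct="${3:-15}" 'BEGIN {
    if (n <= 0) exit 1
    diff = c - n
    if (diff < 0) diff = -diff
    exit !(diff <= n * pct / 100)
  }'
}

within_tolerance 10 9 && echo "10 s clip / 9 s narration: ok"
within_tolerance 137 30 || echo "2:17 clip / 0:30 narration: split or tighten"
```

The durations themselves can be read from the files with ffprobe; they are passed in as plain numbers here so the check stays independent of any particular tool.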

Split-before-speedup heuristic. When a single clip concept spans distinct sub-actions (open / click / close, or cycle-through-N-things), the correct path is N clips with N narrations, not one mega-clip with a speedup pass. Reasons:

Surgical re-shoots: if one of 8 examples fails to render, re-recording that one split clip fixes it. A mega-clip has to be fully re-recorded.
Pacing control — app-side lag (e.g. loading a new example molecule) bakes into a mega-clip. Smaller clips let you cut-on-ready.
Narration clarity — "now we see the BME680... and the BHI360... and the SN65HVD230..." is unnatural. "This is the BME680." [clip break] "The BHI360." is tight.

The script MUST include:

Scene list

Numbered scenes with:

What's on screen
Commands to run
Expected duration (real time + after speedup)

🎙️ Narration text for every scene — the TTS that will be rendered

The full spoken narration for every scene goes in the demo script, visible to the user for approval BEFORE any recording or TTS rendering happens. This is the single most important line-item in the script. Everything else (captions, interactions, commands) is cheaper to change than narration once rendered — if you skip the script-time review and discover the wrong pronunciation / wrong wording / wrong length at mux time, you're re-rendering N clips to fix it.

The script table should have a dedicated Narration column next to each scene:

| Scene | Caption (seen) | Narration (heard), TTS-safe phonetics applied | Target length |
|---|---|---|---|
| 1 | `adom-tsci 1.3.7` | `This is adom t s c i version one point three point seven — the interactive t s circuit preview.` | 8s |
| 2 | `BME680 breakout` | `Let's open a BME680 breakout molecule. Four in one sensor: temperature, humidity, pressure, gas.` | 10s |
| … | … | … | … |

Rules for the narration column:

Apply the TTS pronunciation table (above) — the user approves the phonetic spelling, so what they read in the script is exactly what the TTS will say. adom-tsci never appears literally in narration column; it's always adom t s c i.
Target length per scene must be consistent with the planned video length (±15%, per the clip-duration section above). If a scene is planned for 15 s of video, narration is ~35–45 words. Count words as a sanity check; at 160 wpm spoken pace that's ~13–17 s.
User-approved narration text is the only source of truth for the TTS render later (adom-tts say "…", or edge-tts --text "…" as the fallback). No paraphrasing at record time. If you need to change narration during recording, stop and update the script (and re-get approval).
Narration ≠ caption. Caption is short, on-screen, SEO-ish ("BME680 breakout"). Narration is conversational, spoken ("Let's look at the BME680 breakout molecule…"). Both are required. Don't reuse one for the other.

Why it lives in Step 2 (approved) not Step 4 (rendered): Fixing a typo or a pronunciation issue in the approved script costs seconds. Finding it after mux costs 13 × (TTS render + ffmpeg mux + concat) minutes + another storyboard-review round. The script-approval step is where all cheap-to-fix-now mistakes get caught.
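
The word-count sanity check can be done at script time, before any TTS is rendered. A small sketch, assuming the 160 wpm pace quoted in the narration rules; the function name narration_seconds is hypothetical, and real TTS timing will vary with spelled-out letters and punctuation pauses:

```shell
# narration_seconds TEXT [WPM]
# Estimates spoken duration in seconds from a whitespace word count.
# WPM defaults to the 160 wpm pace used as the rule of thumb here.
narration_seconds() {
  printf '%s\n' "$1" | awk -v wpm="${2:-160}" '
    { words += NF }
    END { printf "%.1f\n", words * 60 / wpm }'
}

# 16 words at 160 wpm:
narration_seconds "This is adom t s c i version one point three point seven, the interactive preview."
# prints 6.0
```

If the estimate lands outside ±15% of the scene's target length, adjust the narration or the planned video length in the script before asking for approval.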

Captions for every scene

Always generate a caption for every scene. Captions are rendered live in
the workspace during recording via adom-cli hydrogen caption show. They
serve two purposes:

Make the video self-explanatory — viewers understand each scene on mute
Narrator cues — during the voiceover pass, the user sees the captions
in the video and knows exactly when and what to say for each scene

Display captions at the START of each scene:

adom-cli hydrogen caption show "Isometric view — full board layout" -d 5

Hide when transitioning: adom-cli hydrogen caption hide

Do NOT use video-post --label for captions. Those are ugly ffmpeg text
overlays. Use adom-cli hydrogen caption show for clean, rendered captions
that appear natively in the workspace.

Do NOT use Adom Viewer (AV) for captions or narration — ever. No
html_interactive caption panels, no av_display narration overlays, no
custom AV widgets that exist just to show status text. The Hydrogen caption
overlay is the only supported captioning path for demos and for any live
narration of a multi-step task ("show me what's happening as you go"). AV is
for visual content (3D models, schematics, gerbers, scope screenshots) —
never for captions about that content. This applies to every demo skill,
including tours and any ad-hoc "narrate while you work" requests.

Show the full caption table to the user as part of the script:

| Scene | Caption |
|---|---|
| 1 | `shotlog health — server running` |
| 2a | `Isometric view — full board layout` |
| 2b | `Top-down — trace routing and pad layout` |
| 3 | `Timeline — 4 screenshots with descriptions` |

Panel choreography

If the demo involves multiple panels (e.g. a 3D viewer AND a log viewer),
script explicit panel switches between each action so the viewer sees both
sides. Don't run 4 CLI commands with one panel visible — alternate:

1. Show 3D viewer → rotate to isometric
2. Switch to shotlog → screenshot appeared in timeline
3. Switch to 3D viewer → rotate to top-down
4. Switch to shotlog → 2 entries now

This makes the demo visually interesting instead of staring at one panel while
CLI commands run off-screen.

Speedup markers

Mark slow operations (exports, searches, AI latency, screenshot capture) with
their speedup factor. Do NOT use --label for captions — speedup labels are
only for internal tracking. Visual captions come from adom-cli hydrogen caption.

Speedup: 10x — screenshot capture + inject
Speedup: 20x — exporting gerbers
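
Since each scene lists both its real-time duration and its duration after speedup, the second number is the first divided by the marker's factor. A tiny sketch of that arithmetic; the function name sped_up_seconds and the example durations are hypothetical:

```shell
# sped_up_seconds REAL_S FACTOR
# Prints the on-screen duration after applying a speedup factor.
sped_up_seconds() {
  awk -v s="$1" -v x="$2" 'BEGIN {
    if (x <= 0) exit 1
    printf "%.1f\n", s / x
  }'
}

sped_up_seconds 120 20   # a 2-minute gerber export at 20x: prints 6.0
sped_up_seconds 45 10    # a 45 s screenshot capture at 10x: prints 4.5
```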

Voiceover draft

Write a short narration script the user can read during the voiceover pass.
The live captions displayed during recording serve as narrator cues — when
the user watches the sped-up video during voiceover, they see each caption
appear and know exactly what to say at that moment. Structure the voiceover
script as caption-keyed paragraphs:

[Caption: "Isometric view — full board layout"]
"Here's our CAN transceiver molecule from the isometric angle, showing the
full board with standoffs and the SN65HVD230 IC in the center."

[Caption: "Top-down — trace routing and pad layout"]
"From the top, you can see the trace routing and pad layout clearly."

Keep total narration under 60 seconds for most demos.

Target final duration

Aim for 30-60s. Nobody watches a 5-minute demo.
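
To check a script against the 30-60s target before recording, the speedup markers can be folded into a rough duration estimate. This is a minimal sketch, assuming each segment is described by its raw length in seconds and its speedup factor; `estimate_final_seconds` and the sample numbers are hypothetical and not part of video-post.

```python
def estimate_final_seconds(segments):
    """Sum raw_seconds / speedup over (raw_seconds, speedup) pairs."""
    total = 0.0
    for raw_seconds, speedup in segments:
        if speedup <= 0:
            raise ValueError("speedup must be positive")
        total += raw_seconds / speedup
    return total


# Hypothetical take: 20s of normal-speed driving, a 100s screenshot
# capture marked 10x, and a 200s gerber export marked 20x.
segments = [(20, 1), (100, 10), (200, 20)]
final = estimate_final_seconds(segments)
print(f"estimated final length: {final:.0f}s")
print("within 30-60s target:", 30 <= final <= 60)
```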

Step 3: Verify your setup (ralph loop yourself)

Before hitting record, screenshot the workspace and LOOK at it. Check:

Are the right panels visible and loaded? (not empty states, loading spinners,
dead webviews, or cloud icons)
Is the content you're about to demo actually rendering?
Is the workspace layout clean — no overlapping panels, no stale tabs?
Do all the tools/CLIs you need actually work? (test a command first)

If anything looks wrong, fix it before recording. Do NOT start recording and
hope for the best. This is the #1 cause of wasted takes.

Common setup pitfalls:

Webview tabs showing blank/cloud icon → navigate with full DNS URL, not
relative /proxy/ path
Shotlog server has stale VSCODE_IPC_HOOK_CLI → restart shotlog so it picks
up the current socket
adom-vscode not installed → gh release download from adom-inc/adom-vscode
Panel not loading → check the server is running (health command)

🎞 Keep motion on screen — no static panels

Any demo scene that lingers on a table, list, or panel needs visible
action happening inside it, not a held still. Examples:

BOM / Parts list — click through several rows, not just one. Each
click swaps the detail view on the right and makes the scene feel
alive. 3-4 part clicks over 15 seconds beats one click and dead air.
Components HUD — toggle a couple of groups or individual chips
on/off rather than staring at the list.
Schematic / PCB tabs — pan or zoom so the artwork isn't frozen.
Walkthrough / tour — let it auto-advance; don't pause on a step.

A frozen panel reads as "this tool is boring." Action — even small
clicks that add up to a few state changes — reads as "this tool is
responsive and worth learning." Every scene longer than 8 seconds
needs at least one visible interaction.
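
The 8-second rule is easy to check mechanically against the scene table. This is a sketch under assumptions: the scene dictionaries (`id`, `seconds`, `interactions`) are a hypothetical shape for the script's rows, not a format defined by this skill.

```python
def scenes_needing_motion(scenes, max_static_seconds=8):
    """Return ids of scenes longer than the limit that list no interactions."""
    return [
        scene["id"]
        for scene in scenes
        if scene["seconds"] > max_static_seconds and not scene["interactions"]
    ]


scenes = [
    {"id": "1", "seconds": 6, "interactions": []},
    {"id": "2a", "seconds": 15, "interactions": ["click R1", "click C4", "click U1"]},
    {"id": "3", "seconds": 12, "interactions": []},
]
print(scenes_needing_motion(scenes))
```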

📐 Frame the split at 30% AI / 70% feature — don't maximize the feature pane

When recording adom-* demos the VS Code pane (showing Claude Code live in
the chat) MUST stay meaningfully visible. The canonical ratio is 0.30
— 30% of the width is VS Code / Claude chugging, 70% is the feature
(adom-tsci, KiCad, Fusion, etc.). That's the Adom pitch: AI-driven EDA.
The viewer needs to see Claude actually doing the thing they're
watching in the feature pane. A full-width feature pane with no chat looks
like any other tool. A split where Claude is chugging on the left and the
feature is responding live on the right is what makes Adom Adom.

# Canonical: 30% AI / 70% feature
adom-cli hydrogen workspace resize --split-id "$SPLIT_ID" --ratio 0.30

# Wrong: feature pane maximized, AI hidden
adom-cli hydrogen workspace resize --split-id "$SPLIT_ID" --ratio 0.01
# Wrong: too much AI, feature squeezed
adom-cli hydrogen workspace resize --split-id "$SPLIT_ID" --ratio 0.50

Exception: if the demo is genuinely AI-free (a static pitch deck, a hero
shot for thumbnails, a no-interaction explainer), maximizing the feature
pane is fine. Default is 0.30 until you have a reason otherwise.

Step 4: Record section by section

For a polished / feature-tour demo, use the per-section flow. One
clip per section, TTS per section, aggregated at the end.

🛑 Screenshot-verify the target tab BEFORE `recording start` — every clip, no exceptions

The rule: between opening/activating a tab and pressing recording start, take a screenshot of the tab and actually LOOK at it. If what you see isn't what you expected to record, you're about to capture that bad state for the entire clip duration — and you won't notice until post-mux storyboard review (best case) or until the user plays the final and asks "why is every example black?" (worst case, cost = full re-shoot).

Why this breaks: adom-cli hydrogen workspace active-tab can silently miss. webview open-or-refresh returns 204 before the page has loaded its WASM / Babylon / whatever. A 3D viewer may need 5-15 seconds after --url lands before the board mesh is actually visible. Every second recorded before the scene is visually ready is one second of black frames locked into the muxed clip.

The check, run before each recording start:

# Activate, then SLEEP + screenshot + visual verify before recording
adom-cli hydrogen workspace active-tab --name "<tab>"
sleep 2
adom-cli hydrogen screenshot panel --name "<tab>" --reason "pre-record verify for <clip-id>"
# Read the saved file. If it's not showing what you intended — DO NOT RECORD YET.
# Poll-and-retry, longer sleep, re-navigate, etc.

In a driver script the pattern is:

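import time

def wait_for_ready(tab_name, check_substring=None, timeout=15):
    """Poll the tab until it looks rendered; raise after `timeout` seconds."""
    # take_panel_screenshot() and looks_ready() are helpers the driver defines.
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        img = take_panel_screenshot(tab_name)
        if looks_ready(img, check_substring):
            return True
        time.sleep(1)
    raise RuntimeError(f'tab {tab_name} never rendered within {timeout}s')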

Where looks_ready() does one or more of:

Size/histogram check — a black frame is low-variance; a board has mid-tones
OCR substring — caption or chip label visible
DOM probe via adom-tsci eval — window.viewer && viewer.getScene().meshes.length > 10

Cost of skipping this rule: in one past incident, 36 scenes of a demo were shot and every one was a blank webview because active-tab missed. Full re-shoot. The same happened on 18 of 20 scenes in the 1.3.8 tour. Don't be the next case study.

The record loop

MANIFEST=/tmp/<name>-manifest.json
CLIPS=/tmp/<name>-clips
mkdir -p "$CLIPS"
video-post manifest init --output "$MANIFEST"

for each section in script:
  # 1. Show caption (viewer-side context)
  adom-cli hydrogen caption show "<section title>" -d 0     # leave up for whole clip
  # 2. Activate the target tab + any CLI pre-positioning
  adom-cli hydrogen workspace active-tab --name "<feature tab>"
  adom-tsci view isometric --port <port>                    # example
  # 3. 🛑 SCREENSHOT-VERIFY — skip this and you're shooting blanks (see above)
  adom-cli hydrogen screenshot panel --name "<feature tab>" \
    --reason "pre-record verify <clip-id>"
  # Read the file. Confirm the board is visible. Retry if not.
  # 4. Start recording
  adom-cli hydrogen recording start --share tab --mic false --countdown 0 \
    --reason "<section title> — <what you'll see>"
  # 5. Drive UI silently — 8-25 seconds of focused action
  #    adom-tsci view top / hover / click / etc
  # 6. Stop
  adom-cli hydrogen recording stop
  # last recording is saved to ~/project/recordings/<timestamp>.webm
  CLIP=$(ls -t ~/project/recordings/*.webm | head -1)
  mv "$CLIP" "$CLIPS/<id>.webm"
  # 7. Generate TTS narration (edge-tts fallback shown; prefer adom-tts say
  #    when it is installed)
  edge-tts --voice en-US-AndrewNeural \
    --text "<narration text for this section>" \
    --write-media "$CLIPS/<id>.mp3"
  # 8. Mux audio onto the video (replace the silent track)
  ffmpeg -y -i "$CLIPS/<id>.webm" -i "$CLIPS/<id>.mp3" \
    -map 0:v -map 1:a -c:v copy -c:a libopus -shortest \
    "$CLIPS/<id>-narrated.webm"
  # 9. Register in manifest (--description is MANDATORY, see below)
  video-post manifest add \
    --id "<id>" --title "<short label>" \
    --description "<full intent: what this clip shows, which project, \
                   which features, what the viewer should notice>" \
    --raw "$CLIPS/<id>-narrated.webm" \
    "$MANIFEST"
  # 10. Hide caption before the next section
  adom-cli hydrogen caption hide
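
When the loop is driven from Python rather than shell, building argv lists avoids quoting problems with titles and narration that contain spaces or quotes. The sketch below only assembles the commands and flags already shown in the loop; the `section_commands` helper, its parameter names, and the sample values are hypothetical, and actually running the commands is left to the driver.

```python
import shlex


def section_commands(clip_id, title, reason, description, tab, narration,
                     clips_dir, manifest):
    """Build argv lists for one section, keyed by step name.

    UI driving happens between "record_start" and "record_stop"; moving the
    raw recording into clips_dir happens after "record_stop". Neither is
    modeled here.
    """
    raw = f"{clips_dir}/{clip_id}.webm"
    audio = f"{clips_dir}/{clip_id}.mp3"
    narrated = f"{clips_dir}/{clip_id}-narrated.webm"
    hydrogen = ["adom-cli", "hydrogen"]
    return {
        "caption_show": hydrogen + ["caption", "show", title, "-d", "0"],
        "activate": hydrogen + ["workspace", "active-tab", "--name", tab],
        "verify": hydrogen + ["screenshot", "panel", "--name", tab,
                              "--reason", f"pre-record verify {clip_id}"],
        "record_start": hydrogen + ["recording", "start", "--share", "tab",
                                    "--mic", "false", "--countdown", "0",
                                    "--reason", reason],
        "record_stop": hydrogen + ["recording", "stop"],
        "tts": ["edge-tts", "--voice", "en-US-AndrewNeural",
                "--text", narration, "--write-media", audio],
        "mux": ["ffmpeg", "-y", "-i", raw, "-i", audio,
                "-map", "0:v", "-map", "1:a",
                "-c:v", "copy", "-c:a", "libopus", "-shortest", narrated],
        "manifest_add": ["video-post", "manifest", "add",
                         "--id", clip_id, "--title", title,
                         "--description", description,
                         "--raw", narrated, manifest],
        "caption_hide": hydrogen + ["caption", "hide"],
    }


cmds = section_commands(
    clip_id="02a-iso",
    title="Isometric view",
    reason="Isometric view: full board layout",
    description="Slow orbit from the isometric angle; no HUDs, just the board.",
    tab="3D viewer",
    narration="Here's the board from the isometric angle.",
    clips_dir="/tmp/demo-clips",
    manifest="/tmp/demo-manifest.json",
)
# Dry run: print each step as a copy-pasteable shell line.
for step, argv in cmds.items():
    print(f"{step}: {shlex.join(argv)}")
```

Printing the commands first gives a dry run to review before anything touches the workspace.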

📝 `--description` on every `manifest add` — not optional

Every clip registered in the manifest MUST carry a --description that
says what you INTENDED the clip to show. Not just the feature name:
the project, the interactions, the expected visible payoff. The
storyboard UI (Step 5 below) renders this under each clip so the user
can tell "did Claude even understand what this clip was supposed to
do?" before watching it.

Without per-clip descriptions, the user watches a 3-5 minute video
blind and has to guess which segment is broken. With descriptions, they
look at clip N's intent, watch its 15-second snippet, and tell you
exactly "clip 5 measured the wrong pads" or "clip 8 never clicked
anything." That feedback loop is the whole reason this skill exists.

Rule of thumb: if a teammate reading just the description couldn't
write the clip's ffmpeg filter / interaction script themselves, the
description is too vague. Examples of GOOD descriptions:

✅ "BME680 cinematic slow orbit for 15s. Camera starts iso, completes
one full rotation. No HUDs, no tool activation — just the board."
✅ "iCE40-USB Measure: commit MP1→MP3 (diagonal 181 mm), clear, commit
U1→J1 (chip to USB-C connector, ~50 mm), clear, commit MC_USB_DP→DM.
Each measurement's ΔX/ΔY/distance HUD stays visible for 5s."
❌ "show measure tool"
❌ "inspect demo"

Always use a descriptive --reason on recording start. The user
sees it in the approval dialog. Bad: "Demo recording". Good: "Recording
adom-tsci Inspect tool — hover pads, show net + JLCPCB part card."

TTS voice: en-US-AndrewNeural is the default on the edge-tts path;
adom-tts say uses the house voice en-US-AndrewMultilingualNeural.
Alternatives when the demo needs variety: en-US-AriaNeural (polished
female), en-US-GuyNeural (warm neutral male). Pick ONE voice for a
whole demo, because switching voices mid-demo is jarring.

TTS future-compat: when the aci voice subcommand ships
(ACI_VOICE_API env var + aci voice say "<text>" > out.mp3), prefer
that. It's the same Edge-neural backend, but the request goes through
Adom's infrastructure. Until then, use adom-tts say, and fall back to
edge-tts on PATH only when adom-tts isn't installed.

Step 5: 🛑 MANDATORY — open video-post storyboard and let the user review every clip

This step is not optional. The whole per-section recording flow
exists so the user can audit each clip individually, because AI-driven
demos routinely screw up details a full-video watch can't pinpoint.
Skipping this step — going straight from manifest → concat → wiki
upload — destroys the user's ability to tell you "clip 5 measured the
wrong thing" or "clip 8 never clicked anything." Don't do it.

# Opens a Hydrogen webview listing every clip in the manifest with its
# title, description, and a player. The user can play each one and
# tell you which clips are bad without watching the whole video.
video-post storyboard $MANIFEST \
  --tab-name 'video-post: <name> review' \
  --port 8797

Then:

Actively point the user at the storyboard tab. Activate it,
screenshot to confirm it rendered, tell the user the tab name to look
at. "The storyboard is open in tab video-post: …, 13 clips, take a
look and tell me which ones are wrong."
Wait for user feedback. The user will tell you "clip 5 needs
more parts clicked," "clip 11 walkthrough isn't advancing," etc. —
or just "looks good."
Re-shoot surgically. video-post manifest remove --id <id>,
re-record that one clip, manifest add with updated description,
point the user back at storyboard. Repeat until they approve.
Only then proceed to Step 6 (concat + publish).

Never concatenate or upload a demo the user hasn't seen clip-by-clip.

Step 6: Concat + publish — ONLY after storyboard approval

# Option A: `video-post concat` (drops audio today — KNOWN BUG, use B)
# Option B: ffmpeg concat demuxer (preserves all streams)
LIST=$(mktemp)
# The glob expands in filename order, so clip ids must sort in script order.
for c in "$CLIPS"/*-narrated.webm; do echo "file '$c'" >> "$LIST"; done
ffmpeg -y -f concat -safe 0 -i "$LIST" -c copy \
  ~/project/recordings/<name>-final.webm
rm "$LIST"
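
If clip ids do not sort in script order, or a path might contain a single quote, the list file can be generated from an explicit ordering instead of a glob. This is a minimal sketch; `concat_list_text` is a hypothetical helper, and the escaping follows ffmpeg's quoting rule that a literal `'` inside a single-quoted string is written as `'\''`.

```python
def concat_list_text(clip_paths):
    """Render ffmpeg concat-demuxer lines, one `file '...'` per clip, in order."""
    lines = []
    for path in clip_paths:
        # Close the quote, emit an escaped quote, then reopen the quote.
        escaped = str(path).replace("'", "'\\''")
        lines.append(f"file '{escaped}'")
    return "\n".join(lines) + "\n"


ordered = [
    "/tmp/demo-clips/intro-narrated.webm",
    "/tmp/demo-clips/bom-narrated.webm",
]
print(concat_list_text(ordered), end="")
```

Write the returned text to the list file and pass it to the same `ffmpeg -f concat -safe 0` command.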

Verify audio is present (voiceover is the whole point) — an all-video,
no-audio output is a bug to catch here, not after upload:

ffprobe -v error -show_streams <name>-final.webm | grep codec_type
# Must show both `video` and `audio`.
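
In a driver script the same check can be made on ffprobe's JSON output rather than by eye. This is a sketch that assumes the driver has captured the output of `ffprobe -v error -show_streams -of json <file>` as a string; the helper names and sample JSON are illustrative.

```python
import json


def stream_types(ffprobe_json):
    """Return the set of codec_type values found in ffprobe's JSON output."""
    data = json.loads(ffprobe_json)
    return {stream.get("codec_type") for stream in data.get("streams", [])}


def has_audio_and_video(ffprobe_json):
    """True only when both a video and an audio stream are present."""
    return {"audio", "video"} <= stream_types(ffprobe_json)


narrated = '{"streams": [{"codec_type": "video"}, {"codec_type": "audio"}]}'
silent = '{"streams": [{"codec_type": "video"}]}'

print(has_audio_and_video(narrated))
print(has_audio_and_video(silent))
```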

Then publish (Step 7). If any section is bad, video-post manifest remove --id <id>, re-record that ONE clip, re-register with updated
description, reopen storyboard for another review pass, then re-concat.
The surgical re-shoot is the whole point.

Step 6b: Hero image + supplementary screenshots

Pick one frame from the final video as the wiki hero. Usually the
frame where the marquee feature is clearest (e.g. the Inspect card
hovering a chip). Then pull 2-3 more for the README / wiki body.

ffmpeg -y -ss <t> -i <name>-final.webm -frames:v 1 hero.png
shotlog resize hero.png     # keeps wiki pages lean

Step 7: Publish to wiki — video, hero, screenshots, AND the demo script itself

# Final video
adom-wiki asset upload apps/<page> --asset-type video \
  --file <name>-final.webm \
  --caption "<one-line summary — what the viewer learns>"

# Hero image
adom-wiki asset upload apps/<page> --asset-type hero \
  --file hero.png \
  --caption "<the marquee feature in one phrase>"

# Supplementary section screenshots
for s in section1 section2 ...; do   # hero.png is already uploaded above
  adom-wiki asset upload apps/<page> --asset-type screenshot \
    --file $s.png --caption "<what this shot shows>"
done

# ⭐ Demo script itself — upload as a "reproducible recipe" asset
# so another Adom user (or another Claude thread) can find + replay
# it without spelunking the source repo.
adom-wiki asset upload apps/<page> --asset-type demo-script \
  --file <repo>/demo/<name>-demo-script.md \
  --caption "Reproducible script for this demo — scenes, narration, TTS config, reproduction commands"

Commit the script to the repo in the same PR as the app changes

The demo script is code for the purposes of review and diff. When you release a new version of the app, the demo script gets bumped too (new features → new scenes → new narration). Rule of thumb: the version at the top of the demo script and the version of the app shipping in the PR must match. A PR that bumps the app without bumping the demo script is incomplete.

git add <repo>/demo/<name>-demo-script.md <repo>/demo/record.py <repo>/demo/manifest.json
git commit -m "demo-script: update for <version> — <what changed>"

Why the script is a first-class asset

Discoverability. Another Adom user browsing the wiki app page sees the video and the script. They click the script and learn: "oh, here's exactly how this was demonstrated, and I can re-render it for my fork."
AI-reproducibility. A fresh Claude thread pointed at the wiki page can pull the script and re-record the demo against a new version of the app without hand-holding. That's the payoff of putting the narration + TTS config + reproduction block in the file.
Review + tweak. Someone spots a typo in scene 7's narration? They edit the script file, open a PR, CI re-renders the affected clip, hero regenerates. No mystery about what changed.

The hero image is what shows up on the wiki landing page, in
link-preview cards, and as the social-share image — pick it
deliberately. A board on a white background beats a dark canvas full
of toolbar buttons. A card / panel that's clearly adom-tsci beats a
generic 3D render.

Step 8: 🧹 MANDATORY — restore the user's workspace before you stop

Do not leave the user's Hydrogen workspace cluttered with demo
artifacts. The recording flow adds tabs, swings split ratios to 30/70,
activates panels, and spawns background servers (storyboard, dev-harness,
animation-server, whatever). None of that should persist past "demo
published." Leaving the workspace in its mid-recording state is the
same bug as AI panels parking themselves over VS Code — the user has
to manually drag everything back every single time.

What to restore

Before recording start, capture three things:

Split ratios. adom-cli hydrogen workspace get → walk the tree
→ note every split's ratio. Save them to a temp file (you'll need
them for the restore).
Pre-existing tabs. adom-cli hydrogen workspace tabs → record
{ tabId, name } for every tab that existed before you started. Any
tab that DIDN'T exist before is yours to remove at the end.
The focused / active tab per pane. So you don't leave the user
staring at your storyboard tab after the demo ends.

After adom-wiki asset upload finishes, tear down in reverse:

# 1. Kill background servers you spawned.
pkill -f "video-post storyboard"
pkill -f "target/debug/examples/standalone"   # aci dev-harness
# (add any app-specific dev servers you started, e.g. tsci dev on 8850)

# 2. Remove every tab you added during the demo. Compare the current
#    tab list against the pre-demo snapshot; the delta is yours.
for name in "<tab you added>" "<video-post: ... review>"; do
  adom-cli hydrogen workspace remove-tab --name "$name"
done

# 3. Restore every split's ratio from the pre-demo snapshot.
adom-cli hydrogen workspace resize --split-id "<id>" --ratio <original>

# 4. Re-activate the tab the user was looking at before you started.
adom-cli hydrogen workspace active-tab --name "<pre-demo active tab>"

# 5. Clean up any scratch files you created in the project tree
#    (temp `.kicad_pcb` copies, /tmp/aci-demo-clips/ if you want —
#    though leaving clips in /tmp is generally fine since /tmp is
#    scoped; the hard rule is the user's visible project tree).
rm -f <any temp files you copied into project dirs>
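
Working out which tabs are yours to remove is a set difference between the two snapshots. This is a minimal sketch, assuming each snapshot is a list of `{ tabId, name }` records as described above; how those records are parsed from the adom-cli output is not shown, and `tabs_added_during_demo` is a hypothetical helper.

```python
def tabs_added_during_demo(before, after):
    """Return tabs present in `after` whose tabId was not in `before`."""
    known_ids = {tab["tabId"] for tab in before}
    return [tab for tab in after if tab["tabId"] not in known_ids]


before = [
    {"tabId": "t1", "name": "VS Code"},
    {"tabId": "t2", "name": "3D viewer"},
]
after = before + [
    {"tabId": "t7", "name": "video-post: demo review"},
]

for tab in tabs_added_during_demo(before, after):
    print("remove:", tab["name"])
```

Comparing by `tabId` rather than name keeps a pre-existing tab safe even if the demo opened another tab with the same name.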

Verification

End the demo with a workspace get and a screenshot to confirm the tree
matches the pre-demo snapshot (same split ratios, same tab count, same
focused tab). If it doesn't match, keep cleaning until it does.

Why this matters

A user asking for a "demo" is asking for a deliverable (video + wiki
page). They are NOT asking to have their workspace rearranged for half
an hour. The mid-recording 30/70 split + temporary tabs are part of the
recording apparatus, not part of the deliverable. Treat them like
scaffolding — build with them, demolish them when the video is up.

If a future demo skill (like the adom-tsci skill, or an adom-chipfit
demo) inherits this doctrine, it should do the same: snapshot before
record, restore after publish. Record a follow-up feedback memory
entry if you discover an artifact type this skill didn't cover
(e.g. "adom-tsci demo left tsci dev running on :8850" → add a pkill
line to the tool-specific demo flow).

Legacy one-take + speedup path (not preferred)

The older flow recorded a single long take then sped up slow parts
with video-post process --markers. It still works and is the right
call for live narration demos where timing is inseparable from
speech — but for scripted feature tours, the per-section path
above beats it on every axis: easier re-shoots, cleaner TTS, better
hero selection, no "narrate fast here, slow there" timing puzzle.

If you do need the legacy path:

video-post inspect                      # show speedup summary
video-post process --input <file>.webm --markers /tmp/video-post-markers.jsonl
video-post voiceover --input <file>-fast.webm   # live-narration UI
adom-wiki asset upload apps/<page> --asset-type video --file <final>.webm \
  --caption "..."
