processes/ -> playbooks/ (git mv preserves history; updated cross-refs in ROADMAP, README, subtitles playbook + scripts).

playbooks/import-media/README.md v1.0: 7-step import workflow (stage on onyx -> rsync to nullstone -> chmod -> verify scan -> Items/Counts bump -> optional subtitle pass -> run-log). Cross-references docs/05/07/08, ADMIN-GUIDE, README. Mirrors the existing subtitles playbook structure (CHANGELOG + runs/_template). CHANGELOG v1.0 lists known gaps (bin/cleanup-import.sh and bin/normalize.py still doc-only, ROADMAP M6).

First run logged: playbooks/import-media/runs/lilo-stitch-2002.md. Lilo & Stitch (2002) imported to /home/user/media/movies/, item c2f4aff133c1b9631500fadf293b0b2f, TMDb 11544, MovieCount 3 -> 4. LibraryMonitor didn't auto-fire, so a manual /Library/Refresh was needed; playbook updated to make this an unconditional step. Source: 1080p BluRay HEVC 10-bit / EAC3 5.1 / 2x PGS embedded subs. Passes the quality bar (README.md:41).
Subtitle run — Sassy the Sasquatch (2022)
⚠ STOP-GAP: needs v4 WhisperX cross-ref. Owner accepted current subs as "85 %, acceptable" but tracked for full rebuild when v4 lands (ROADMAP H5). See STOPGAP-SUBS.md.
- Recipe version: v3.5, YouTube auto-CC via yt-dlp + cleaner (v4 WhisperX planned, see ROADMAP)
- Run date: 2026-05-10
- Operator: Claude Code @ onyx session, ai-lab cwd
Source
| Field | Value |
|---|---|
| Episodes | 5 (S01 only) |
| Container | mkv |
| Video | AV1 Main, 1920×1080, 29.97 fps |
| Audio | eng Opus stereo (default) |
| Embedded subs | none (only font / cover-art attachments) |
| Existing sidecars | none |
| Runtime | ~11:20 per episode |
| Distribution | YouTube (THE BIG LEZ SHOW OFFICIAL channel, creator: Jarrad Wright) |
Niche-show indie animation. Same channel hosts Donny & Clarence Show, Mike Nolan Show, Big Lez Saga — all four shows in our library are Jarrad Wright productions distributed YouTube-first.
Series + library context
- Series Id: `b2d1afd8a4a30c59adb42ccaf47376c2`
- Library: `767bffe4f11c93ef34b805451a696a4e` (TV Shows, `/media/tv`)
- IMDB series: `tt21209936`
- TVDB series: `421839`
- Per-episode IMDB ids: only S01E01 (`tt21215354`); rest blank in TVDB
Coverage probe — paid + free providers
Three parallel research agents (2026-05-10) checked every realistic source before falling back to YouTube:
| Provider | Hits |
|---|---|
| OpenSubtitles.com REST (`parent_imdb_id=21209936`) | 1: SASSY THE SASQUATCH.Web-DL.1080p.en S01E01, HI-flagged |
| OpenSubtitles.org legacy XML-RPC | 0 (account login 401 anyway) |
| Addic7ed | 0 |
| SubDL | 0 (subtitles_count: 0) |
| SubSource (Subscene successor) | 0 |
| Podnapisi | 0 |
| OS VIP upgrade | would not unlock anything — VIP is download-cap relief, not coverage. Same catalog as free. |
Conclusion: nothing exists outside YouTube. Buying VIP would not help; the honest path is auto-generated subs.
Outcome
| Season | Eps | Subs fetched | Quality | Notes |
|---|---|---|---|---|
| S01 | 5 | 5 / 5 | YT auto-CC stop-gap (lowercase, no punctuation, names mangled) | Cleaned via lib/yt-clean.py. v4 WhisperX rebuild planned |
Net: 5 / 5 (100 %) — but at the lowest tier of the USER-G quality bar.
Pipeline used
- `yt-dlp --skip-download --write-auto-subs --sub-langs en-orig` against the official Sassy playlist (`PLGMC7oz7XpmDMGrALMQiNXCi9p7aqkWbj`) → raw VTT per episode in `/tmp/sassy-research/`.
- `lib/yt-clean.py` collapses the rolling-window VTT (each cue carries 2-3 stale lines plus the freshly-spoken bottom line) into deduplicated SRT.
- SSH cat redirect each cleaned `.srt` to nullstone at `/home/user/media/tv/Sassy the Sasquatch (2022)/Season 01/<base>.eng.srt` with library filename.
- Validation-only library refresh; verified all 5 eps show exactly 1 external eng sub stream.
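A minimal sketch of the rolling-window collapse described in the cleaner step, not the actual contents of `lib/yt-clean.py`. It assumes YouTube's auto-caption VTT layout (`hh:mm:ss.mmm` timestamps, inline `<c>` timing tags, cues separated by empty lines); the function names `parse_vtt`, `collapse`, and `to_srt` are hypothetical.

```python
import re

TAG_RE = re.compile(r"<[^>]+>")  # inline word-timing and <c> tags
TIME_RE = re.compile(
    r"(\d{2}:\d{2}:\d{2})\.(\d{3})\s+-->\s+(\d{2}:\d{2}:\d{2})\.(\d{3})"
)


def parse_vtt(text):
    """Yield (start, end, lines) per cue, with timestamps in SRT form."""
    text = text.replace("\r\n", "\n").strip()
    # Split on truly empty lines only: YT cues often contain a line
    # holding a single space, which must stay inside its cue.
    for block in re.split(r"\n{2,}", text):
        rows = block.split("\n")
        for i, row in enumerate(rows):
            m = TIME_RE.match(row)
            if m:
                start = f"{m.group(1)},{m.group(2)}"
                end = f"{m.group(3)},{m.group(4)}"
                lines = [TAG_RE.sub("", r).strip() for r in rows[i + 1:]]
                yield start, end, [ln for ln in lines if ln]
                break  # header blocks without a timestamp are skipped


def collapse(cues):
    """Keep only the bottom (freshly spoken) line of each cue.

    A cue whose bottom line repeats the previous one just extends that
    cue's end time. Caveat: a line genuinely spoken twice in a row is
    merged as well.
    """
    out = []
    for start, end, lines in cues:
        if not lines:
            continue
        fresh = lines[-1]
        if out and out[-1][2] == fresh:
            out[-1] = (out[-1][0], end, fresh)
        else:
            out.append((start, end, fresh))
    return out


def to_srt(cues):
    """Render collapsed cues as numbered SRT blocks."""
    parts = [f"{n}\n{s} --> {e}\n{t}\n" for n, (s, e, t) in enumerate(cues, 1)]
    return "\n".join(parts)
```

Usage would be along the lines of `to_srt(collapse(parse_vtt(raw_vtt_text)))`, written to the `.srt` path for that episode.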
Reusable pipeline now lives at lib/sub-yt-fetch.sh (wrapper) +
lib/yt-clean.py (cleaner). Same one-liner handles Donny & Clarence,
Mike Nolan, Big Lez Saga (all on the same channel).
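For illustration only, the fetch step and sidecar naming could be sketched as below. This is not the contents of `lib/sub-yt-fetch.sh`: the helper names `build_fetch_cmd` and `sidecar_path` are hypothetical, the use of yt-dlp's `-P` output-directory option is an assumption about how the raw VTT lands in the scratch directory, and the episode filename in the usage note is made up.

```python
from pathlib import PurePosixPath


def build_fetch_cmd(playlist_id, out_dir):
    """Build the yt-dlp argv for fetching auto-captions only (no video)."""
    return [
        "yt-dlp",
        "--skip-download",      # subtitles only
        "--write-auto-subs",    # YouTube auto-generated captions
        "--sub-langs", "en-orig",
        "-P", out_dir,          # assumption: write into the scratch directory
        f"https://www.youtube.com/playlist?list={playlist_id}",
    ]


def sidecar_path(video_path):
    """Return the <base>.eng.srt sidecar path next to a library video file."""
    video = PurePosixPath(video_path)
    return video.with_name(video.stem + ".eng.srt")
```

For example, `sidecar_path("/media/tv/Show/Season 01/Show - S01E01.mkv")` gives `/media/tv/Show/Season 01/Show - S01E01.eng.srt`, and the list from `build_fetch_cmd` can be passed to `subprocess.run`.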
Quality known issues
- Lowercase, no punctuation — YT ASR output verbatim
- Proper-noun mishears: "Sassy" → `sasha`, "Big Lez" → `Big Less`
- Profanity censored as `[ __ ]` (passthrough from YT)
- Sentence segmentation absent; cues split on word boundaries
These violate the STYLE.md "best quality" and "clean" rules. Documented as an explicit stop-gap; the v4 WhisperX rebuild is intended to restore the quality bar.
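The symptoms listed above are mechanical enough to flag automatically. The following is a rough sketch, assuming the cue texts have already been extracted from the SRT; `auto_cc_symptoms` is a hypothetical helper, not part of the existing `lib/` scripts, and it does not try to detect proper-noun mishears.

```python
import re

CENSOR = "[ __ ]"                    # YouTube's profanity placeholder
PUNCT_RE = re.compile(r"[.!?,]")     # basic sentence punctuation


def auto_cc_symptoms(cue_texts):
    """Report which auto-caption symptoms appear in a list of cue texts."""
    joined = " ".join(cue_texts)
    letters = [c for c in joined if c.isalpha()]
    return {
        # True when there are letters and none of them is uppercase
        "all_lowercase": bool(letters) and not any(c.isupper() for c in letters),
        # True when no sentence punctuation appears anywhere
        "no_punctuation": PUNCT_RE.search(joined) is None,
        # Number of censored-profanity placeholders
        "censored_profanity": joined.count(CENSOR),
    }
```

A file where all three indicators fire is a reasonable candidate for the planned rebuild batch; a hand-typed track should normally clear all of them.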
Mike Nolan special-case (deferred)
A YouTube upload titled "MIKE NOLAN SHOW | COMPLETE SEASON | SUBTITLES" posted Oct 2025 carries hand-typed CC tracks. When subbing Mike Nolan, prefer that single video (rip CC tracks) over the per-episode auto-CC playlist path. Note added to v4 roadmap.
Followups
- visually verify one Sassy episode plays in sync (recipe §6) — YT auto-cap timing is usually tight but worth a sanity check
- when v4 WhisperX lands, regenerate Sassy + Donny & Clarence + Big Lez Saga + Mike Nolan in one batch on the 4080 friend node
- for Mike Nolan, try the "COMPLETE SEASON | SUBTITLES" YT upload before falling back to Whisper