Adds lib/sub-yt-fetch.sh (a yt-dlp wrapper) and lib/yt-clean.py (collapses YouTube's rolling-window auto-caption VTT into a flat SRT), for shows distributed YouTube-first that have no community subs anywhere. That gap was verified via three parallel research agents covering OpenSubtitles REST, OS legacy, Addic7ed, SubDL, SubSource, and Podnapisi for the 5 niche shows in the library, plus a price-vs-coverage analysis of OpenSubtitles VIP.

Findings: OS VIP would not have helped on the niche shows (it is download-cap relief, not a coverage unlock; same catalog as free). All 4 Jarrad Wright shows in the library (Sassy, Big Lez Saga, Donny & Clarence, Mike Nolan) live on the same channel and have only YouTube auto-CC available. v3.5 ships those, explicitly violating the STYLE.md 'best quality' rule as a tracked stop-gap. Sassy the Sasquatch S01: 5/5 episodes subbed with cleaned auto-CC.

Mike Nolan special case: a 'COMPLETE SEASON | SUBTITLES' YT upload from Oct 2025 carries hand-typed CCs and should be preferred over per-episode auto-CC when subbing that show.

ROADMAP H5 added: v4 WhisperX large-v3 on the friend RTX 4080 node will regenerate the v3.5 stop-gap with proper-noun-prompted transcription (~4-6% WER vs ~12% for YT auto-CC) and restore the STYLE.md quality bar. H1 OpenSubtitles credentials marked done (completed 2026-05-09).
24 lines
1.2 KiB
Markdown
# processes/ — repeatable acquisition workflows

This folder holds the canonical recipes for **acquiring external content** for
the ARRFLIX library: subtitles, artwork, metadata, episode stills, etc.
Internal ops (encoding, importing, theming) stay in `bin/` and `docs/`.

Each process is its own sub-folder with three files:

| File | Purpose |
|---|---|
| `README.md` | The canonical recipe. Step-by-step, executable by Claude Code. Always reflects the latest version. |
| `CHANGELOG.md` | Why the recipe changed, version-by-version. One entry per breakage that forced a revision. |
| `runs/<show>.md` | Evidence log: what happened when this recipe was applied to a specific show. |

Recipes evolve via the **iteration model**: apply to a show, succeed or break,
amend the recipe to handle the new case + every prior case, retry. A recipe
that "just works" is one that has survived every show in the library without
amendment for a full sweep.
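
As a rough illustration of the iteration model and the three-file layout, here is a minimal Python sketch. Everything in it is hypothetical: the `Process` class, `apply_recipe`, `full_sweep`, and the show and quirk names are made up for the example, and the three files are modelled in memory rather than read from disk. It does not reflect any script in this repo.

```python
from dataclasses import dataclass, field


@dataclass
class Process:
    """Hypothetical in-memory stand-in for one process folder."""
    handled_cases: set = field(default_factory=set)  # what README.md covers
    changelog: list = field(default_factory=list)    # CHANGELOG.md entries
    runs: dict = field(default_factory=dict)         # runs/<show>.md evidence


def apply_recipe(process, show, quirk):
    """Apply the recipe to one show; amend and retry if it breaks.

    `quirk` is the (assumed) special case this show exposes, or None.
    Returns True if the recipe had to be amended.
    """
    amended = False
    if quirk is not None and quirk not in process.handled_cases:
        # Breakage: extend the recipe. Earlier cases stay in the set, so the
        # amended recipe still handles every prior show.
        process.handled_cases.add(quirk)
        version = len(process.changelog) + 2  # v1 is the unamended recipe
        process.changelog.append(f"v{version}: handle {quirk} ({show})")
        amended = True
    process.runs[show] = "amended, then ok" if amended else "ok"
    return amended


def full_sweep(process, library):
    """Run the recipe over the whole library; return the amendment count."""
    return sum(apply_recipe(process, show, quirk)
               for show, quirk in library.items())


# Made-up library: two shows share the same quirk, one has none.
library = {
    "show-a": None,
    "show-b": "no-community-subs",
    "show-c": "no-community-subs",
}
process = Process()
first = full_sweep(process, library)   # amendments forced by new cases
second = full_sweep(process, library)  # a clean sweep means zero amendments
print(first, second)
```

In this model, a sweep that returns zero amendments corresponds to the "just works" condition: the recipe has survived every show in the library unchanged.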
## Children

| Process | Status | Last touched |
|---|---|---|
| [`subtitles/`](subtitles/) | v3.5 — YouTube auto-CC added as stop-gap for shows with no community subs anywhere (verified via 3-agent research run). AD 49/58 + Sassy 5/5. v4 WhisperX planned (ROADMAP H5) | 2026-05-10 |