Commit graph

5 commits

Author SHA1 Message Date
s8n
fba9a5bfeb processes/subtitles: STOPGAP-SUBS.md tracker for v3.5 → v4 cross-ref
Owner accepted Sassy the Sasquatch S01 v3.5 YouTube-auto-CC subs as
'85 % acceptable, fine, not great' but flagged them for v4 WhisperX
rebuild. Adds a single worklist file (STOPGAP-SUBS.md) so every show
that ships via the v3.5 path gets logged for the eventual v4 sweep
instead of being silently forgotten.

Sassy run log gets a STOP-GAP banner at the top pointing to the new
worklist. README.md gets a stop-gap-exception note alongside the
STYLE.md hard-prereq paragraph. ROADMAP H5 now points at the worklist
file as the canonical source of which shows v4 needs to regenerate.
2026-05-10 01:18:27 +01:00
s8n
eb71cf6beb processes/subtitles: v3.5 YouTube auto-CC stop-gap + Sassy 5/5
Adds lib/sub-yt-fetch.sh (yt-dlp wrapper) and lib/yt-clean.py (collapses
YouTube's rolling-window auto-caption VTT into a flat SRT). For shows
distributed YouTube-first that have no community subs anywhere -- verified
via three parallel research agents covering OpenSubtitles REST, OS legacy,
Addic7ed, SubDL, SubSource, and Podnapisi for the 5 niche shows in the
library, plus a price-vs-coverage analysis of OpenSubtitles VIP.

Findings: OS VIP would not have helped on the niche shows (it is
download-cap relief, not coverage unlock; same catalog as free). All 4
Jarrad Wright shows in the library (Sassy, Big Lez Saga, Donny &
Clarence, Mike Nolan) live on the same channel and have only YouTube
auto-CC available. v3.5 ships those, explicitly violating STYLE.md
'best quality' as a tracked stop-gap.

Sassy the Sasquatch S01 5/5 episodes subbed with cleaned auto-CC. Mike
Nolan special-case noted: a 'COMPLETE SEASON | SUBTITLES' YT upload from
Oct 2025 carries hand-typed CCs and should be preferred over per-episode
auto-CC when subbing that show.

ROADMAP H5 added: v4 WhisperX large-v3 on the friend RTX 4080 node will
regenerate the v3.5 stop-gap with proper-noun-prompted transcription
(~4-6%% WER vs ~12%% YT auto-CC) and restore the STYLE.md quality bar.
H1 OpenSubtitles credentials marked done (was completed 2026-05-09).
2026-05-10 01:05:07 +01:00
s8n
43f55643be processes/subtitles: v3 Addic7ed fetcher + AD 49/58 subbed
Adds lib/sub-a7d-fetch.py: free, no-daily-cap path via subliminal's
addic7ed provider (anonymous). Uses OpenSubtitles REST search-only (no
quota cost) to translate library S/E to the show's primary catalogue
numbering, then drives subliminal to download from Addic7ed and writes
sidecars direct to nullstone via SSH.

Picker quirks: subliminal series-name matcher is broken by '!' in the
title, so the script strips it before building the synthetic
Video.fromname() string. OS feature_details S/E happens to align with
Addic7ed's indexing for the test show (American Dad).

Recipe README now reflects three paths in cheapest-first order: v3
Addic7ed, v2 OS REST (20/day), v1 plugin. American Dad run log updated
to 49/58 (S01 7/7 v1, S02 16/16 mixed v2/v3, S03 16/19 v3, S04 10/16
v3). 9 misses identified, deferred to next OS REST quota window.
2026-05-09 23:31:10 +01:00
s8n
23520df2df processes/subtitles: v2 REST fetcher + AD S02E01-E12 subbed
Adds lib/sub-rest-fetch.py: direct OpenSubtitles REST, looks up subs by
per-episode IMDB id (e.g. tt0511631) instead of the plugin's
(parent_imdb_id, season, episode) combo path. This sidesteps shows where
library numbering diverges from OpenSubtitles' catalogued numbering --
American Dad uses Hulu S1=7 eps; OS uses Fox S1=23 eps; the plugin path
returns 0 hits past S01E07 even though every per-episode IMDB id is
correct.

Recipe README updated to surface the two paths (v1 plugin / v2 REST) and
recommend v2 by default. American Dad run log now shows 19/58 episodes
subbed (S01 7/7 via v1, S02E01-E12 via v2). S02E13-S04 (39 eps) deferred
to next 20/day quota windows.

Quirk fixed in v2: OpenSubtitles /download endpoint consistently returns
HTTP 503 to Python urllib.request despite identical headers/body via curl.
_curl() shim routes all OS API calls through curl. Each 503 still
consumes a download slot, so urllib path was unsafe to retry on.
2026-05-09 23:09:09 +01:00
s8n
fedf3388b8 processes: subtitle acquisition v1 + AD S01 run
Adds processes/ umbrella for repeatable acquisition workflows. First child
is subtitles/, with recipe README (executable by Claude Code), CHANGELOG,
per-show run logs, and a tested helper at lib/sub-fetch.sh.

Run on American Dad: S01 (7 eps) passed, S02-S04 (51 eps) broke. Library
uses Hulu/DSP season ordering; OpenSubtitles indexes by Fox airing order;
plugin queries by (parent_imdb_id, season, episode) so library S02E01
returns 0 hits. v2 design = direct OpenSubtitles REST with per-episode
imdb_id lookup; pending API-key registration.
2026-05-09 22:56:31 +01:00