diff --git a/processes/README.md b/processes/README.md new file mode 100644 index 0000000..09c09ef --- /dev/null +++ b/processes/README.md @@ -0,0 +1,24 @@ +# processes/ — repeatable acquisition workflows + +This folder holds the canonical recipes for **acquiring external content** for +the ARRFLIX library: subtitles, artwork, metadata, episode stills, etc. +Internal ops (encoding, importing, theming) stay in `bin/` and `docs/`. + +Each process is its own sub-folder with three files: + +| File | Purpose | +|---|---| +| `README.md` | The canonical recipe. Step-by-step, executable by Claude Code. Always reflects the latest version. | +| `CHANGELOG.md` | Why the recipe changed, version-by-version. One entry per breakage that forced a revision. | +| `runs/<show>.md` | Evidence log: what happened when this recipe was applied to a specific show. | + +Recipes evolve via the **iteration model**: apply to a show, succeed or break, +amend the recipe to handle the new case + every prior case, retry. A recipe +that "just works" is one that has survived every show in the library without +amendment for a full sweep. + +## Children + +| Process | Status | Last touched | +|---|---|---| +| [`subtitles/`](subtitles/) | v1 — partial pass on American Dad (S01 only); broke on S02 | 2026-05-09 | diff --git a/processes/subtitles/CHANGELOG.md b/processes/subtitles/CHANGELOG.md new file mode 100644 index 0000000..22332a2 --- /dev/null +++ b/processes/subtitles/CHANGELOG.md @@ -0,0 +1,48 @@ +# Subtitle process — changelog + +## v1 — 2026-05-09 + +Initial recipe. Drafted while running on American Dad. Distilled from doc +03-subtitles.md (Futurama work) and the actual AD run. + +Approach: Jellyfin RemoteSearch/Subtitles/eng → pick best non-HI/non-MT match +via Python filter → POST download → docker cp metadata cache → media folder → +delete cache dupes → validation refresh. + +Scope: works on shows whose library season/episode numbering matches +OpenSubtitles' indexed numbering. 
Verified passing on AD S01 (7/7 episodes). + +### Known break — added 2026-05-09 same day + +After S01 passed, S02 returned 0 results for every episode probed (E01, E02, +E08, E13). Quota was fine (13 downloads remaining). Cause: + +> Jellyfin metadata for American Dad uses **Hulu/DSP season ordering** +> (S1=7, S2=16, S3=19, S4=16). OpenSubtitles indexes by **Fox original-airing +> order** where S1 has 23 episodes. The plugin queries OS by +> `(parent_imdb_id, season_number, episode_number)`. For library S02E01 +> "Bullocks to Stan" the plugin sends `S=2,E=1` but OS catalogues that +> episode as `S=1,E=8`. Result: 0 hits. + +Each library episode has its own correct per-episode IMDB id (e.g. +`tt0511631` for "Bullocks to Stan") which would resolve directly via OS REST +`imdb_id=` parameter, but the plugin doesn't expose that path. + +### v2 — pending design + +Two paths under consideration: + +- **A. Direct OpenSubtitles REST** — bypass plugin for fetch, use per-episode + IMDB id lookup. Requires registering a free API key at + `opensubtitles.com/consumers`. Process becomes a Python script (or extends + the existing helper) that logs in with `Caveman5` creds and uses the API + key for searches. Survives any season-numbering mismatch. + +- **B. Library re-numbering** — re-scan AD with metadata indexer using Fox + airing order so library aligns with OpenSubtitles. Risk: re-orders existing + files and breaks user's mental model of the library. Doesn't help if the + next show has its own numbering quirk. + +Recommendation: **A**. It's the more general fix; the next show with weird +numbering won't break it. It also unblocks higher-quality manual pick (filter +by `feature_id`, `imdb_id`, hash) which the plugin filters out today. 
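As a rough illustration of path A, the per-episode lookup and the v1 picker could be restated against a REST-style response. This is a minimal sketch under stated assumptions, not the v2 design: the query parameter names (`imdb_id`, `languages`), the language code `en`, and the attribute names (`hearing_impaired`, `machine_translated`, `ai_translated`, `foreign_parts_only`, `fps`, `download_count`, `files[].file_id`) are assumptions about the OpenSubtitles REST response and must be checked against the official API docs. `build_search_params` and `pick_best` are hypothetical helper names, and no network call is made.

```python
# Sketch only: field and parameter names below are ASSUMED, not confirmed by this repo.
FLAG_KEYS = ("hearing_impaired", "machine_translated",
             "ai_translated", "foreign_parts_only")


def build_search_params(episode_imdb_id, language="en"):
    """Build query params for a per-episode search (names are assumptions)."""
    raw = episode_imdb_id.lower()
    if raw.startswith("tt"):
        raw = raw[2:]
    # Assumed: the API wants the numeric id, e.g. "tt0511631" -> 511631.
    return {"imdb_id": int(raw), "languages": language}


def pick_best(data):
    """Same policy as the v1 picker: drop HI/MT/AI/forced, prefer 23.976 fps,
    then take the highest download count. Returns a file id or None."""
    def attrs(item):
        return item.get("attributes", {})

    clean = [d for d in data if not any(attrs(d).get(k) for k in FLAG_KEYS)]
    if not clean:
        clean = list(data)  # fall back to everything, as v1 does
    fps_match = [d for d in clean
                 if abs((attrs(d).get("fps") or 0) - 23.976) < 0.01]
    pool = sorted(fps_match or clean,
                  key=lambda d: -(attrs(d).get("download_count") or 0))
    if not pool:
        return None
    files = attrs(pool[0]).get("files") or []
    return files[0]["file_id"] if files else None
```

Keeping the picker a pure function means it can be exercised with canned responses before any quota is spent.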
diff --git a/processes/subtitles/README.md b/processes/subtitles/README.md new file mode 100644 index 0000000..aa0b055 --- /dev/null +++ b/processes/subtitles/README.md @@ -0,0 +1,193 @@ +# Subtitle acquisition process — v1 + +Last updated: 2026-05-09 +Status: **v1, partial** — passed American Dad S01 (7/7 eps), broke on S02E01 due to season-numbering mismatch. v2 design pending. + +This recipe is written for Claude Code to execute. Each step lists the exact +command, what to verify, and what to do on failure. Background reference for +how Jellyfin and the OpenSubtitles plugin work together lives in +[`docs/03-subtitles.md`](../../docs/03-subtitles.md). + +--- + +## Prereqs (verify before running) + +| Check | How | +|---|---| +| OpenSubtitles plugin v20 installed + Active | `docker exec jellyfin ls /config/plugins \| grep -i opensub` | +| Plugin creds saved (`Caveman5`) | `docker exec jellyfin grep -E 'Username\|CredentialsInvalid' /config/plugins/configurations/Jellyfin.Plugin.OpenSubtitles.xml` — expect `Caveman5` and `false` | +| TV library has `SaveSubtitlesWithMedia=true`, `SubtitleDownloadLanguages=["eng"]`, `RequirePerfectSubtitleMatch=false` | `curl -s -H "X-Emby-Token: $TOK" http://localhost:8096/Library/VirtualFolders` | +| Free-tier quota remaining today (≥ episode count, else plan multi-day) | `docker logs --tail 200 jellyfin 2>&1 \| grep "Remaining downloads" \| tail -1` (free = 20/day, resets 00:00 UTC) | +| Source files have audio language tag | `ffprobe` sample episode | + +If any prereq fails, stop. Fix it before running the recipe. + +--- + +## Step 1 — Probe the source + +Pick one episode of the target show. 
Run `ffprobe` on it: + +```bash +ssh user@192.168.0.100 'docker exec jellyfin /usr/lib/jellyfin-ffmpeg/ffprobe -hide_banner "" 2>&1 | grep -E "Stream|Duration"' +``` + +Record in the run log: + +- video codec + resolution + frame rate +- audio language tag(s) +- whether any subtitle streams are embedded +- container + +Decide based on probe: + +| Probe result | Branch | +|---|---| +| English audio, no embedded subs | "simple" path (this recipe) | +| Foreign-dub audio, no embedded subs | "foreign-dub" path (deferred to v?) | +| Embedded English subs already present | skip — Jellyfin will use them | +| Embedded PGS/VobSub bitmap subs | "OCR" path (deferred to v?) | + +--- + +## Step 2 — Resolve series + episode IDs + +```bash +TOK= +SERIES_NAME='American Dad' +ssh user@192.168.0.100 "docker exec jellyfin curl -s -H 'X-Emby-Token: $TOK' \ + 'http://localhost:8096/Items?searchTerm=${SERIES_NAME// /+}&IncludeItemTypes=Series&Recursive=true&Limit=3'" \ + | python3 -c "import json,sys; [print(x['Id'],x['Name']) for x in json.load(sys.stdin).get('Items',[])]" +``` + +Record series Id. Then list episodes: + +```bash +SERIES= +ssh user@192.168.0.100 "docker exec jellyfin curl -s -H 'X-Emby-Token: $TOK' \ + 'http://localhost:8096/Items?ParentId=$SERIES&IncludeItemTypes=Episode&Recursive=true&Fields=Path,ParentIndexNumber,IndexNumber'" \ + | python3 -c "import json,sys; [print(e['Id'],'S%02dE%02d'%(e['ParentIndexNumber'],e['IndexNumber']),e['Name']) for e in json.load(sys.stdin)['Items']]" +``` + +--- + +## Step 3 — Validate season numbering against OpenSubtitles + +> ⚠️ **Critical, added in v2** (currently provisional — see CHANGELOG): some shows +> are catalogued differently across services. American Dad is the canonical +> example: Hulu/DSP carriers split the original Fox 23-ep S1 into Hulu S1 (7 +> eps) + S2 (16 eps). OpenSubtitles indexes by Fox airing order. 
The plugin +> queries by `(parent_imdb_id, season, episode)` so library-side Hulu numbering +> returns 0 results past the first 7 episodes. + +How to check: + +1. Pick the first episode of season 2 in the library. +2. Run a `RemoteSearch/Subtitles/eng` against it (Step 4 below, but read-only). +3. If results > 0 — numbering matches OpenSubtitles. Proceed. +4. If results == 0 but the show exists on opensubtitles.com — numbering mismatch. **Stop**. Fix metadata first or use the v2 direct-API path (TBD). + +--- + +## Step 4 — Fetch subs per episode + +Per-episode loop. Helper script lives at `processes/subtitles/lib/sub-fetch.sh` +(promoted from `/tmp` once stable; see CHANGELOG v0→v1). + +```bash +TOK= +EP= +MEDIA_DIR='/home/user/media/tv//Season XX' +MEDIA_BASE=' - SxxExx - ' + +# 1. search +RAW=$(ssh user@192.168.0.100 "docker exec jellyfin curl -s -H 'X-Emby-Token: $TOK' \ + 'http://localhost:8096/Items/$EP/RemoteSearch/Subtitles/eng'") + +# 2. pick best non-HI/non-MT/non-AI/non-Forced match, prefer 23.976fps, then highest DownloadCount +SUBID=$(printf '%s' "$RAW" | python3 -c " +import json,sys +subs=json.load(sys.stdin) +clean=[s for s in subs if not (s.get('HearingImpaired') or s.get('MachineTranslated') or s.get('AiTranslated') or s.get('Forced'))] +if not clean: clean=subs +fps2398=[s for s in clean if abs(s.get('FrameRate',0)-23.976)<0.01] +pool=fps2398 if fps2398 else clean +pool.sort(key=lambda s: -s.get('DownloadCount',0)) +print(pool[0]['Id'] if pool else '')") + +# 3. download (returns 204) +ssh user@192.168.0.100 "docker exec jellyfin curl -s -X POST -H 'X-Emby-Token: $TOK' \ + 'http://localhost:8096/Items/$EP/RemoteSearch/Subtitles/$SUBID' -w 'HTTP %{http_code}\n'" + +# 4. plugin saves to /config/metadata/library/<shard>/<itemId>/<base>.eng.srt +# NOT next to media (manual-pick path ignores SaveSubtitlesWithMedia). 
+# Move it into place: +SHARD="${EP:0:2}" +ssh user@192.168.0.100 "docker cp \"jellyfin:/config/metadata/library/$SHARD/$EP/$MEDIA_BASE.eng.srt\" \ + \"$MEDIA_DIR/\"" +``` + +Verify after each batch: + +```bash +ssh user@192.168.0.100 'ls "<media-dir>/" | grep -c eng.srt' +``` + +--- + +## Step 5 — Clean up duplicates + library scan + +The metadata-cache copy and the media-folder sidecar both register as +subtitle streams in Jellyfin (counted twice). Delete the cache copies: + +```bash +ssh user@192.168.0.100 'docker exec jellyfin bash -c "find /config/metadata/library -path \"*<show-name>*S0[1-9]E*.eng.srt\" -delete -print"' +``` + +Trigger a validation-only refresh so Jellyfin sees the new sidecars: + +```bash +ssh user@192.168.0.100 "docker exec jellyfin curl -s -X POST -H 'X-Emby-Token: $TOK' \ + 'http://localhost:8096/Items/$SERIES/Refresh?MetadataRefreshMode=ValidationOnly&Recursive=true'" +``` + +Confirm one episode has exactly 1 external eng sub stream: + +```bash +ssh user@192.168.0.100 "docker exec jellyfin curl -s -H 'X-Emby-Token: $TOK' \ + 'http://localhost:8096/Items/<sample-ep-id>?Fields=MediaStreams'" \ + | python3 -c "import json,sys; subs=[s for s in json.load(sys.stdin).get('MediaStreams',[]) if s['Type']=='Subtitle']; print(len(subs),'sub streams')" +``` + +--- + +## Step 6 — Quality gate + +For the run to pass: + +- [ ] **Coverage**: every episode has a matching `<base>.eng.srt` sidecar +- [ ] **Sync sample**: at least one episode of each season is opened in + Jellyfin web and subs visually align with audio (±1 s) on a known dialogue + line +- [ ] **Flag check**: no `.sdh.srt`, `.forced.srt`, or `.hi.srt` files + (machine pick should have filtered) +- [ ] **Stream count**: Jellyfin shows exactly 1 external eng sub per episode + +If any check fails, log it in `runs/<show>.md` under "breakage" and propose +the recipe amendment in `CHANGELOG.md`. + +--- + +## Quota hygiene + +Free OpenSubtitles.com account = 20 downloads / day, resets 00:00 UTC. 
+Plan large series across multiple days, or switch to VIP (~$3/mo, unlimited). + +Quota check: + +```bash +ssh user@192.168.0.100 'docker logs --tail 200 jellyfin 2>&1 | grep "Remaining downloads" | tail -1' +``` + +When quota hits 0 the API returns 0 results, indistinguishable from a real +miss. Always check quota before declaring a "no subs" failure. diff --git a/processes/subtitles/lib/sub-fetch.sh b/processes/subtitles/lib/sub-fetch.sh new file mode 100755 index 0000000..36490c4 --- /dev/null +++ b/processes/subtitles/lib/sub-fetch.sh @@ -0,0 +1,76 @@ +#!/usr/bin/env bash +# Subtitle fetch helper — recipe v1 Step 4. +# +# Single-episode loop body. Runs against a Jellyfin instance reachable from +# nullstone via `docker exec jellyfin curl ...`. Driver loops should source or +# call this per episode. +# +# Picker: highest DownloadCount among results that are NOT +# (HearingImpaired|MachineTranslated|AiTranslated|Forced); 23.976fps preferred. +# Falls back to all results if every candidate is HI/MT/AI/Forced. +# +# Side effects: +# - POSTs RemoteSearch download (consumes 1 of 20 daily free-tier slots) +# - docker cp's the resulting metadata-cache srt to MEDIA_DIR +# +# Caller env: +# TOK Jellyfin admin X-Emby-Token +# EP Jellyfin episode item id +# MEDIA_DIR destination dir on nullstone, e.g. +# '/home/user/media/tv/American Dad! (2005)/Season 01' +# MEDIA_BASE filename without extension, must match the .mkv basename +# +# Exits non-zero on no-subs (1) or download HTTP != 204 (2). +# Output to stdout: "OK <ep-id> -> <dest path>". +# Output to stderr: chosen sub release name + fps + DownloadCount, or error. 
+ +set -euo pipefail + +: "${TOK:?TOK required}" +: "${EP:?EP required}" +: "${MEDIA_DIR:?MEDIA_DIR required}" +: "${MEDIA_BASE:?MEDIA_BASE required}" + +NULLSTONE="${NULLSTONE:-user@192.168.0.100}" + +RAW=$(ssh "$NULLSTONE" "docker exec jellyfin curl -s -H 'X-Emby-Token: $TOK' \ + 'http://localhost:8096/Items/$EP/RemoteSearch/Subtitles/eng'") + +SUBID=$(printf '%s' "$RAW" | python3 -c " +import json, sys +subs = json.load(sys.stdin) +clean = [s for s in subs + if not (s.get('HearingImpaired') or s.get('MachineTranslated') + or s.get('AiTranslated') or s.get('Forced'))] +if not clean: + clean = subs +fps2398 = [s for s in clean if abs(s.get('FrameRate', 0) - 23.976) < 0.01] +pool = fps2398 if fps2398 else clean +pool.sort(key=lambda s: -s.get('DownloadCount', 0)) +if pool: + print(pool[0]['Id']) + print(pool[0]['Name'], pool[0].get('FrameRate'), + pool[0].get('DownloadCount'), file=sys.stderr) +") + +if [[ -z "$SUBID" ]]; then + echo "NO-SUBS for $EP" >&2 + exit 1 +fi + +HTTP=$(ssh "$NULLSTONE" "docker exec jellyfin curl -s -o /dev/null -X POST \ + -H 'X-Emby-Token: $TOK' \ + 'http://localhost:8096/Items/$EP/RemoteSearch/Subtitles/$SUBID' \ + -w '%{http_code}'") + +if [[ "$HTTP" != "204" ]]; then + echo "DL-FAIL HTTP=$HTTP for $EP $SUBID" >&2 + exit 2 +fi + +SHARD="${EP:0:2}" +SRC_IN_CONTAINER="/config/metadata/library/$SHARD/$EP/$MEDIA_BASE.eng.srt" +DEST="$MEDIA_DIR/$MEDIA_BASE.eng.srt" + +ssh "$NULLSTONE" "docker cp \"jellyfin:$SRC_IN_CONTAINER\" \"$DEST\"" >/dev/null +echo "OK $EP -> $DEST" diff --git a/processes/subtitles/runs/_template.md b/processes/subtitles/runs/_template.md new file mode 100644 index 0000000..c81be49 --- /dev/null +++ b/processes/subtitles/runs/_template.md @@ -0,0 +1,37 @@ +# Subtitle run — `<Show name (Year)>` + +Recipe version: v? +Run date: YYYY-MM-DD +Operator: Claude Code @ <session> +Quota at start / end: ?? / ?? + +## Source + +| Field | Value | +|---|---| +| Episodes | ?? (S01–S??) | +| Container | mkv / mp4 / ... 
| +| Video | codec res fps | +| Audio | language tag(s) | +| Embedded subs | yes / no — codecs | +| Existing sidecars | yes / no | + +## Outcome + +| Season | Eps | Subs fetched | Quality sample | Notes | +|---|---|---|---|---| +| S01 | ? | ? / ? | ? | | + +## Picks (sample) + +| Episode | Sub Id | Author | DownloadCount | FrameRate | HI | +|---|---|---|---|---|---| +| S01E01 | ... | ... | ... | ... | ... | + +## Breakage (if any) + +What broke, what was probed, what the recipe should have done differently. + +## Recipe amendments triggered + +- v1 → v2: ... diff --git a/processes/subtitles/runs/american-dad.md b/processes/subtitles/runs/american-dad.md new file mode 100644 index 0000000..e5c19fb --- /dev/null +++ b/processes/subtitles/runs/american-dad.md @@ -0,0 +1,85 @@ +# Subtitle run — `American Dad! (2005)` + +Recipe version: v1 +Run date: 2026-05-09 +Operator: Claude Code @ onyx session, ai-lab cwd +Quota at start / end: 20 / 13 (7 downloads, all S01) + +## Source + +| Field | Value | +|---|---| +| Episodes | 58 (S01=7, S02=16, S03=19, S04=16) | +| Container | mkv | +| Video | HEVC Main10, 1440×1080, 23.98 fps, 4:3 SAR 1:1 | +| Audio | `eng` AAC stereo (default) + `eng` AC3 5.1 | +| Embedded subs | none | +| Existing sidecars | none | + +Library uses Hulu/DSP season ordering (S1=7 eps). Original Fox order has S1=23 eps. 
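The offset between the two orderings can be sketched as a remap through the absolute episode index. This is a minimal illustration, assuming both services list the same episodes in the same sequence and differ only in where the season boundaries fall. `remap_episode` is a hypothetical helper, and only Fox S1=23 is known from this log, so the Fox layout is deliberately left incomplete.

```python
def remap_episode(season, episode, source_counts, target_counts):
    """Map (season, episode) from one season layout to another.

    Both layouts are lists of per-season episode counts. Assumes the two
    layouts contain the same episodes in the same order.
    """
    if not 1 <= season <= len(source_counts):
        raise ValueError("season outside source layout")
    if not 1 <= episode <= source_counts[season - 1]:
        raise ValueError("episode outside source season")
    # 1-based position of the episode across the whole series.
    remaining = sum(source_counts[:season - 1]) + episode
    for target_season, count in enumerate(target_counts, start=1):
        if remaining <= count:
            return target_season, remaining
        remaining -= count
    raise ValueError("target layout does not cover this episode")


HULU = [7, 16, 19, 16]  # library layout, from the Source table
FOX = [23]              # only S1=23 is stated in this log; later seasons unknown

print(remap_episode(2, 1, HULU, FOX))  # library S02E01 -> (1, 8)
```

Anything past library S02E16 raises until the remaining Fox season counts are filled in from a trusted source, which is safer than guessing them.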
+ +## Series + library context + +- Series Id: `3b3bc999e9107f1a7643ac45d6427fee` +- Library: `767bffe4f11c93ef34b805451a696a4e` (TV Shows, `/media/tv`) +- Library options: `SaveSubtitlesWithMedia=true`, `SubtitleDownloadLanguages=["eng"]`, `RequirePerfectSubtitleMatch=false` ✓ +- Plugin: Open Subtitles v20.0.0.0, Active, creds `Caveman5` valid + +## Outcome + +| Season | Eps | Subs fetched | Quality sample | Notes | +|---|---|---|---|---| +| S01 | 7 | 7 / 7 | not yet visually verified by playback (TODO) | All from `OMiCRON DVDRip` release group, fps 23.976 except S01E07 (24 fps), no SDH | +| S02 | 16 | 0 / 16 | n/a | Plugin RemoteSearch returns 0 for E01/E02/E08/E13 — broke recipe | +| S03 | 19 | 0 / 19 | n/a | Untested, expected same failure | +| S04 | 16 | 0 / 16 | n/a | Untested, expected same failure | + +Net: **7 / 58 (12 %)**. + +## Picks (S01) + +| Episode | Sub release | Author | DLs | FPS | HI | +|---|---|---|---|---|---| +| S01E01 Pilot | `American.Dad.S01E01.DVDRip.XviD.REPACK-OMiCRON` | zetakoo_ | 154 132 | 23.976 | no | +| S01E02 Threat Levels | `American.Dad.S01E02.DVDRip.XviD.REPACK-OMiCRON` | (auto) | 89 896 | 23.976 | no | +| S01E03 Stan Knows Best | `American.Dad.S01E03.DVDRip.XviD.REPACK-OMiCRON` | (auto) | 69 317 | 23.976 | no | +| S01E04 Francines Flashback | `American.Dad.S01E04.DVDRip.XviD.REPACK-OMiCRON` | (auto) | 72 315 | 23.976 | no | +| S01E05 Roger Codger | `American.Dad.S01E05.DVDRip.XviD.REPACK-OMiCRON` | (auto) | 32 309 | 23.976 | no | +| S01E06 Homeland Insecurity | `American.Dad.S01E06.DVDRip.XviD.REPACK-OMiCRON` | (auto) | 67 778 | 23.976 | no | +| S01E07 Deacon Stan Jesus Man | `American.Dad.S01E07.DVDRip.XviD-OMiCRON` | (auto) | 65 124 | 24 | no | + +All chosen by recipe Step 4 picker (highest DownloadCount among non-HI / non-MT +/ non-AI / non-Forced, prefer 23.976 fps). Picker behaved consistently — no +manual override needed for S01. + +## Breakage + +After S01 passed, S02E01 search returned 0 results. 
Verified: + +- ProviderIds for S02E01 in library = `Imdb=tt0511631 Tvdb=306168` (correct for "Bullocks to Stan") +- Plugin quota: 13 / 20 remaining (not exhausted) +- Plugin log shows no error — silent zero +- Same recipe worked 7 times in a row immediately prior — not a script bug +- Sample-tested S02E02 / S02E08 / S02E13 → all 0 results + +Root cause: library numbering is Hulu/DSP (S1=7), OpenSubtitles indexes Fox +airing order (S1=23). Plugin queries OS with `(parent_imdb_id, season, +episode)` so library `S=2 E=1` maps to a Fox cell that doesn't exist on OS +in that S/E slot, even though the per-episode IMDB id (`tt0511631`) is real +and indexed on OS by Fox order as `S=1 E=8`. + +The plugin doesn't expose per-episode-IMDB lookup, only the S/E combo path, +so there's no flag we can flip to make this work. + +## Recipe amendments triggered + +- **v1 → v2**: process needs a season-numbering pre-check (Step 3), and a + fallback fetch path that doesn't rely on plugin S/E mapping. See + `CHANGELOG.md` v2 design choice between direct OS REST (recommended) and + library re-numbering. + +## Followups + +- [ ] visually verify a sample S01 sub plays in sync (one ep per recipe rule §6) +- [ ] decide v2 path (REST vs renumber) +- [ ] sub S02–S04 (51 eps) once v2 lands
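The season-numbering pre-check called for in the v1 → v2 amendment, together with the quota caveat, could be reduced to a single decision function. This is a minimal sketch, not part of the recipe: `classify_probe` and its verdict strings are hypothetical, and the three inputs are assumed to come from the read-only search, the quota log line, and a manual look at opensubtitles.com.

```python
def classify_probe(result_count, quota_remaining, show_listed_on_os):
    """Interpret a read-only RemoteSearch probe for one episode.

    result_count       number of results the plugin search returned
    quota_remaining    downloads left today, per the plugin log
    show_listed_on_os  True if the show is known to exist on OpenSubtitles
    """
    if result_count > 0:
        return "proceed"
    # Zero results with no quota left is indistinguishable from a real miss,
    # so quota is ruled out before blaming the numbering.
    if quota_remaining <= 0:
        return "quota-exhausted"
    if show_listed_on_os:
        return "numbering-mismatch"
    return "no-subs"


# The S02E01 probe from this run: 0 results, 13 downloads left, show is on OS.
print(classify_probe(0, 13, True))  # -> numbering-mismatch
```

Checking quota before the mismatch branch mirrors the recipe's rule that quota must be ruled out before declaring a "no subs" failure.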