legacy-arrflix/processes/subtitles/CHANGELOG.md
s8n 43f55643be processes/subtitles: v3 Addic7ed fetcher + AD 49/58 subbed
Adds lib/sub-a7d-fetch.py: free, no-daily-cap path via subliminal's
addic7ed provider (anonymous). Uses OpenSubtitles REST search-only (no
quota cost) to translate library S/E to the show's primary catalogue
numbering, then drives subliminal to download from Addic7ed and writes
sidecars direct to nullstone via SSH.

Picker quirks: subliminal series-name matcher is broken by '!' in the
title, so the script strips it before building the synthetic
Video.fromname() string. OS feature_details S/E happens to align with
Addic7ed's indexing for the test show (American Dad).

Recipe README now reflects three paths in cheapest-first order: v3
Addic7ed, v2 OS REST (20/day), v1 plugin. American Dad run log updated
to 49/58 (S01 7/7 v1, S02 16/16 mixed v2/v3, S03 16/19 v3, S04 10/16
v3). 9 misses identified, deferred to next OS REST quota window.
2026-05-09 23:31:10 +01:00

# Subtitle process — changelog
## v1 — 2026-05-09
Initial recipe. Drafted while running on American Dad. Distilled from doc
03-subtitles.md (Futurama work) and the actual AD run.

Approach: Jellyfin RemoteSearch/Subtitles/eng → pick best non-HI/non-MT match
via Python filter → POST download → docker cp metadata cache → media folder →
delete cache dupes → validation refresh.

Scope: works on shows whose library season/episode numbering matches
OpenSubtitles' indexed numbering. Verified passing on AD S01 (7/7 episodes).
### Known break — added 2026-05-09 same day
After S01 passed, S02 returned 0 results for every episode probed (E01, E02,
E08, E13). Quota was fine (13 downloads remaining). Cause:

> Jellyfin metadata for American Dad uses **Hulu/DSP season ordering**
> (S1=7, S2=16, S3=19, S4=16). OpenSubtitles indexes by **Fox original-airing
> order** where S1 has 23 episodes. The plugin queries OS by
> `(parent_imdb_id, season_number, episode_number)`. For library S02E01
> "Bullocks to Stan" the plugin sends `S=2,E=1` but OS catalogues that
> episode as `S=1,E=8`. Result: 0 hits.

Each library episode has its own correct per-episode IMDB id (e.g.
`tt0511631` for "Bullocks to Stan") which would resolve directly via OS REST
`imdb_id=` parameter, but the plugin doesn't expose that path.
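
The per-episode lookup described above can be sketched roughly as follows. This is an illustration, not the code in the helper scripts: the function names are hypothetical, and the base URL, the `imdb_id`/`languages` query parameters and the `feature_details` field names are assumed to match the public OpenSubtitles REST API.

```python
from urllib.parse import urlencode

# Assumption: search endpoint of the public OpenSubtitles REST API.
OS_SEARCH_URL = "https://api.opensubtitles.com/api/v1/subtitles"


def episode_search_url(imdb_id: str, language: str = "en") -> str:
    """Build a search URL keyed on the episode's own IMDb id."""
    # "tt0511631" -> 511631: drop the "tt" prefix and any leading zeros.
    numeric_id = int(imdb_id.removeprefix("tt"))
    query = urlencode({"imdb_id": numeric_id, "languages": language})
    return f"{OS_SEARCH_URL}?{query}"


def catalogue_numbering(hit: dict) -> tuple:
    """Return the (season, episode) that OS records for one search hit."""
    details = hit["attributes"]["feature_details"]
    return details["season_number"], details["episode_number"]


# Library S02E01 "Bullocks to Stan", looked up by its own id.
url = episode_search_url("tt0511631")

# Hand-written hit shaped like a search result, using the S/E quoted above.
sample_hit = {
    "attributes": {"feature_details": {"season_number": 1, "episode_number": 8}}
}
season, episode = catalogue_numbering(sample_hit)
```

Because the query is keyed on the episode id rather than on `(season, episode)`, the Hulu versus Fox ordering mismatch never enters the request.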
## v2 — 2026-05-09
Approach **A** chosen: direct OpenSubtitles REST API, per-episode `imdb_id`
lookup, bypass the Jellyfin plugin entirely. New helper at
`lib/sub-rest-fetch.py`.
- API key file: `~/.config/arrflix-opensubtitles-api.txt` (mode 600)
- Account: `Caveman5` (free tier, 20 downloads/day)
- Saves sidecars directly to nullstone media folder via `ssh ... cat >`
- No more docker-cp from `/config/metadata/library` cache (plugin path)

Recipe upgrade:
- Step 4 swaps `lib/sub-fetch.sh` → `lib/sub-rest-fetch.py` for shows with
non-standard season ordering.
- Picker logic identical: filter HI/MT/AI/Forced (renamed
`foreign_parts_only` in OS REST), prefer 23.976fps, sort by
`download_count` desc.
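
A minimal sketch of that picker, assuming each hit carries an `attributes` dict with the OS REST flag names. The function names are hypothetical, and ranking an fps match ahead of `download_count` is one reading of "prefer 23.976fps", not necessarily what the helper does.

```python
from typing import Optional

PREFERRED_FPS = 23.976
# Flags that disqualify a hit outright (HI / MT / AI / Forced).
EXCLUDE_FLAGS = (
    "hearing_impaired",
    "machine_translated",
    "ai_translated",
    "foreign_parts_only",
)


def is_excluded(attrs: dict) -> bool:
    """True if any disqualifying flag is set on the hit."""
    return any(attrs.get(flag) for flag in EXCLUDE_FLAGS)


def rank(attrs: dict) -> tuple:
    """Sort key: fps match first, then download count."""
    fps = float(attrs.get("fps") or 0.0)
    downloads = int(attrs.get("download_count") or 0)
    # A 0.0 fps or 0 download count only lowers the rank; it never excludes.
    return (abs(fps - PREFERRED_FPS) < 0.01, downloads)


def pick_best(hits: list) -> Optional[dict]:
    """Return the highest-ranked hit that survives the filter, or None."""
    candidates = [h for h in hits if not is_excluded(h["attributes"])]
    return max(candidates, key=lambda h: rank(h["attributes"]), default=None)
```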
### v2 known quirks
- **OpenSubtitles `/download` endpoint rejects urllib** — consistent HTTP 503
via Python `urllib.request`, HTTP 200 via `curl` with same headers/body.
`_curl()` shim added; all OS API calls go through it. **Each 503 still
consumes 1 download-quota slot**, so this had to be fixed before retrying
large batches.
- `download_count` of `0` and `fps` of `0.0` appear on some catalogue
entries; treat as informational, not exclusionary.
- Some hits have `file_name` mismatching the `imdb_id` searched (OS metadata
drift). Recipe Step 6 visual-sync check is the catch.
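
The `_curl()` shim might look something like the sketch below. Only the name `_curl` comes from this changelog; the signature, the `build_curl_argv` helper, the header set and the `User-Agent` string are assumptions, and a logged-in account would presumably also need an `Authorization` header.

```python
import json
import subprocess

# Assumption: download endpoint of the public OpenSubtitles REST API.
OS_DOWNLOAD_URL = "https://api.opensubtitles.com/api/v1/download"


def build_curl_argv(url, api_key, body, user_agent="arrflix-subs v0"):
    """Assemble the curl command line for a JSON POST (no shell involved)."""
    return [
        "curl", "-sS", "-X", "POST", url,
        "-H", f"Api-Key: {api_key}",
        "-H", f"User-Agent: {user_agent}",  # hypothetical UA string
        "-H", "Content-Type: application/json",
        "-H", "Accept: application/json",
        "--data", json.dumps(body),
    ]


def _curl(url, api_key, body):
    """POST via the curl binary instead of urllib and parse the JSON reply."""
    proc = subprocess.run(
        build_curl_argv(url, api_key, body),
        capture_output=True,
        text=True,
        check=True,  # raises CalledProcessError if curl itself fails
    )
    return json.loads(proc.stdout)
```

Passing the key on the command line makes it visible in the process list on a shared machine; curl's `-H @file` form is one way around that.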
### v2 known limits
- Free-tier 20/day still in force (REST and plugin share the counter).
- Recipe Step 6 (sync verification) is still manual — no automated check
that the picked .srt actually aligns with audio.
## v3 — 2026-05-09
Approach **Addic7ed via subliminal** added as a quota-free fallback. New
helper at `lib/sub-a7d-fetch.py`. Runs alongside v2; pick whichever fits.
- `subliminal` Python lib drives `addic7ed` provider, anonymous
- OS REST is still consulted (search-only, no quota cost) to translate
library Hulu numbering to the show's primary catalogue numbering, since
Addic7ed and OS feature_details appear to align for at least the test
show (American Dad)
- Sidecar written direct to nullstone via `ssh ... cat >`
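
The `ssh ... cat >` sidecar write could be driven from Python along these lines. It is a sketch under assumptions: the function names are hypothetical, and `nullstone` is assumed to resolve as an SSH host alias with key-based login.

```python
import shlex
import subprocess


def ssh_write_argv(host: str, remote_path: str) -> list:
    """Command line that streams stdin into remote_path on host."""
    # The remote shell parses the command string, so quote the path:
    # titles such as "American Dad!" contain spaces and "!".
    return ["ssh", host, f"cat > {shlex.quote(remote_path)}"]


def write_sidecar(host: str, remote_path: str, content: bytes) -> None:
    """Send subtitle bytes over SSH without a local temp file."""
    subprocess.run(ssh_write_argv(host, remote_path), input=content, check=True)
```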
### v3 picker / matching
- subliminal returns candidates ordered by match score; the script takes the first
- "!" in series name breaks subliminal's matcher; recipe strips it before
building the synthetic filename for `Video.fromname()`
- Synthetic filename pattern: `Series.Name.Year.SXXEYY.HDTV.x264.mkv`
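
The filename synthesis and the subliminal call can be sketched as below. The helper names are hypothetical; `Video.fromname()` and `download_best_subtitles()` are used as I understand subliminal's public API, and the exact call in `lib/sub-a7d-fetch.py` may differ.

```python
def synthetic_name(series: str, year: int, season: int, episode: int) -> str:
    """Build a release-style filename for subliminal to parse."""
    cleaned = series.replace("!", "")  # "!" trips the series-name matcher
    dotted = ".".join(cleaned.split())
    return f"{dotted}.{year}.S{season:02d}E{episode:02d}.HDTV.x264.mkv"


def fetch_from_addic7ed(name: str):
    """Return the best English subtitle's bytes, or None when nothing matched."""
    # Imported lazily so synthetic_name() works without subliminal installed.
    from babelfish import Language
    from subliminal import Video, download_best_subtitles

    video = Video.fromname(name)
    best = download_best_subtitles(
        [video], {Language("eng")}, providers=["addic7ed"]
    )
    subtitles = best[video]
    return subtitles[0].content if subtitles else None
```

For example, `synthetic_name("American Dad!", 2005, 1, 8)` gives `American.Dad.2005.S01E08.HDTV.x264.mkv`, with the S/E taken from the OS catalogue numbering rather than the library numbering.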
### v3 known quirks
- Some episodes return 0 hits at addic7ed for the OS-feat-details S/E we
pass — likely cases where addic7ed indexes by Fox airing order while OS
uses DVD-compressed (or vice versa). On American Dad, ~9 of 58 episodes
missed via this path. Fall back to v2 OS REST when quota allows.
- One episode (`Black Mystery Month`) had a hit but downloaded empty
content — addic7ed-side cataloguing error or temp 0-byte upload.
- Per-show coverage varies: Addic7ed has near-complete English on broadcast
US shows but spotty for animated specials and obscure titles.
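
The empty-download case could be caught before the sidecar is written with a check like the one below. It is not part of the current script; the function name and the one-cue threshold are assumptions.

```python
import re

# An SRT cue timing line looks like "00:00:01,000 --> 00:00:02,500".
_CUE_TIMING = re.compile(
    rb"\d{2}:\d{2}:\d{2}[,.]\d{3}\s*-->\s*\d{2}:\d{2}:\d{2}[,.]\d{3}"
)


def looks_like_srt(content: bytes, min_cues: int = 1) -> bool:
    """Reject empty or non-subtitle payloads."""
    if not content or not content.strip():
        return False
    return len(_CUE_TIMING.findall(content)) >= min_cues
```

This only checks that the payload contains cue timings. It says nothing about sync quality, which remains the manual Step 6 check.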
### v3 known limits
- English coverage best; non-English near-empty
- Anonymous downloads work but heavy bursts may trigger Addic7ed's
bot detection and short IP throttle (~1 hour). The script makes no
effort at jittering / backoff
- No automated sync-quality check; recipe Step 6 still manual
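
If throttling becomes a problem, jittered exponential backoff is one option. The sketch below is not in the current script; the names, the `base`/`cap` values, and the convention that `fetch()` returns `None` when throttled are all assumptions.

```python
import random
import time


def jittered_delays(attempts, base=5.0, cap=300.0, rng=None):
    """Full-jitter delays: uniform in [0, min(cap, base * 2**i)] seconds."""
    rng = rng or random.Random()
    return [rng.uniform(0, min(cap, base * 2 ** i)) for i in range(attempts)]


def fetch_with_backoff(fetch, attempts=4, sleep=time.sleep):
    """Call fetch() up to `attempts` times, sleeping between failed tries."""
    delays = jittered_delays(attempts - 1)
    for attempt in range(attempts):
        result = fetch()
        if result is not None:
            return result
        if attempt < len(delays):  # no sleep after the final attempt
            sleep(delays[attempt])
    return None
```

A real version would need to tell "0 hits" apart from "throttled", since retrying an episode that Addic7ed simply does not carry only adds load.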