legacy-arrflix/processes/subtitles/CHANGELOG.md
s8n 23520df2df processes/subtitles: v2 REST fetcher + AD S02E01-E12 subbed
Adds lib/sub-rest-fetch.py: direct OpenSubtitles REST, looks up subs by
per-episode IMDB id (e.g. tt0511631) instead of the plugin's
(parent_imdb_id, season, episode) combo path. This sidesteps shows where
library numbering diverges from OpenSubtitles' catalogued numbering --
American Dad uses Hulu S1=7 eps; OS uses Fox S1=23 eps; the plugin path
returns 0 hits past S01E07 even though every per-episode IMDB id is
correct.

Recipe README updated to surface the two paths (v1 plugin / v2 REST) and
recommend v2 by default. American Dad run log now shows 19/58 episodes
subbed (S01 7/7 via v1, S02E01-E12 via v2). S02E13-S04 (39 eps) deferred
to next 20/day quota windows.

Quirk fixed in v2: OpenSubtitles /download endpoint consistently returns
HTTP 503 to Python urllib.request despite identical headers/body via curl.
_curl() shim routes all OS API calls through curl. Each 503 still
consumes a download slot, so urllib path was unsafe to retry on.
2026-05-09 23:09:09 +01:00


# Subtitle process — changelog
## v1 — 2026-05-09
Initial recipe. Drafted while running on American Dad. Distilled from doc
03-subtitles.md (Futurama work) and the actual AD run.
Approach: Jellyfin RemoteSearch/Subtitles/eng → pick best non-HI/non-MT match
via Python filter → POST download → docker cp metadata cache → media folder →
delete cache dupes → validation refresh.
Scope: works on shows whose library season/episode numbering matches
OpenSubtitles' indexed numbering. Verified passing on AD S01 (7/7 episodes).
### Known break — added 2026-05-09 same day
After S01 passed, S02 returned 0 results for every episode probed (E01, E02,
E08, E13). Quota was fine (13 downloads remaining). Cause:
> Jellyfin metadata for American Dad uses **Hulu/DSP season ordering**
> (S1=7, S2=16, S3=19, S4=16). OpenSubtitles indexes by **Fox original-airing
> order** where S1 has 23 episodes. The plugin queries OS by
> `(parent_imdb_id, season_number, episode_number)`. For library S02E01
> "Bullocks to Stan" the plugin sends `S=2,E=1` but OS catalogues that
> episode as `S=1,E=8`. Result: 0 hits.
Each library episode has its own correct per-episode IMDB id (e.g.
`tt0511631` for "Bullocks to Stan"), which would resolve directly via the OS
REST `imdb_id=` parameter, but the plugin doesn't expose that path.
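The numbering mismatch can be reproduced with a little arithmetic. The sketch below is illustrative only and is not part of the recipe: the function names are hypothetical, and it assumes both orderings list episodes in the same running sequence, which this changelog only confirms for the S02E01 case. Only the Fox S1 length (23) is stated here, so later OS seasons are left out instead of being guessed.

```python
# Library (Hulu/DSP) season lengths, as listed in the note above.
LIBRARY_SEASON_LENGTHS = [7, 16, 19, 16]
# OpenSubtitles (Fox) season lengths: only S1=23 is stated in this changelog,
# so later seasons are deliberately omitted.
OS_SEASON_LENGTHS = [23]


def absolute_index(season, episode, season_lengths):
    """Return the 1-based running episode number of (season, episode)."""
    if not 1 <= season <= len(season_lengths):
        raise ValueError(f"unknown season {season}")
    if not 1 <= episode <= season_lengths[season - 1]:
        raise ValueError(f"S{season:02d} has no episode {episode}")
    return sum(season_lengths[: season - 1]) + episode


def to_season_episode(index, season_lengths):
    """Map a running episode number back to (season, episode)."""
    if index < 1:
        raise ValueError("running episode number must be >= 1")
    remaining = index
    for season, length in enumerate(season_lengths, start=1):
        if remaining <= length:
            return season, remaining
        remaining -= length
    raise ValueError(f"episode {index} is beyond the known season lengths")


def library_to_os(season, episode):
    """Translate library numbering to the OS numbering (hypothetical helper)."""
    running = absolute_index(season, episode, LIBRARY_SEASON_LENGTHS)
    return to_season_episode(running, OS_SEASON_LENGTHS)


print(library_to_os(2, 1))  # (1, 8), the "Bullocks to Stan" case
```

Library S03E01 and later raise `ValueError` here because the Fox season lengths beyond S1 are not recorded in this changelog.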
## v2 — 2026-05-09
Approach **A** chosen: direct OpenSubtitles REST API, per-episode `imdb_id`
lookup, bypass the Jellyfin plugin entirely. New helper at
`lib/sub-rest-fetch.py`.
- API key file: `~/.config/arrflix-opensubtitles-api.txt` (mode 600)
- Account: `Caveman5` (free tier, 20 downloads/day)
- Saves sidecars directly to nullstone media folder via `ssh ... cat >`
- No more docker-cp from `/config/metadata/library` cache (plugin path)
Recipe upgrade:
- Step 4 swaps `lib/sub-fetch.sh` → `lib/sub-rest-fetch.py` for shows with
non-standard season ordering.
- Picker logic identical: filter out HI/MT/AI/Forced (Forced is named
`foreign_parts_only` in OS REST), prefer 23.976fps, sort by
`download_count` desc.
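The picker rules can be sketched as follows. This is a minimal illustration, not the code in `lib/sub-rest-fetch.py`: `pick_best` is a hypothetical name, and the input is assumed to be a list of flat dicts carrying the flags `hearing_impaired`, `machine_translated`, `ai_translated` and `foreign_parts_only` plus `fps` and `download_count`. The exact shape of the OS REST response is an assumption here.

```python
PREFERRED_FPS = 23.976
# Flags that disqualify a hit (field names assumed, see the note above).
UNWANTED_FLAGS = (
    "hearing_impaired",
    "machine_translated",
    "ai_translated",
    "foreign_parts_only",
)


def pick_best(hits):
    """Return the preferred hit, or None if every hit is filtered out."""
    candidates = [
        hit for hit in hits
        if not any(hit.get(flag) for flag in UNWANTED_FLAGS)
    ]
    if not candidates:
        return None

    def rank(hit):
        # A missing or zero fps / download_count only lowers the rank;
        # it never excludes the hit.
        fps = float(hit.get("fps") or 0.0)
        downloads = int(hit.get("download_count") or 0)
        return (abs(fps - PREFERRED_FPS) < 0.01, downloads)

    # Preferred fps first, then highest download_count.
    return max(candidates, key=rank)


sample_hits = [
    {"file_name": "a.srt", "hearing_impaired": True, "fps": 23.976, "download_count": 9000},
    {"file_name": "b.srt", "fps": 25.0, "download_count": 5000},
    {"file_name": "c.srt", "fps": 23.976, "download_count": 100},
    {"file_name": "d.srt", "fps": 0.0, "download_count": 0},
]
print(pick_best(sample_hits)["file_name"])  # c.srt
```

Ranking on a `(fps_match, download_count)` tuple keeps the fps preference strictly ahead of popularity, so a 23.976fps hit with few downloads still beats a popular 25fps one.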
### v2 known quirks
- **OpenSubtitles `/download` endpoint rejects urllib** — consistent HTTP 503
via Python `urllib.request`, HTTP 200 via `curl` with same headers/body.
`_curl()` shim added; all OS API calls go through it. **Each 503 still
consumes 1 download-quota slot**, so this had to be fixed before retrying
large batches.
- `download_count` of `0` and `fps` of `0.0` appear on some catalogue
entries; treat as informational, not exclusionary.
- Some hits have a `file_name` that does not match the `imdb_id` searched (OS
metadata drift). The Recipe Step 6 visual-sync check is what catches these.
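A curl-backed shim along the lines described above might look like the sketch below. It is an assumption-laden illustration, not the actual `_curl()` in `lib/sub-rest-fetch.py`: the names `build_curl_cmd` and `curl_json` are hypothetical, and the caller supplies the URL, headers and body. It deliberately does not retry, since each failed `/download` call still costs a quota slot.

```python
import json
import subprocess


def build_curl_cmd(method, url, headers, body=None):
    """Build the curl argument list; kept separate so it can be inspected."""
    cmd = ["curl", "--silent", "--show-error", "--fail", "--request", method, url]
    for name, value in headers.items():
        cmd += ["--header", f"{name}: {value}"]
    if body is not None:
        cmd += ["--data", json.dumps(body)]
    return cmd


def curl_json(method, url, headers, body=None, timeout=60):
    """Run curl once and parse the JSON response.

    --fail makes curl exit non-zero on HTTP errors such as 503, and
    check=True turns that into CalledProcessError. No retry is attempted.
    """
    result = subprocess.run(
        build_curl_cmd(method, url, headers, body),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,
    )
    return json.loads(result.stdout)
```

Passing the arguments as a list avoids shell quoting problems. Note that headers given on the command line are visible in the local process list while curl runs.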
### v2 known limits
- Free-tier 20/day still in force (REST and plugin share the counter).
- Recipe Step 6 (sync verification) is still manual — no automated check
that the picked .srt actually aligns with audio.
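
For planning around the 20/day limit, the remaining episodes can be split into daily batches. The sketch below is illustrative only, with hypothetical function names: it assumes one download per episode and no slots lost to failed calls, and uses the library season lengths recorded under v1 to list what is left from S02E13 onward.

```python
DAILY_QUOTA = 20  # free-tier downloads per day


def pending_episodes(season_lengths, start_season, start_episode):
    """List (season, episode) pairs from the start point to the last episode."""
    pending = []
    for season, length in enumerate(season_lengths, start=1):
        for episode in range(1, length + 1):
            if (season, episode) >= (start_season, start_episode):
                pending.append((season, episode))
    return pending


def plan_batches(pending, per_day=DAILY_QUOTA):
    """Split the pending list into chunks of at most per_day episodes."""
    return [pending[i:i + per_day] for i in range(0, len(pending), per_day)]


deferred = pending_episodes([7, 16, 19, 16], 2, 13)
batches = plan_batches(deferred)
print(len(deferred), [len(batch) for batch in batches])  # 39 [20, 19]
```

Under those assumptions the 39 deferred episodes fit into two quota windows.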