Adds lib/sub-rest-fetch.py, a direct OpenSubtitles REST client that looks up subs by per-episode IMDB id (e.g. tt0511631) instead of the plugin's (parent_imdb_id, season, episode) combo path. This sidesteps shows where library numbering diverges from OpenSubtitles' catalogued numbering: the American Dad library uses Hulu numbering (S1 = 7 eps) while OS uses Fox numbering (S1 = 23 eps), so the plugin path returns 0 hits past S01E07 even though every per-episode IMDB id is correct.

Recipe README updated to surface the two paths (v1 plugin / v2 REST) and recommend v2 by default. American Dad run log now shows 19/58 episodes subbed (S01 7/7 via v1, S02E01-E12 via v2). S02E13-S04 (39 eps) deferred to the next 20/day quota windows.

Quirk fixed in v2: the OpenSubtitles /download endpoint consistently returns HTTP 503 to Python urllib.request even though the identical headers and body go through via curl. The _curl() shim routes all OS API calls through curl. Each 503 still consumes a download slot, so the urllib path was unsafe to retry on.

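A shim of the kind described above might look roughly like the following. This is a minimal sketch, not the actual `_curl()` from `lib/sub-rest-fetch.py`: the function names `build_curl_argv`, `split_status` and `curl_request`, and the choice to append the HTTP status with curl's `-w` option, are assumptions made for illustration. It deliberately does not retry, since a failed download call still costs a quota slot.

```python
import subprocess

def build_curl_argv(method, url, headers, body=None):
    """Build a curl command line (hypothetical helper, not the real _curl())."""
    argv = ["curl", "-sS", "-X", method, url]
    for name, value in headers.items():
        argv += ["-H", f"{name}: {value}"]
    if body is not None:
        argv += ["--data", body]
    # Ask curl to print the HTTP status code on its own final line.
    argv += ["-w", "\n%{http_code}"]
    return argv

def split_status(stdout):
    """Split curl output into (status_code, response_body)."""
    body, _, status = stdout.rpartition("\n")
    return int(status), body

def curl_request(method, url, headers, body=None):
    """Run curl once and return (status_code, response_body). No retries."""
    result = subprocess.run(
        build_curl_argv(method, url, headers, body),
        capture_output=True, text=True, check=True,
    )
    return split_status(result.stdout)
```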
# Subtitle acquisition process: v2

Last updated: 2026-05-09
Status: **v2** (direct REST API). American Dad S01E01–S02E12 (19/58 eps) subbed. S02E13–S04 awaiting next quota window.

This recipe is written for Claude Code to execute. Each step lists the exact
command, what to verify, and what to do on failure. Background reference for
how Jellyfin and the OpenSubtitles plugin work together lives in
[`docs/03-subtitles.md`](../../docs/03-subtitles.md).

---
## Prereqs (verify before running)

| Check | How |
|---|---|
| OpenSubtitles plugin v20 installed + Active | `docker exec jellyfin ls /config/plugins \| grep -i opensub` |
| Plugin creds saved (`Caveman5`) | `docker exec jellyfin grep -E 'Username\|CredentialsInvalid' /config/plugins/configurations/Jellyfin.Plugin.OpenSubtitles.xml` (expect `Caveman5` and `false`) |
| TV library has `SaveSubtitlesWithMedia=true`, `SubtitleDownloadLanguages=["eng"]`, `RequirePerfectSubtitleMatch=false` | `curl -s -H "X-Emby-Token: $TOK" http://localhost:8096/Library/VirtualFolders` |
| Free-tier quota remaining today (≥ episode count, else plan multi-day) | `docker logs --tail 200 jellyfin 2>&1 \| grep "Remaining downloads" \| tail -1` (free = 20/day, resets 00:00 UTC) |
| Source files have audio language tag | `ffprobe` sample episode |

If any prereq fails, stop. Fix it before running the recipe.

---
## Step 1: Probe the source

Pick one episode of the target show. Run `ffprobe` on it:

```bash
ssh user@192.168.0.100 'docker exec jellyfin /usr/lib/jellyfin-ffmpeg/ffprobe -hide_banner "<path-to-mkv>" 2>&1 | grep -E "Stream|Duration"'
```

Record in the run log:

- video codec + resolution + frame rate
- audio language tag(s)
- whether any subtitle streams are embedded
- container

Decide based on probe:

| Probe result | Branch |
|---|---|
| English audio, no embedded subs | "simple" path (this recipe) |
| Foreign-dub audio, no embedded subs | "foreign-dub" path (deferred to v?) |
| Embedded English subs already present | skip (Jellyfin will use them) |
| Embedded PGS/VobSub bitmap subs | "OCR" path (deferred to v?) |

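The branch table above can be expressed as a small decision function. The sketch below is illustrative only: it assumes the probe was captured as JSON (ffprobe's `-print_format json -show_streams` output, rather than the grep-based command above), that English is tagged `eng`, and that bitmap subtitles take precedence over text ones when both are present. The function name, the precedence order and the extra `unknown` result for untagged audio are assumptions, not part of the recipe.

```python
import json

# ffmpeg codec names for image-based subtitle formats (PGS and VobSub).
BITMAP_CODECS = {"hdmv_pgs_subtitle", "dvd_subtitle"}

def stream_language(stream):
    """Language tag of a stream, or 'und' when the tag is absent."""
    return stream.get("tags", {}).get("language", "und")

def pick_branch(ffprobe_json):
    """Map ffprobe stream info to one of the recipe's branches."""
    streams = json.loads(ffprobe_json).get("streams", [])
    audio = [s for s in streams if s.get("codec_type") == "audio"]
    subs = [s for s in streams if s.get("codec_type") == "subtitle"]

    if any(s.get("codec_name") in BITMAP_CODECS for s in subs):
        return "OCR"            # assumed to win over text subs
    if any(stream_language(s) == "eng" for s in subs):
        return "skip"
    audio_langs = {stream_language(s) for s in audio}
    if "eng" in audio_langs:
        return "simple"
    if audio_langs - {"und"}:
        return "foreign-dub"
    return "unknown"            # untagged audio: fix the prereq first
```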

---

## Step 2: Resolve series + episode IDs

```bash
TOK=<jellyfin-admin-token>
SERIES_NAME='American Dad'
ssh user@192.168.0.100 "docker exec jellyfin curl -s -H 'X-Emby-Token: $TOK' \
  'http://localhost:8096/Items?searchTerm=${SERIES_NAME// /+}&IncludeItemTypes=Series&Recursive=true&Limit=3'" \
  | python3 -c "import json,sys; [print(x['Id'],x['Name']) for x in json.load(sys.stdin).get('Items',[])]"
```

Record the series Id. Then list episodes:

```bash
SERIES=<series-id>
ssh user@192.168.0.100 "docker exec jellyfin curl -s -H 'X-Emby-Token: $TOK' \
  'http://localhost:8096/Items?ParentId=$SERIES&IncludeItemTypes=Episode&Recursive=true&Fields=Path,ParentIndexNumber,IndexNumber'" \
  | python3 -c "import json,sys; [print(e['Id'],'S%02dE%02d'%(e['ParentIndexNumber'],e['IndexNumber']),e['Name']) for e in json.load(sys.stdin)['Items']]"
```

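The episode-listing one-liner above would raise a `KeyError` for any item that comes back without `ParentIndexNumber` or `IndexNumber`. A more tolerant formatter might look like this. It is a sketch only: the function name is hypothetical, and the `S??E??` placeholder for unnumbered items is an assumed convention, not something the recipe defines.

```python
import json

def episode_labels(items_json):
    """Return (id, 'SxxEyy', name) tuples from an /Items response body."""
    rows = []
    for ep in json.loads(items_json).get("Items", []):
        season = ep.get("ParentIndexNumber")
        number = ep.get("IndexNumber")
        if season is None or number is None:
            label = "S??E??"   # assumed placeholder for unnumbered items
        else:
            label = "S%02dE%02d" % (season, number)
        rows.append((ep["Id"], label, ep.get("Name", "")))
    return rows
```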

---

## Step 3: Pick fetch path

Two paths are available; they trade robustness against simplicity:

| Path | When to use | Tool |
|---|---|---|
| **v1 (plugin)** | Library season/episode numbering matches OpenSubtitles indexing AND every episode has a good IMDB ProviderId | `lib/sub-fetch.sh` |
| **v2 (REST)** | Default. Survives Hulu/Fox numbering mismatches and shows with unusual episode ordering | `lib/sub-rest-fetch.py` |

Quick check whether v1 will work:

1. Pick the first episode of season 2 in the library and set `$EP` to its Id.
2. Run `curl -s -H "X-Emby-Token: $TOK" "http://localhost:8096/Items/$EP/RemoteSearch/Subtitles/eng"` (read-only; double quotes so that `$TOK` and `$EP` expand).
3. If results > 0, v1 works. v2 also works.
4. If results == 0 but the show exists on opensubtitles.com, there is a numbering mismatch (e.g. American Dad: library uses Hulu S1=7 eps; OS uses Fox S1=23). Use **v2**.

When in doubt, use v2.

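To make the numbering mismatch concrete, the sketch below converts a library episode number into the season and episode it would have under a different season split. It assumes both catalogues list the same episodes in the same order, which the recipe does not guarantee. The function names are hypothetical, and only the two season lengths quoted above (7 and 23) come from the source.

```python
def absolute_index(season_lengths, season, episode):
    """1-based position of SxxEyy, given the lengths of the earlier seasons."""
    if len(season_lengths) < season - 1:
        raise ValueError("need the length of every earlier season")
    return sum(season_lengths[: season - 1]) + episode

def renumber(position, season_lengths):
    """Convert a 1-based position back to (season, episode)."""
    for season, length in enumerate(season_lengths, start=1):
        if position <= length:
            return season, position
        position -= length
    raise ValueError("position is past the last known season")

# Library (Hulu) S1 has 7 episodes; OpenSubtitles (Fox) S1 has 23.
pos = absolute_index([7], season=2, episode=1)   # 8th episode overall
print(renumber(pos, [23]))                       # (1, 8)
# Under these assumptions, library S02E01 sits at S01E08 in the 23-episode
# split, so a (season, episode) lookup and a per-episode id can disagree.
```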

---

## Step 4: Fetch subs per episode

Use `lib/sub-rest-fetch.py` (v2). It logs in to OpenSubtitles, looks each
episode up by its per-episode IMDB id, picks the best English match, and
writes the sidecar straight to nullstone.

```bash
JELLYFIN_TOKEN=<admin-token> \
OPENSUBTITLES_API_KEY=$HOME/.config/arrflix-opensubtitles-api.txt \
OPENSUBTITLES_USER=Caveman5 \
OPENSUBTITLES_PASS=<password> \
processes/subtitles/lib/sub-rest-fetch.py <series-id> --season N [--start E] [--end E]
```

Pre-flight with `DRY_RUN=1` to see picks without consuming quota.

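The "picks the best English match" step could be sketched as below. This is not the logic in `lib/sub-rest-fetch.py`, which the recipe does not spell out. The candidate fields (`language`, `hearing_impaired`, `foreign_parts_only`, `download_count`, `file_id`) are assumed names modelled loosely on OpenSubtitles search results, the `en` language code is an assumption, and ranking by download count is an assumption too.

```python
def pick_best(candidates, language="en"):
    """Return the preferred candidate dict, or None if nothing qualifies."""
    usable = [
        c for c in candidates
        if c.get("language") == language
        and not c.get("hearing_impaired")      # would end up as .sdh/.hi
        and not c.get("foreign_parts_only")    # would end up as .forced
    ]
    # Assumed ranking: the most-downloaded candidate wins.
    return max(usable, key=lambda c: c.get("download_count", 0), default=None)
```

Filtering before ranking means a popular hearing-impaired or forced track can never be chosen over a plain one, which lines up with the flag check in the quality gate.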
The legacy v1 path (Jellyfin plugin RemoteSearch + docker cp) lives at
`lib/sub-fetch.sh` and is kept for shows where library numbering matches
OpenSubtitles' indexing. It is slightly less general but doesn't depend on the
external OS REST API or our 20/day account quota.

Verify after each batch:

```bash
ssh user@192.168.0.100 'ls "<media-dir>/" | grep -c eng.srt'
```

---
## Step 5: Library scan + de-dup (v1 only)

If you used the v1 plugin path, the metadata-cache copy and the media-folder
sidecar both register as subtitle streams in Jellyfin (counted twice).
Delete the cache copies:

```bash
ssh user@192.168.0.100 'docker exec jellyfin bash -c "find /config/metadata/library -path \"*<show-name>*S0[1-9]E*.eng.srt\" -delete -print"'
```

v2 writes directly to the media folder so there is no cache copy to clean.

Trigger a validation-only refresh so Jellyfin sees the new sidecars:

```bash
ssh user@192.168.0.100 "docker exec jellyfin curl -s -X POST -H 'X-Emby-Token: $TOK' \
  'http://localhost:8096/Items/$SERIES/Refresh?MetadataRefreshMode=ValidationOnly&Recursive=true'"
```

Confirm one episode has exactly 1 external eng sub stream:

```bash
# Count only external English subtitle streams, so embedded tracks do not inflate the total.
ssh user@192.168.0.100 "docker exec jellyfin curl -s -H 'X-Emby-Token: $TOK' \
  'http://localhost:8096/Items/<sample-ep-id>?Fields=MediaStreams'" \
  | python3 -c "import json,sys; subs=[s for s in json.load(sys.stdin).get('MediaStreams',[]) if s['Type']=='Subtitle' and s.get('IsExternal') and s.get('Language')=='eng']; print(len(subs),'external eng sub streams')"
```

---
## Step 6: Quality gate

For the run to pass:

- [ ] **Coverage**: every episode has a matching `<base>.eng.srt` sidecar
- [ ] **Sync sample**: at least one episode of each season is opened in
  Jellyfin web and subs visually align with audio (±1 s) on a known dialogue
  line
- [ ] **Flag check**: no `.sdh.srt`, `.forced.srt`, or `.hi.srt` files
  (machine pick should have filtered)
- [ ] **Stream count**: Jellyfin shows exactly 1 external eng sub per episode

If any check fails, log it in `runs/<show>.md` under "breakage" and propose
the recipe amendment in `CHANGELOG.md`.

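The coverage and flag checks above lend themselves to a quick script. The sketch below works on a plain list of file names (for example, the output of `ls` on the media directory) and reports episodes without a `<base>.eng.srt` sidecar plus any flagged sidecars. The function name and the set of video extensions are assumptions; the flagged suffixes are the ones listed in the checklist.

```python
from pathlib import PurePosixPath

VIDEO_EXTS = {".mkv", ".mp4"}                      # assumed container list
FLAGGED = (".sdh.srt", ".forced.srt", ".hi.srt")   # from the flag check

def audit(filenames):
    """Return (videos missing <base>.eng.srt, flagged sidecars), both sorted."""
    names = set(filenames)
    missing = sorted(
        n for n in names
        if PurePosixPath(n).suffix.lower() in VIDEO_EXTS
        and PurePosixPath(n).stem + ".eng.srt" not in names
    )
    flagged = sorted(n for n in names if n.lower().endswith(FLAGGED))
    return missing, flagged
```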

---

## Quota hygiene

Free OpenSubtitles.com account = 20 downloads / day, resets 00:00 UTC.
Plan large series across multiple days, or switch to VIP (~$3/mo, unlimited).

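Planning a backlog across quota windows is simple arithmetic. The following sketch splits an episode count into per-day batches; the function name and its parameters are hypothetical, and the only figure taken from the recipe is the 20/day free-tier limit.

```python
DAILY_QUOTA = 20  # free-tier downloads per day, from the text above

def plan_batches(episodes, remaining_today=DAILY_QUOTA, quota=DAILY_QUOTA):
    """Return per-day batch sizes; index 0 is today."""
    batches = [min(episodes, remaining_today)]
    left = episodes - batches[0]
    while left > 0:
        batches.append(min(left, quota))
        left -= batches[-1]
    return batches

print(plan_batches(39))                      # [20, 19]
print(plan_batches(39, remaining_today=8))   # [8, 20, 11]
```

For the 39 deferred American Dad episodes, a full quota today means two windows; with only 8 downloads left today it takes three.
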
Quota check:

```bash
ssh user@192.168.0.100 'docker logs --tail 200 jellyfin 2>&1 | grep "Remaining downloads" | tail -1'
```

When quota hits 0 the API returns 0 results, indistinguishable from a real
miss. Always check quota before declaring a "no subs" failure.
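
The rule in the last paragraph can be written down as a tiny classifier. This is a sketch under stated assumptions: the function name and the result labels are hypothetical, and `remaining_downloads` stands for whatever number the quota check above reports (or `None` if it could not be read).

```python
def classify_search(result_count, remaining_downloads):
    """Decide how to log a search outcome, checking quota before calling it a miss."""
    if result_count > 0:
        return "hit"
    if remaining_downloads is None:
        return "unknown"           # quota not checked yet: do not log a miss
    if remaining_downloads <= 0:
        return "quota-exhausted"   # retry after the 00:00 UTC reset
    return "miss"
```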