Adds processes/ umbrella for repeatable acquisition workflows. First child is subtitles/, with recipe README (executable by Claude Code), CHANGELOG, per-show run logs, and a tested helper at lib/sub-fetch.sh. Run on American Dad: S01 (7 eps) passed, S02-S04 (51 eps) broke. Library uses Hulu/DSP season ordering; OpenSubtitles indexes by Fox airing order; plugin queries by (parent_imdb_id, season, episode) so library S02E01 returns 0 hits. v2 design = direct OpenSubtitles REST with per-episode imdb_id lookup; pending API-key registration.
# Subtitle acquisition process — v1

Last updated: 2026-05-09

Status: **v1, partial** — passed American Dad S01 (7/7 eps), broke on S02E01 due to season-numbering mismatch. v2 design pending.

This recipe is written for Claude Code to execute. Each step lists the exact
command, what to verify, and what to do on failure. Background reference for
how Jellyfin and the OpenSubtitles plugin work together lives in
[`docs/03-subtitles.md`](../../docs/03-subtitles.md).

---

## Prereqs (verify before running)

| Check | How |
|---|---|
| OpenSubtitles plugin v20 installed + Active | `docker exec jellyfin ls /config/plugins \| grep -i opensub` |
| Plugin creds saved (`Caveman5`) | `docker exec jellyfin grep -E 'Username\|CredentialsInvalid' /config/plugins/configurations/Jellyfin.Plugin.OpenSubtitles.xml` — expect `Caveman5` and `false` |
| TV library has `SaveSubtitlesWithMedia=true`, `SubtitleDownloadLanguages=["eng"]`, `RequirePerfectSubtitleMatch=false` | `curl -s -H "X-Emby-Token: $TOK" http://localhost:8096/Library/VirtualFolders` |
| Free-tier quota remaining today (≥ episode count, else plan multi-day) | `docker logs --tail 200 jellyfin 2>&1 \| grep "Remaining downloads" \| tail -1` (free = 20/day, resets 00:00 UTC) |
| Source files have audio language tag | run `ffprobe` on a sample episode (see Step 1) |

If any prereq fails, stop and fix it before running the recipe.

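
The library-options row can be checked mechanically instead of by eye. The following is a minimal sketch, assuming the `VirtualFolders` response is a JSON array whose entries carry a `Name` and a `LibraryOptions` object with the three keys named in the table. The function name and the library name `Shows` are hypothetical.

```python
# Expected values, taken from the prereq table above.
REQUIRED_OPTIONS = {
    "SaveSubtitlesWithMedia": True,
    "SubtitleDownloadLanguages": ["eng"],
    "RequirePerfectSubtitleMatch": False,
}


def library_option_problems(virtual_folders, library_name):
    """Return a list of human-readable mismatches; an empty list means the prereq passes."""
    for folder in virtual_folders:
        if folder.get("Name") != library_name:
            continue
        # Assumption: the per-library settings live under "LibraryOptions".
        options = folder.get("LibraryOptions") or {}
        return [
            f"{key}: expected {want!r}, got {options.get(key)!r}"
            for key, want in REQUIRED_OPTIONS.items()
            if options.get(key) != want
        ]
    return [f"library {library_name!r} not found"]


# Hypothetical usage, fed from the curl command in the table:
#   problems = library_option_problems(json.load(sys.stdin), "Shows")
```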

---

## Step 1 — Probe the source

Pick one episode of the target show. Run `ffprobe` on it:
```bash
# "Input" keeps the container line (e.g. matroska), which the run log asks for
ssh user@192.168.0.100 'docker exec jellyfin /usr/lib/jellyfin-ffmpeg/ffprobe -hide_banner "<path-to-mkv>" 2>&1 | grep -E "Input|Stream|Duration"'
```
Record in the run log:

- video codec + resolution + frame rate
- audio language tag(s)
- whether any subtitle streams are embedded
- container

Decide based on probe:

| Probe result | Branch |
|---|---|
| English audio, no embedded subs | "simple" path (this recipe) |
| Foreign-dub audio, no embedded subs | "foreign-dub" path (deferred to v?) |
| Embedded English subs already present | skip — Jellyfin will use them |
| Embedded PGS/VobSub bitmap subs | "OCR" path (deferred to v?) |

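
The branch table can be expressed as a small classifier over the `Stream` lines that `ffprobe` prints. This is a sketch under stated assumptions rather than part of the tested helper: the bitmap codec names are the ones ffprobe commonly reports for PGS, VobSub and DVB, English text subs take precedence over bitmap subs, and non-English text subs are treated as if no subs were embedded. The function name is hypothetical.

```python
import re

# Matches lines such as "Stream #0:1(eng): Audio: aac (LC), 48000 Hz, stereo".
# The language tag in parentheses is optional.
STREAM_RE = re.compile(r"Stream #\d+:\d+(?:\[[^\]]*\])?(?:\((\w+)\))?: (\w+): (\w+)")

# Assumption: codec names ffprobe uses for image-based subtitle formats.
BITMAP_SUB_CODECS = {"hdmv_pgs_subtitle", "dvd_subtitle", "dvb_subtitle"}


def classify_probe(lines):
    """Map ffprobe stream lines to one of the branch names in the table above."""
    audio_langs, english_text_subs, bitmap_subs = [], False, False
    for line in lines:
        match = STREAM_RE.search(line)
        if not match:
            continue
        lang, kind, codec = match.groups()
        if kind == "Audio" and lang:
            audio_langs.append(lang)
        elif kind == "Subtitle" and codec in BITMAP_SUB_CODECS:
            bitmap_subs = True
        elif kind == "Subtitle" and lang == "eng":
            english_text_subs = True
    if not audio_langs:
        # The prereq table requires a language tag on the audio stream.
        raise ValueError("no audio language tag found")
    if english_text_subs:
        return "skip"
    if bitmap_subs:
        return "ocr"
    return "simple" if "eng" in audio_langs else "foreign-dub"
```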

---

## Step 2 — Resolve series + episode IDs

Search for the series by name to get its Id:
```bash
TOK='<jellyfin-admin-token>'
SERIES_NAME='American Dad'
ssh user@192.168.0.100 "docker exec jellyfin curl -s -H 'X-Emby-Token: $TOK' \
  'http://localhost:8096/Items?searchTerm=${SERIES_NAME// /+}&IncludeItemTypes=Series&Recursive=true&Limit=3'" \
  | python3 -c "import json,sys; [print(x['Id'],x['Name']) for x in json.load(sys.stdin).get('Items',[])]"
```
Record the series Id, then list its episodes:
```bash
SERIES='<series-id>'
# "or 0" keeps the listing alive for items without a season/episode number (e.g. specials)
ssh user@192.168.0.100 "docker exec jellyfin curl -s -H 'X-Emby-Token: $TOK' \
  'http://localhost:8096/Items?ParentId=$SERIES&IncludeItemTypes=Episode&Recursive=true&Fields=Path,ParentIndexNumber,IndexNumber'" \
  | python3 -c "import json,sys; [print(e['Id'],'S%02dE%02d'%(e.get('ParentIndexNumber') or 0,e.get('IndexNumber') or 0),e['Name']) for e in json.load(sys.stdin)['Items']]"
```

---

## Step 3 — Validate season numbering against OpenSubtitles

> ⚠️ **Critical, added in v2** (currently provisional — see CHANGELOG): some shows
> are catalogued differently across services. American Dad is the canonical
> example: Hulu/DSP carriers split the original Fox 23-ep S1 into Hulu S1 (7
> eps) + S2 (16 eps). OpenSubtitles indexes by Fox airing order. The plugin
> queries by `(parent_imdb_id, season, episode)` so library-side Hulu numbering
> returns 0 results past the first 7 episodes.

How to check:

1. Pick the first episode of season 2 in the library.
2. Run only the search call from Step 4 (`RemoteSearch/Subtitles/eng`) against it; do not download.
3. If results > 0, the numbering matches OpenSubtitles. Proceed.
4. If results == 0, quota remains (see Quota hygiene), and the show exists on opensubtitles.com, the numbering is mismatched. **Stop**. Fix metadata first or use the v2 direct-API path (TBD).

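
To see why the query misses, it helps to translate a library episode number into airing order by absolute position. The sketch below is an illustration only, not the v2 design: it assumes both schemes list episodes in the same order and differ only in where seasons are cut, which this recipe has not verified. The function name is hypothetical; the season lengths in the comment come from the American Dad note above.

```python
def to_airing_number(season, episode, library_lengths, airing_lengths):
    """Convert a library (season, episode) to airing-order (season, episode).

    library_lengths / airing_lengths are episode counts per season, season 1 first.
    """
    if not 1 <= season <= len(library_lengths):
        raise ValueError(f"unknown library season {season}")
    if not 1 <= episode <= library_lengths[season - 1]:
        raise ValueError(f"season {season} has no episode {episode}")
    # Position of the episode counted from the start of the series.
    absolute = sum(library_lengths[: season - 1]) + episode
    for airing_season, length in enumerate(airing_lengths, start=1):
        if absolute <= length:
            return airing_season, absolute
        absolute -= length
    raise ValueError("episode falls beyond the known airing seasons")


# Hulu S1 (7 eps) + S2 (16 eps) versus Fox S1 (23 eps):
# library S02E01 is the 8th episode aired, so the plugin's (season=2, episode=1)
# query does not point at it.
print(to_airing_number(2, 1, [7, 16], [23]))
```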

---

## Step 4 — Fetch subs per episode

Run this once per episode. The helper script lives at `processes/subtitles/lib/sub-fetch.sh`
(promoted from `/tmp` once stable; see CHANGELOG v0→v1).
```bash
TOK='<token>'
EP='<episode-id>'
MEDIA_DIR='/home/user/media/tv/<Show>/Season XX'
MEDIA_BASE='<Show> - SxxExx - <Title>'

# 1. search
RAW=$(ssh user@192.168.0.100 "docker exec jellyfin curl -s -H 'X-Emby-Token: $TOK' \
  'http://localhost:8096/Items/$EP/RemoteSearch/Subtitles/eng'")

# 2. pick best non-HI/non-MT/non-AI/non-Forced match, prefer 23.976fps, then highest DownloadCount
SUBID=$(printf '%s' "$RAW" | python3 -c "
import json,sys
subs=json.load(sys.stdin)
clean=[s for s in subs if not (s.get('HearingImpaired') or s.get('MachineTranslated') or s.get('AiTranslated') or s.get('Forced'))]
if not clean: clean=subs  # nothing unflagged: fall back to every result
# 'or 0' guards against null FrameRate / DownloadCount values
fps2398=[s for s in clean if abs((s.get('FrameRate') or 0)-23.976)<0.01]
pool=fps2398 if fps2398 else clean
pool.sort(key=lambda s: -(s.get('DownloadCount') or 0))
print(pool[0]['Id'] if pool else '')")

if [ -z "$SUBID" ]; then
  # an empty pick would POST to a malformed URL; check quota before logging a miss
  echo "no subtitle candidates for $EP" >&2
else
  # 3. download (returns 204)
  ssh user@192.168.0.100 "docker exec jellyfin curl -s -X POST -H 'X-Emby-Token: $TOK' \
    'http://localhost:8096/Items/$EP/RemoteSearch/Subtitles/$SUBID' -w 'HTTP %{http_code}\n'"

  # 4. plugin saves to /config/metadata/library/<shard>/<itemId>/<base>.eng.srt
  #    NOT next to media (manual-pick path ignores SaveSubtitlesWithMedia).
  #    Move it into place:
  SHARD="${EP:0:2}"
  ssh user@192.168.0.100 "docker cp \"jellyfin:/config/metadata/library/$SHARD/$EP/$MEDIA_BASE.eng.srt\" \
    \"$MEDIA_DIR/\""
fi
```
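
The selection rule in step 2 of the loop can be exercised offline before spending quota. Below is a standalone sketch of the same rule as a function, assuming each search result is a dict with the field names used above (`HearingImpaired`, `MachineTranslated`, `AiTranslated`, `Forced`, `FrameRate`, `DownloadCount`, `Id`). The function name and the sample Ids in the comments are hypothetical; it treats missing or null numeric fields as 0.

```python
def pick_best(subs, target_fps=23.976):
    """Return the Id of the preferred subtitle, or '' when there are no results."""
    # Drop hearing-impaired, machine/AI-translated and forced tracks when possible.
    clean = [
        s for s in subs
        if not (s.get("HearingImpaired") or s.get("MachineTranslated")
                or s.get("AiTranslated") or s.get("Forced"))
    ]
    pool = clean or list(subs)  # fall back to every result if all are flagged
    # Prefer tracks whose frame rate matches the source.
    matching = [s for s in pool if abs((s.get("FrameRate") or 0) - target_fps) < 0.01]
    pool = matching or pool
    # Highest download count wins; max() keeps the first of any tie.
    best = max(pool, key=lambda s: s.get("DownloadCount") or 0, default=None)
    return best["Id"] if best else ""


sample = [
    {"Id": "a", "HearingImpaired": True, "FrameRate": 23.976, "DownloadCount": 900},
    {"Id": "b", "FrameRate": 25.0, "DownloadCount": 500},
    {"Id": "c", "FrameRate": 23.976, "DownloadCount": 100},
    {"Id": "d", "FrameRate": None, "DownloadCount": None},
]
print(pick_best(sample))  # "c": unflagged and frame-rate matched, despite fewer downloads
```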
Verify after each batch:

```bash
# anchored pattern so only names ending in .eng.srt are counted
ssh user@192.168.0.100 'ls "<media-dir>/" | grep -c "\.eng\.srt$"'
```

---

## Step 5 — Clean up duplicates + library scan

The metadata-cache copy and the media-folder sidecar both register as
subtitle streams in Jellyfin (counted twice). Delete the cache copies:
```bash
ssh user@192.168.0.100 'docker exec jellyfin bash -c "find /config/metadata/library -path \"*<show-name>*S0[1-9]E*.eng.srt\" -delete -print"'
```

Trigger a validation-only refresh so Jellyfin sees the new sidecars:

```bash
ssh user@192.168.0.100 "docker exec jellyfin curl -s -X POST -H 'X-Emby-Token: $TOK' \
  'http://localhost:8096/Items/$SERIES/Refresh?MetadataRefreshMode=ValidationOnly&Recursive=true'"
```

Confirm one episode has exactly 1 external eng sub stream:
```bash
# count only external English subtitle streams, which is what the check asks for
ssh user@192.168.0.100 "docker exec jellyfin curl -s -H 'X-Emby-Token: $TOK' \
  'http://localhost:8096/Items/<sample-ep-id>?Fields=MediaStreams'" \
  | python3 -c "import json,sys; subs=[s for s in json.load(sys.stdin).get('MediaStreams',[]) if s['Type']=='Subtitle' and s.get('IsExternal') and s.get('Language')=='eng']; print(len(subs),'external eng sub streams')"
```

---

## Step 6 — Quality gate

For the run to pass:

- [ ] **Coverage**: every episode has a matching `<base>.eng.srt` sidecar
- [ ] **Sync sample**: at least one episode of each season is opened in
  Jellyfin web and subs visually align with audio (±1 s) on a known dialogue
  line
- [ ] **Flag check**: no `.sdh.srt`, `.forced.srt`, or `.hi.srt` files
  (machine pick should have filtered)
- [ ] **Stream count**: Jellyfin shows exactly 1 external eng sub per episode

If any check fails, log it in `runs/<show>.md` under "breakage" and propose
the recipe amendment in `CHANGELOG.md`.

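
The coverage and flag checks lend themselves to a quick script over a season folder listing. This is a minimal sketch, assuming video files end in `.mkv` or `.mp4` and that a sidecar is named exactly `<base>.eng.srt`. Both function names are hypothetical, and the input is a plain list of file names such as the output of `ls`.

```python
from pathlib import PurePosixPath

# Assumption: the container extensions present in the library.
VIDEO_EXTS = {".mkv", ".mp4"}
FLAGGED_SUFFIXES = (".sdh.srt", ".forced.srt", ".hi.srt")


def missing_sidecars(filenames):
    """Return video files that have no matching <base>.eng.srt in the same listing."""
    names = set(filenames)
    missing = []
    for name in sorted(names):
        path = PurePosixPath(name)
        if path.suffix.lower() in VIDEO_EXTS and f"{path.stem}.eng.srt" not in names:
            missing.append(name)
    return missing


def flagged_sidecars(filenames):
    """Return subtitle files whose suffix marks them as SDH, forced or HI."""
    return sorted(n for n in filenames if n.lower().endswith(FLAGGED_SUFFIXES))


listing = [
    "American Dad - S01E01 - Pilot.mkv",
    "American Dad - S01E01 - Pilot.eng.srt",
    "American Dad - S01E02 - Threat Levels.mkv",
]
print(missing_sidecars(listing))  # the S01E02 file has no sidecar yet
```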

---

## Quota hygiene

Free OpenSubtitles.com account = 20 downloads / day, resets 00:00 UTC.
Plan large series across multiple days, or switch to VIP (~$3/mo, unlimited).

Quota check:
```bash
ssh user@192.168.0.100 'docker logs --tail 200 jellyfin 2>&1 | grep "Remaining downloads" | tail -1'
```
When quota hits 0 the API returns 0 results, indistinguishable from a real
miss. Always check quota before declaring a "no subs" failure.

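
Multi-day planning is simple arithmetic on the free-tier limit. The sketch below assumes the log line contains the phrase `Remaining downloads` followed somewhere by an integer; the exact log format is not confirmed here, and both function names are hypothetical. The 51-episode figure is the S02-S04 batch from the American Dad run.

```python
import math
import re

FREE_DAILY_QUOTA = 20  # free tier, resets 00:00 UTC


def parse_remaining(log_line):
    """Extract the remaining-download count from a plugin log line, or None."""
    # Assumption: the count is the first integer after the phrase.
    match = re.search(r"Remaining downloads\D*(\d+)", log_line)
    return int(match.group(1)) if match else None


def days_needed(episodes, remaining_today, daily_quota=FREE_DAILY_QUOTA):
    """Number of calendar days (today included) to fetch one sub per episode."""
    if episodes <= remaining_today:
        return 1
    # Today uses up what is left; every later day gets a full quota.
    return 1 + math.ceil((episodes - remaining_today) / daily_quota)


print(days_needed(51, 20))  # 3: 20 today, 20 tomorrow, 11 the day after
```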