diff --git a/processes/subtitles/README.md b/processes/subtitles/README.md index c836b6c..234b917 100644 --- a/processes/subtitles/README.md +++ b/processes/subtitles/README.md @@ -8,6 +8,12 @@ command, what to verify, and what to do on failure. Background reference for how Jellyfin and the OpenSubtitles plugin work together lives in [`docs/03-subtitles.md`](../../docs/03-subtitles.md). +> **Read [`STYLE.md`](STYLE.md) first.** Every fetch must hit the +> bar set there: one English `.srt` per episode, plain (no SDH / no MT / no +> AI / no Forced), best-quality release. The picker logic in v1/v2/v3 +> mirrors that bar; if a step would violate it, stop and ask before +> downloading. + --- ## Prereqs (verify before running) diff --git a/processes/subtitles/STYLE.md b/processes/subtitles/STYLE.md new file mode 100644 index 0000000..ac7dd69 --- /dev/null +++ b/processes/subtitles/STYLE.md @@ -0,0 +1,73 @@ +# Subtitle USER-G style — ARRFLIX + +The bar every fetch should hit. If a recipe step would violate any of these, +stop and ask before proceeding. + +## What lands on disk + +- **Exactly one** English subtitle file per episode. +- Filename: `.eng.srt` — no language-region tags (`en-US`), + no flag stack on regular subs (no `.sdh`, no `.forced`, no `.cc` unless + there genuinely is no plain-English option). +- Format: `.srt` (SubRip text). Skip `.ass`, `.ssa`, `.vtt`, `.sup`, `.idx` + unless the source has nothing else; convert with `ffmpeg -map 0:s:0 -c:s srt` + in that case. +- Encoding: UTF-8. Re-encode with `iconv` if a sidecar comes back as cp1252 + / windows-1250. + +## What gets picked + +In order: + +1. **English** language — `eng` / `en`. Never auto-pick `en-US`/`en-GB` + variants over plain `en`; treat them equivalent for matching. +2. **No SDH / Hearing Impaired** — drop any sub flagged `hearing_impaired`, + `sdh`, `cc`. Only fall back to SDH if no plain-English option exists. +3. **No machine / AI translation** — drop `machine_translated`, + `ai_translated`. Hand-authored subs only. +4. **No forced subtitles** — drop `foreign_parts_only` / `Forced` unless the + episode has English audio with foreign-language scenes that need + translation (rare for US shows). +5. **Frame-rate match** — prefer entries whose declared fps matches the + source video (typically 23.976 for our masters). Treat `0.0` as unknown + and fall through to step 6. +6. **Highest download count** within the surviving candidates — proxies for + "the version everyone agreed was best." + +After fetch, **eyeball-verify one sample episode per show** plays in sync +(±1 s on a known dialogue line) before declaring the show done. + +## What doesn't ship + +- Multiple language tracks per episode (no German/French alternatives — + English-only library). +- Director's commentary, behind-the-scenes, song-only subs. +- Subs that cover only a partial runtime (the partial-cover heuristic isn't + scripted yet; spot-check duration vs episode runtime if a srt looks short). +- "All-episodes-in-one" mega-packs treated as a single episode's sidecar. + +## How the UI presents subs + +The detail-page subtitle dropdown is shimmed via +`web-overrides/index.html` (markers `SUB-LABEL-SHIM-BEGIN/END`). Stock +Jellyfin shows e.g. `English - SUBRIP - External - Default`; the shim +collapses to `English`, with `(Forced)` / `(SDH)` / `(Hearing Impaired)` +suffixes only when those flags actually apply. `Default` is dropped — it's +redundant when there's only one stream per language. + +Revert: `bin/revert-sub-label-shim.sh`. + +## Why these rules + +- Boutique-release-group quality bar from + [`README.md`](../../README.md): "every show and film is the best version + I could put together." +- One-language library = one stream per ep = no need to expose codec or + source in UI. +- SDH/CC adds `[door slams]`, `[music]` etc. — distracting on first watch + and not what someone reaches for unless they specifically need it. +- Machine / AI translations are inconsistent and often wrong on slang or + show-specific terms (esp. animated comedies). +- Frame-rate-matched subs sync without manual offset on first try; mismatch + generally still works on NTSC (29.97 vs 23.976 are the same elapsed time) + but hash-match or fps-match removes that gamble.