processes/subtitles: STYLE.md — USER-G style for picks + display

Codifies the bar every fetch must hit:
- one plain English .srt per episode, named <base>.eng.srt
- drop SDH / HearingImpaired / MachineTranslated / AiTranslated / Forced
- prefer fps-matched, then highest download count
- UI label collapsed to just "English" via web-overrides shim

README.md now points at STYLE.md as a hard prereq before any fetch run.
The picker logic in v1/v2/v3 helpers already encodes these rules; STYLE.md
is the canonical source of truth that they should be checked against if
anyone (or future-me) is tempted to relax them.
This commit is contained in:
s8n 2026-05-10 00:13:51 +01:00
parent b3ead71b7e
commit 7ce1539ea7
2 changed files with 79 additions and 0 deletions

View file

@ -8,6 +8,12 @@ command, what to verify, and what to do on failure. Background reference for
how Jellyfin and the OpenSubtitles plugin work together lives in
[`docs/03-subtitles.md`](../../docs/03-subtitles.md).
> **Read [`STYLE.md`](STYLE.md) first.** Every fetch must hit the
> bar set there: one English `.srt` per episode, plain (no SDH / no MT / no
> AI / no Forced), best-quality release. The picker logic in v1/v2/v3
> mirrors that bar; if a step would violate it, stop and ask before
> downloading.
---
## Prereqs (verify before running)

View file

@ -0,0 +1,73 @@
# Subtitle USER-G style — ARRFLIX
The bar every fetch should hit. If a recipe step would violate any of these,
stop and ask before proceeding.
## What lands on disk
- **Exactly one** English subtitle file per episode.
- Filename: `<videobasename>.eng.srt` — no language-region tags (`en-US`),
no flag stack on regular subs (no `.sdh`, no `.forced`, no `.cc` unless
there genuinely is no plain-English option).
- Format: `.srt` (SubRip text). Skip `.ass`, `.ssa`, `.vtt`, `.sup`, `.idx`
unless the source has nothing else; convert with `ffmpeg -map 0:s:0 -c:s srt`
in that case.
- Encoding: UTF-8. Re-encode with `iconv` if a sidecar comes back as cp1252
/ windows-1250.
## What gets picked
In order:
1. **English** language — `eng` / `en`. Never auto-pick `en-US`/`en-GB`
variants over plain `en`; treat them equivalent for matching.
2. **No SDH / Hearing Impaired** — drop any sub flagged `hearing_impaired`,
`sdh`, `cc`. Only fall back to SDH if no plain-English option exists.
3. **No machine / AI translation** — drop `machine_translated`,
`ai_translated`. Hand-authored subs only.
4. **No forced subtitles** — drop `foreign_parts_only` / `Forced` unless the
episode has English audio with foreign-language scenes that need
translation (rare for US shows).
5. **Frame-rate match** — prefer entries whose declared fps matches the
source video (typically 23.976 for our masters). Treat `0.0` as unknown
and fall through to step 6.
6. **Highest download count** within the surviving candidates — proxies for
"the version everyone agreed was best."
After fetch, **eyeball-verify one sample episode per show** plays in sync
(±1 s on a known dialogue line) before declaring the show done.
## What doesn't ship
- Multiple language tracks per episode (no German/French alternatives —
English-only library).
- Director's commentary, behind-the-scenes, song-only subs.
- Subs that cover only a partial runtime (the partial-cover heuristic isn't
scripted yet; spot-check duration vs episode runtime if a srt looks short).
- "All-episodes-in-one" mega-packs treated as a single episode's sidecar.
## How the UI presents subs
The detail-page subtitle dropdown is shimmed via
`web-overrides/index.html` (markers `SUB-LABEL-SHIM-BEGIN/END`). Stock
Jellyfin shows e.g. `English - SUBRIP - External - Default`; the shim
collapses to `English`, with `(Forced)` / `(SDH)` / `(Hearing Impaired)`
suffixes only when those flags actually apply. `Default` is dropped — it's
redundant when there's only one stream per language.
Revert: `bin/revert-sub-label-shim.sh`.
## Why these rules
- Boutique-release-group quality bar from
[`README.md`](../../README.md): "every show and film is the best version
I could put together."
- One-language library = one stream per ep = no need to expose codec or
source in UI.
- SDH/CC adds `[door slams]`, `[music]` etc. — distracting on first watch
and not what someone reaches for unless they specifically need it.
- Machine / AI translations are inconsistent and often wrong on slang or
show-specific terms (esp. animated comedies).
- Frame-rate-matched subs sync without manual offset on first try; mismatch
generally still works on NTSC (29.97 vs 23.976 are the same elapsed time)
but hash-match or fps-match removes that gamble.