Original-release bitmap subs (PGS, VobSub, dvd_subtitle) are first-class, not stop-gaps. They're the canonical studio render: bitmap encoding is just a format choice, not a quality compromise. OCR'd or AI-rebuilt sidecars introduce transcription error that the source doesn't have.

STYLE.md changes:
- New "Source priority" section with 4 tiers: original text > original bitmap > trusted text rips > WhisperX rebuild.
- "What lands on disk" loosened: at least one English stream (embedded OR sidecar), keep embedded codec as-is, sidecar still .srt.
- New "OCR bitmap -> text" section documenting pgsrip recipe as an optional UX-nicety augmentation, not a correctness fix.
- "Why these rules" now explains why original > pretty (esp. for older shows like Futurama S1-3 / early Archer where the master is the only authoritative source and upscale artifacts already dominate).

STOPGAP-SUBS.md: header note clarifying bitmap-from-disc is NOT a stop-gap; lists Lilo & Stitch (2002) and Archer (2009) S02 as examples of correct-as-shipped library entries.
Subtitle USER-G style — ARRFLIX
The bar every fetch should hit. If a recipe step would violate any of these, stop and ask before proceeding.
Source priority (highest → lowest)
Accuracy beats format. Use this tier ladder before reaching for OCR/AI:
1. Original release text subs (.srt/.ass from disc/streamer rip, embedded or sidecar). Ground truth: ship as-is.
2. Original release bitmap subs (PGS, VobSub, dvd_subtitle, embedded). Acceptable in their native form: they ARE the original words from the source master, just rendered as images. The Jellyfin server burns them in for clients that can't render them natively. Optionally OCR-extract a .srt sidecar alongside (do NOT replace the embedded stream) when client-side styling, search, or mobile rendering matters.
3. Trusted text rips from OpenSubtitles (verified uploads, hash-match or high-download-count + frame-rate-match).
4. WhisperX rebuild with --initial_prompt proper-nouns yaml: only when no original exists (e.g. user-uploaded YT content with auto-CC). Logged in STOPGAP-SUBS.md until cleared.
Tier 1 and 2 are first-class. Tier 3 is a fallback. Tier 4 is a stop-gap.
What lands on disk
- At least one English subtitle stream per episode (embedded OR sidecar — not both required).
- Sidecar filename when used: <videobasename>.eng.srt. No language-region tags (en-US), no flag stack on regular subs (no .sdh, no .forced, no .cc unless there genuinely is no plain-English option).
- Sidecar format: .srt (SubRip text). For embedded: keep the original codec (subrip, ass, pgs, vobsub, dvd_subtitle); do NOT re-mux just to convert format. Convert only when extracting to disk: text codecs via ffmpeg -map 0:s:0 -c:s srt, bitmap codecs via OCR (see "OCR bitmap → text" below).
- Encoding (sidecar): UTF-8. Re-encode with iconv if a sidecar comes back as cp1252 / windows-1250.
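A minimal sketch of the filename and encoding rules above, assuming sidecars are post-processed from Python rather than with iconv directly. The helper names sidecar_name and to_utf8 are hypothetical. The cp1252 default is only a guess at the legacy encoding: cp1252 and windows-1250 cannot be told apart reliably from the bytes alone, so pass the right one when it is known.

```python
from pathlib import Path


def sidecar_name(video: str) -> str:
    """Return '<videobasename>.eng.srt' for a video path (hypothetical helper)."""
    p = Path(video)
    # p.stem drops only the final extension, so 'Show.S01E01.mkv' -> 'Show.S01E01'
    return str(p.with_name(p.stem + ".eng.srt"))


def to_utf8(raw: bytes, legacy: str = "cp1252") -> bytes:
    """Return the subtitle bytes as UTF-8 without a BOM.

    Tries UTF-8 first; on failure, decodes with the assumed legacy code page.
    """
    try:
        text = raw.decode("utf-8-sig")  # already UTF-8; strips a BOM if present
    except UnicodeDecodeError:
        text = raw.decode(legacy)       # assumption: caller knows the code page
    return text.encode("utf-8")


if __name__ == "__main__":
    print(sidecar_name("Show.S01E01.mkv"))
    print(to_utf8("café".encode("cp1252")).decode("utf-8"))
```

Trying UTF-8 first keeps the step idempotent: running it twice on the same sidecar leaves the file unchanged.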
What gets picked
In order:
1. English language: eng/en. Never auto-pick en-US/en-GB variants over plain en; treat them as equivalent for matching.
2. No SDH / Hearing Impaired: drop any sub flagged hearing_impaired, sdh, cc. Only fall back to SDH if no plain-English option exists.
3. No machine / AI translation: drop machine_translated, ai_translated. Hand-authored subs only.
4. No forced subtitles: drop foreign_parts_only/Forced unless the episode has English audio with foreign-language scenes that need translation (rare for US shows).
5. Frame-rate match: prefer entries whose declared fps matches the source video (typically 23.976 for our masters). Treat 0.0 as unknown and fall through to step 6.
6. Highest download count within the surviving candidates: a proxy for "the version everyone agreed was best."
After fetch, eyeball-verify one sample episode per show plays in sync (±1 s on a known dialogue line) before declaring the show done.
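The selection order above can be sketched as a filter-then-rank function. This is an illustration under stated assumptions, not the fetch script itself: the Candidate fields and flag spellings are hypothetical (they are not a provider's API schema), flags are assumed to be lowercased, and an unknown fps (0.0) is read as "not a match", so it only competes when no candidate matches the video fps.

```python
from dataclasses import dataclass

# Hypothetical flag vocabularies, lowercased
SDH_FLAGS = {"hearing_impaired", "sdh", "cc"}
MT_FLAGS = {"machine_translated", "ai_translated"}
FORCED_FLAGS = {"foreign_parts_only", "forced"}


@dataclass
class Candidate:
    language: str
    fps: float = 0.0           # 0.0 means "unknown"
    downloads: int = 0
    flags: frozenset = frozenset()


def is_english(lang: str) -> bool:
    # en, eng, en-US, en-GB are all treated as equivalent
    return lang.lower().split("-")[0] in {"en", "eng"}


def pick(candidates, video_fps=23.976, allow_forced=False):
    """Return the best candidate, or None if nothing English survives."""
    pool = [c for c in candidates if is_english(c.language)]      # step 1
    pool = [c for c in pool if not (c.flags & MT_FLAGS)]          # step 3
    if not allow_forced:                                          # step 4
        pool = [c for c in pool if not (c.flags & FORCED_FLAGS)]
    plain = [c for c in pool if not (c.flags & SDH_FLAGS)]        # step 2
    pool = plain or pool            # SDH only as a fallback
    if not pool:
        return None
    matched = [c for c in pool if c.fps and abs(c.fps - video_fps) < 0.01]
    pool = matched or pool          # step 5: prefer, don't require
    return max(pool, key=lambda c: c.downloads)                   # step 6


if __name__ == "__main__":
    found = pick([
        Candidate("en", 23.976, 500, frozenset({"hearing_impaired"})),
        Candidate("en-US", 23.976, 120),
        Candidate("en", 25.0, 900),
    ])
    print(found)
```

The SDH filter runs after the machine-translation and forced filters so that the SDH fallback can never resurrect a machine-translated or forced track.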
What doesn't ship
- Multiple language tracks per episode (no German/French alternatives; English-only library). Drop non-English embedded streams only if the user complains about client picker clutter; do NOT silently strip them on import. Note that mkvpropedit only edits track properties and flags in place; actually removing a track needs an mkvmerge remux.
- Director's commentary, behind-the-scenes, song-only subs.
- Subs that cover only a partial runtime (the partial-cover heuristic isn't scripted yet; spot-check duration vs episode runtime if a srt looks short).
- "All-episodes-in-one" mega-packs treated as a single episode's sidecar.
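Since the partial-cover heuristic isn't scripted yet, here is one possible shape for it: compare the end time of the last cue with the episode runtime. The function names and the 0.8 threshold are assumptions for illustration; the runtime would have to come from elsewhere (for example ffprobe's format duration), and end credits mean a healthy sub never reaches 100 %.

```python
import re

# Matches an SRT timing line, e.g. "00:21:30,500 --> 00:21:32,000"
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def last_cue_end(srt_text: str) -> float:
    """Return the latest cue end time in seconds (0.0 if no cues found)."""
    end = 0.0
    for m in TIMING.finditer(srt_text):
        h, mi, s, ms = (int(x) for x in m.groups()[4:])  # end-time fields only
        end = max(end, h * 3600 + mi * 60 + s + ms / 1000)
    return end


def looks_partial(srt_text: str, runtime_s: float, min_ratio: float = 0.8) -> bool:
    """Flag a sub whose last cue ends before min_ratio of the runtime.

    min_ratio=0.8 is an arbitrary starting point, not a tuned value.
    """
    return last_cue_end(srt_text) < min_ratio * runtime_s


if __name__ == "__main__":
    sample = "1\n00:00:01,000 --> 00:00:03,000\nHello.\n\n2\n00:10:00,000 --> 00:10:02,500\nBye.\n"
    print(last_cue_end(sample), looks_partial(sample, runtime_s=1320))
```

This only catches subs that stop early; a sub that starts late or has a hole in the middle would still need the manual spot-check.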
OCR bitmap → text (optional, tier-2 augmentation)
Embedded PGS/VobSub/dvd_subtitle are acceptable as-is (tier 2). OCR
becomes worthwhile when: (a) client repeatedly transcodes due to bitmap
burn-in (CPU pressure on nullstone — no GPU transcode available), (b)
user wants to restyle font/size on a specific show, (c) mobile client
renders bitmap subs poorly.
Recipe (pgsrip, batch-friendly, Tesseract-backed):
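    # Requires Tesseract (with English traineddata) and MKVToolNix on PATH.
    pip install pgsrip
    # PGS: extract the first embedded subtitle stream (0:s:0) to .sup
    ffmpeg -i input.mkv -map 0:s:0 -c copy subs.sup
    pgsrip --language eng subs.sup       # -> subs.srt
    # VobSub/dvd_subtitle: extract to .idx + .sub.
    # Track ID 2 is an example; list IDs with: mkvmerge -i input.mkv
    mkvextract input.mkv tracks 2:subs.idx
    # pgsrip is documented for PGS input; confirm it accepts .idx before batching.
    pgsrip subs.idx                      # -> subs.srt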
OCR accuracy is roughly 90–95 % raw and 95–98 % after Subtitle Edit
cleanup. The source words are correct (this is transcription of the
original render, not Whisper hallucination); the remaining errors come
from font recognition. The resulting .srt ships as a sidecar alongside
the embedded bitmap stream, not as a replacement.
How the UI presents subs
The detail-page subtitle dropdown is shimmed via
web-overrides/index.html (markers SUB-LABEL-SHIM-BEGIN/END). Stock
Jellyfin shows e.g. English - SUBRIP - External - Default; the shim
collapses to English, with (Forced) / (SDH) / (Hearing Impaired)
suffixes only when those flags actually apply. Default is dropped — it's
redundant when there's only one stream per language.
Revert: bin/revert-sub-label-shim.sh.
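The collapsing rule can be illustrated with a small function. This is not the shim's code (the shim lives in web-overrides/index.html); it is a sketch of the rule only, and it assumes the stock label is a " - " separated string in which the flags appear as their own tokens. The name collapse_label is hypothetical.

```python
# Suffixes the rule keeps, in display order
KEEP = ("Forced", "SDH", "Hearing Impaired")


def collapse_label(stock: str) -> str:
    """Collapse a stock label to 'Language' plus any flag suffixes that apply."""
    parts = [p.strip() for p in stock.split(" - ")]
    language = parts[0]
    present = {p.lower() for p in parts[1:]}
    # Codec, source (External/Embedded) and Default are dropped implicitly
    suffixes = [f"({k})" for k in KEEP if k.lower() in present]
    return " ".join([language, *suffixes])


if __name__ == "__main__":
    print(collapse_label("English - SUBRIP - External - Default"))
```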
Why these rules
- Boutique-release-group quality bar from README.md: "every show and film is the best version I could put together."
- Original-release subs > pretty format. The DVD/BD/streamer master is the canonical script: bitmap or text, those are the words the studio shipped. An OCR'd or AI-rebuilt sidecar is a derivative that introduces error (font confusion, mistranscription); the original doesn't. Especially true for older shows (Futurama S1–S3, Archer early seasons) where the master is the only authoritative source and upscale artifacts already dominate the visual experience, so bitmap subs match the source vibe.
- One-language library = one stream per ep = no need to expose codec or source in UI.
- SDH/CC adds [door slams], [music] etc., which is distracting on first watch and not what someone reaches for unless they specifically need it.
- Machine / AI translations are inconsistent and often wrong on slang or show-specific terms (esp. animated comedies).
- Frame-rate-matched subs sync without manual offset on first try. A mismatch between NTSC rates generally still works (29.97 and 23.976 cover the same elapsed time), but hash-match or fps-match removes that gamble.
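The frame-rate point is easier to judge with numbers. SubRip cues are timestamps, so two masters with the same running time stay in sync whatever their frame rates. The case that does drift is a sub timed against a master with a different running time; the sketch below assumes a 25 fps release that is a plain speed-up of the 23.976 fps film master, which is a common but not universal situation. The function name is hypothetical.

```python
def speedup_drift_seconds(t_s: float,
                          film_fps: float = 24000 / 1001,
                          fast_fps: float = 25.0) -> float:
    """Seconds by which a cue at t_s on the film-rate master arrives early
    on a master sped up from film_fps to fast_fps.

    Assumes a pure speed-up with no cuts or added frames.
    """
    return t_s * (1 - film_fps / fast_fps)


if __name__ == "__main__":
    # Drift 20 minutes into an episode
    print(round(speedup_drift_seconds(1200), 2))
```

About 49 s of drift after 20 minutes is far outside the ±1 s sync check, and it grows linearly, so a fixed offset cannot repair it.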