docs(32): import lex-fridman-podcast s01 (5 eps) to stock Jellyfin Podcasts

s8n 2026-05-11 15:45:23 +01:00
parent 6e336d1798
commit 690ea117c3

# lex-fridman-podcast-s01-yt-import
Second YouTube import into the **STOCK** Jellyfin at `tv.s8n.ru` (container
`jellyfin-stock`), this time targeting the **Podcasts** library. Five episodes
of the Lex Fridman Podcast — treated as one Series (`Lex Fridman Podcast`) on
Season 01, with the podcast episode number used directly as the episode index
(E400, E461, E478, E479, E481).
Independent from arrflix prod (`arrflix.s8n.ru`) and arrflix dev. Stock
Jellyfin's Podcasts library has `EnableInternetProviders=false` — files land
with filename/folder-only metadata. **No TMDb/TVDB matching is expected or
attempted.** Mirrors the pattern set by `benn-jordan-s01-yt-import.md`
(commit `6e336d1`).
## Provenance
- **Source:** YouTube channel "Lex Fridman" (podcast back catalogue picks)
- **Tool:** `yt-dlp` 2026.03.17 on onyx
- **Format selector (1080p cap):** `bv*[height<=1080][ext=mp4]+ba[ext=m4a]/b[height<=1080][ext=mp4]/bv*[height<=1080]+ba/b[height<=1080]/b` with `--merge-output-format mp4`
- **Subs:** `--write-subs --sub-langs "en.*" --embed-subs --convert-subs srt` — user-uploaded en subs embedded as `mov_text` when present (4/5 eps); E478 had no en track on YouTube.
- **Staging path on onyx:** `/home/admin/staging-jelly/Lex Fridman Podcast/Season 01/`
- **Parallel downloads:** 5 jobs spawned simultaneously, master wrapper `wait`-blocked until ALL exited 0 before rsync (lesson from prior run — never race rsync against in-flight downloads).
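
A minimal sketch of how the per-episode `yt-dlp` command line could be assembled from the flags listed above. The helper name `build_ytdlp_argv`, the `file_stem` parameter, and the `-o` output template are assumptions for illustration; the actual wrapper script is not reproduced in this document.

```python
from pathlib import Path

# Format selector copied verbatim from the Provenance list.
FORMAT_SELECTOR = (
    "bv*[height<=1080][ext=mp4]+ba[ext=m4a]/b[height<=1080][ext=mp4]"
    "/bv*[height<=1080]+ba/b[height<=1080]/b"
)
STAGING_DIR = Path("/home/admin/staging-jelly/Lex Fridman Podcast/Season 01")


def build_ytdlp_argv(video_id: str, file_stem: str) -> list[str]:
    """Return a yt-dlp command line for one episode (hypothetical helper)."""
    return [
        "yt-dlp",
        "-f", FORMAT_SELECTOR,
        "--merge-output-format", "mp4",
        "--write-subs", "--sub-langs", "en.*",
        "--embed-subs", "--convert-subs", "srt",
        # Assumed output template: %(ext)s lets yt-dlp fill in the extension.
        "-o", str(STAGING_DIR / f"{file_stem}.%(ext)s"),
        f"https://www.youtube.com/watch?v={video_id}",
    ]
```

Passing the arguments as a list (rather than one shell string) avoids quoting problems with the brackets and `*` in the selector.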
### Source URLs
| Episode | Video ID | URL |
|---|---|---|
| S01E400 | JN3KPFbWCy8 | https://www.youtube.com/watch?v=JN3KPFbWCy8 |
| S01E461 | tNZnLkRBYA8 | https://www.youtube.com/watch?v=tNZnLkRBYA8 |
| S01E478 | jdCKiEJpwf4 | https://www.youtube.com/watch?v=jdCKiEJpwf4 |
| S01E479 | HsLgZzgpz9Y | https://www.youtube.com/watch?v=HsLgZzgpz9Y |
| S01E481 | SvKv7D4pBjE | https://www.youtube.com/watch?v=SvKv7D4pBjE |

The original YouTube titles carried a ` | Lex Fridman Podcast #XXX` suffix and a
colon after the guest name (`Guest:`). The suffix was stripped and the colon
replaced with ` - ` before filename construction, per the playbook filename
rules (no forbidden chars `< > : " / \ | ? *`). Ampersands, commas,
apostrophes, and hyphens were all preserved.
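
The title clean-up described above can be sketched roughly as follows. The function name `episode_filename` is hypothetical, and the raw-title shape (`Guest: topics | Lex Fridman Podcast #NNN`) and the ` - ` colon replacement are inferred from the landed filenames rather than taken from the playbook itself.

```python
import re

FORBIDDEN = '<>:"/\\|?*'  # characters the playbook bans from filenames
SUFFIX_RE = re.compile(r"\s*\|\s*Lex Fridman Podcast #\d+\s*$")


def episode_filename(youtube_title: str, episode: int) -> str:
    """Build the canonical mp4 name from a raw YouTube title (illustrative)."""
    title = SUFFIX_RE.sub("", youtube_title)     # drop " | Lex Fridman Podcast #NNN"
    title = title.replace(": ", " - ")           # "Guest: topics" -> "Guest - topics"
    title = "".join(ch for ch in title if ch not in FORBIDDEN)
    title = re.sub(r"\s+", " ", title).strip()   # tidy any whitespace left behind
    return f"Lex Fridman Podcast - S01E{episode:03d} - {title}.mp4"
```

Ampersands, commas, apostrophes, and hyphens pass through untouched because they are not in `FORBIDDEN`.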
## Target
- **Server:** `jellyfin-stock` (container) on nullstone, exposed at `https://tv.s8n.ru`
- **Library:** Podcasts (tvshows-type, internet providers disabled)
- **Path on host:** `/home/user/media/podcasts/Lex Fridman Podcast/Season 01/`
- **Container view:** `/media/podcasts/Lex Fridman Podcast/Season 01/`
- **Series Item ID:** `6c01ab0084d87b94c124948f64f87c15`
- **Season Item ID:** `67d2aaba01fe73f2ba90e36514823632`
### Per-episode landing
| Episode | File size | Duration (spec) | Duration (Jellyfin) | Item ID |
|---|---:|---:|---:|---|
| S01E400 — Elon Musk - War, AI, Aliens, Politics, Physics, Video Games, and Humanity | 419,097,052 B (~400 MiB) | 8206 s | 8206 s | `5266b338705003d6fd04e315a01cd7fe` |
| S01E461 — ThePrimeagen - Programming, AI, ADHD, Productivity, Addiction, and God | 1,196,404,821 B (~1.11 GiB) | 19208 s | 19207 s | `b68e7628784ebdfafa21c3412bcb31f0` |
| S01E478 — Scott Horton - The Case Against War and the Military Industrial Complex | 1,830,927,069 B (~1.70 GiB) | 37591 s | 37590 s | `9baab6a35e3c0f32f4776e9aa379745d` |
| S01E479 — Dave Plummer - Programming, Autism, and Old-School Microsoft Stories | 583,179,948 B (~556 MiB) | 6628 s | 6628 s | `f33bf1d068c3c4771c8744f655256829` |
| S01E481 — Norman Ohler - Hitler, Nazis, Drugs, WW2, Blitzkrieg, LSD, MKUltra & CIA | 933,193,939 B (~890 MiB) | 15944 s | 15944 s | `b5946af6a55919391b227c7893a73059` |

Total on disk is ~4.96 GB (~4.62 GiB) across 5 mp4s, per the byte counts in
the table. The 1080p cap kept the 10.4-hour E478 to ~1.70 GiB; at 4K it would
likely have ballooned past 50 GB.

Jellyfin's ffprobe is off by 1 s on E461/E478 (rounding down versus YouTube's
declared seconds). This is within tolerance; no correction is needed.
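
As a quick cross-check of the table, the totals and the duration drift can be recomputed from the listed byte counts and durations. This is only arithmetic on the figures above; the variable names are illustrative.

```python
# Byte counts and durations copied from the per-episode table.
# episode -> (bytes, spec seconds, Jellyfin seconds)
EPISODES = {
    400: (419_097_052, 8206, 8206),
    461: (1_196_404_821, 19208, 19207),
    478: (1_830_927_069, 37591, 37590),
    479: (583_179_948, 6628, 6628),
    481: (933_193_939, 15944, 15944),
}

total_bytes = sum(size for size, _, _ in EPISODES.values())
total_gb = total_bytes / 1e9       # decimal gigabytes
total_gib = total_bytes / 2**30    # binary gibibytes

# Largest gap between the declared and the probed duration, in seconds.
max_drift = max(abs(spec - probed) for _, spec, probed in EPISODES.values())

# Average overall bitrate of the longest episode (E478), in kb/s.
size_478, spec_478, _ = EPISODES[478]
kbps_478 = size_478 * 8 / spec_478 / 1000

print(f"{total_bytes:,} B = {total_gb:.2f} GB = {total_gib:.2f} GiB")
print(f"max duration drift: {max_drift} s, E478 average: {kbps_478:.0f} kb/s")
# 4,962,802,829 B = 4.96 GB = 4.62 GiB
# max duration drift: 1 s, E478 average: 390 kb/s
```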
## Counts
| | Before | After | Delta |
|---|---:|---:|---:|
| SeriesCount (Podcasts) | 0 | 1 | +1 |
| EpisodeCount (Podcasts) | 0 | 5 | +5 |

(First import into the Podcasts library; pre-state empty.)
## Stream sample (S01E479)
```
major_brand : isom
Duration: 01:50:28.48, start: 0.000000, bitrate: 703 kb/s
Stream #0:0[0x1](und): Video: av1 (libdav1d) (Main) (av01 / 0x31307661), yuv420p(tv, bt709), 1920x1080, 568 kb/s, 30 fps
Stream #0:1[0x2](eng): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 127 kb/s
Stream #0:2[0x3](eng): Subtitle: mov_text (tx3g / 0x67337874), 0 kb/s
```
AV1 1080p30 ~568 kb/s + AAC stereo ~127 kb/s + embedded `mov_text` en subs.
Source is YouTube's best 1080p mp4/m4a combo. AV1 direct-play requires a
recent client (Chromium ≥ 90, Firefox ≥ 100, Safari 17+ on Apple hardware with
an AV1 decoder such as M3 or A17 Pro, Android 12+, modern smart TVs); otherwise
`jellyfin-stock` will CPU-transcode (no GPU mount per SYSTEM.md).
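
The stream sample can be sanity-checked against the landing table with a few lines of arithmetic. The sketch below parses the ffprobe-style duration and recomputes the overall bitrate from the S01E479 file size; `parse_duration` is a hypothetical helper, not part of any tool used in the run.

```python
def parse_duration(hhmmss: str) -> float:
    """Convert an ffprobe-style 'HH:MM:SS.ss' duration to seconds."""
    hours, minutes, seconds = hhmmss.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


duration_s = parse_duration("01:50:28.48")   # Duration line of the stream sample
file_bytes = 583_179_948                     # S01E479 size from the landing table
overall_kbps = file_bytes * 8 / duration_s / 1000

stream_sum_kbps = 568 + 127                  # video + audio rates from the sample
print(f"{duration_s:.2f} s, overall {overall_kbps:.1f} kb/s, streams {stream_sum_kbps} kb/s")
```

The recomputed overall rate lands at roughly 703.8 kb/s, in line with the `703 kb/s` ffprobe reports. The few kb/s beyond the two stream rates are presumably container overhead and the subtitle track.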
## Subtitle status
- Embedded `mov_text` (en): **yes** for E400 / E461 / E479 / E481 (user-uploaded en track present on the YouTube upload — yt-dlp embedded via `--embed-subs`).
- E478: no en track available on YouTube — no embedded sub. Player will fall back to no subs unless auto-CC sidecar is fetched later (`--write-auto-subs --sub-langs "en.*"`).
- External sidecar: none.
- Action: leave as-is for now. If E478 subs become required, re-fetch auto-CC and drop a `.eng.srt` next to the mp4 per `playbooks/subtitles/`.
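
If the E478 re-fetch becomes necessary, a captions-only invocation could look roughly like this. Both helper names are hypothetical. `--skip-download` is an addition not mentioned above (it keeps yt-dlp from re-downloading the video), and the `.eng.srt` naming follows the Action bullet rather than the playbook itself.

```python
from pathlib import Path


def auto_sub_argv(video_id: str, out_stem: Path) -> list[str]:
    """yt-dlp command that fetches only auto-generated English captions."""
    return [
        "yt-dlp",
        "--skip-download",                  # captions only, keep the existing mp4
        "--write-auto-subs", "--sub-langs", "en.*",
        "--convert-subs", "srt",
        "-o", f"{out_stem}.%(ext)s",        # assumed output template
        f"https://www.youtube.com/watch?v={video_id}",
    ]


def sidecar_path(mp4: Path) -> Path:
    """Name of the external subtitle file expected next to the mp4."""
    return mp4.with_suffix(".eng.srt")
```

yt-dlp names subtitle files after the language code it downloaded, so a rename to the `.eng.srt` sidecar name would likely still be needed after the fetch.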
## Verification checks
- [x] Folder/filename canonical (`Lex Fridman Podcast/Season 01/Lex Fridman Podcast - S01E<NNN> - <Title>.mp4`)
- [x] Permissions `user:user` 644 / 755 on nullstone
- [x] `Scan Media Library` task triggered via `/ScheduledTasks/Running/$SCAN_ID` (HTTP 204) — completed at 2026-05-11T14:43:12Z
- [x] **Note:** the initial scan created the Series and Season stubs but left ChildCount=0. A follow-up POST to `/Items/$SERIES_ID/Refresh?MetadataRefreshMode=FullRefresh&Recursive=true` (HTTP 204) was required to pull the 5 episodes into the index. The benn-jordan run did *not* need this. The cause is unconfirmed: possibly the punctuation in Lex's filenames (`,` `&` `-`) makes Jellyfin's series-stub-first heuristic slower to reconcile when the discovery probe is racing the scan. Documented here so the next operator knows the second-pass refresh is sometimes load-bearing.
- [x] Per-series query `/Shows/$SERIES_ID/Episodes?Season=1` returns 5 episodes with correct durations
- [x] No `/Items/Counts` reliance — used `/Shows/<id>/Episodes` as authoritative
- [n/a] `ProviderIds` populated — **expected empty**, Podcasts library has internet providers OFF
- [x] `ImageTags.Primary` populated on all 5 — Jellyfin extracted thumbnail from mp4 itself
### Scan task
- **Task ID:** `7738148ffcd07979c7ceb148e06b3aed`
- **POST result:** HTTP 204
- **StartTime (initial scan):** `2026-05-11T14:43:01.031Z`
- **EndTime (initial scan):** `2026-05-11T14:43:12.285Z` (11 s)
- **Follow-up series refresh:** POST `/Items/$SERIES_ID/Refresh` returned HTTP 204; episodes appeared in season within ~3 s.
- **State after run:** `Idle`
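
The scan, refresh, and verify sequence above can be sketched as three requests built with the standard library. The endpoints and query strings are the ones recorded in this document. The `X-Emby-Token` auth header, the helper names, and the response fields `Items`, `IndexNumber`, and `RunTimeTicks` (100 ns ticks) are assumptions about the Jellyfin API to confirm against the server's API docs. The sketch only builds the requests and sends nothing.

```python
import urllib.request
from urllib.parse import urlencode

BASE_URL = "https://tv.s8n.ru"      # from the Target section
TICKS_PER_SECOND = 10_000_000       # assumed: RunTimeTicks are 100 ns units


def api_request(path: str, token: str, method: str = "GET", **query) -> urllib.request.Request:
    """Build (not send) an authenticated request; the auth header is assumed."""
    url = f"{BASE_URL}{path}"
    if query:
        url += "?" + urlencode(query)
    return urllib.request.Request(url, method=method, headers={"X-Emby-Token": token})


def scan_then_refresh(scan_id: str, series_id: str, token: str) -> list[urllib.request.Request]:
    """The three calls from this run, in order: scan, series refresh, listing."""
    return [
        api_request(f"/ScheduledTasks/Running/{scan_id}", token, method="POST"),
        api_request(f"/Items/{series_id}/Refresh", token, method="POST",
                    MetadataRefreshMode="FullRefresh", Recursive="true"),
        api_request(f"/Shows/{series_id}/Episodes", token, Season=1),
    ]


def episode_durations(payload: dict) -> dict[int, int]:
    """Map episode index -> whole seconds from an Episodes response body."""
    return {
        item["IndexNumber"]: item["RunTimeTicks"] // TICKS_PER_SECOND
        for item in payload["Items"]
    }
```

Comparing `episode_durations(...)` against the five expected indices (400, 461, 478, 479, 481) gives a simple pass/fail for the "5 episodes with correct durations" check. The floor division mirrors the round-down behaviour noted for E461/E478.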
## Notes / surprises
- Stock Jellyfin's Podcasts library is configured `tvshows`-type with
`EnableInternetProviders=false`. This matches the Educational library set
up for Benn Jordan — same pattern, different folder. **Do not try to
TMDb-identify Lex Fridman Podcast episodes; the Podcasts library is
deliberately offline.**
- Used podcast episode number as the season-1 episode index. E400/E461/E478/E479/E481
is consciously sparse — Jellyfin handles non-contiguous episode numbers fine,
and using the canonical podcast number means there's no ambiguity when an
operator looks at the UI and matches "Lex #481" to a file.
- All 5 downloads ran in parallel from onyx via a wrapper script
(`/tmp/lex-download.sh`) which `wait`-blocked on every job PID before
exiting. The wrapper's exit code (0) gated the rsync step — addressing the
"rsync raced partial downloads" failure mode from a prior YouTube import.
- E478 is 10.4 hours (`Scott Horton - The Case Against War and the Military
Industrial Complex`). Capped at 1080p it weighs in at ~1.70 GiB, or ~390 kb/s
overall (1,830,927,069 B over 37591 s). At 4K it would likely have exceeded
50 GB and buried the disk.
The format selector `bv*[height<=1080]` is now the standing rule for any
podcast-style long-form import.
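- rsync ran at ~61 MB/s onyx → nullstone over the 1G LAN (~4.96 GB in ~80 s).
No `--info=progress2` surprises. After a disconnect, a re-run skips files that
already transferred in full; resuming a partially transferred file would
additionally need `--partial`.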
- Source staging dir on onyx (`/home/admin/staging-jelly/Lex Fridman Podcast/`)
is intentionally left in place — do not delete until owner confirms
playback.
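
The `wait`-gated wrapper described in the notes above was a shell script. The same gating logic can be sketched in Python as follows. The function names are hypothetical and the demo jobs are stand-ins; the real run would pass five `yt-dlp` commands and one `rsync` command.

```python
import subprocess
import sys


def run_all(commands: list[list[str]]) -> int:
    """Start every command at once, wait for all, return 0 only if all succeeded."""
    procs = [subprocess.Popen(cmd) for cmd in commands]
    # Waiting on every process mirrors the shell wrapper's `wait` on each job PID.
    exit_codes = [proc.wait() for proc in procs]
    return 0 if all(code == 0 for code in exit_codes) else 1


def gated_sync(downloads: list[list[str]], sync_cmd: list[str]) -> bool:
    """Run the sync step only when every download exited 0."""
    if run_all(downloads) != 0:
        return False
    return subprocess.run(sync_cmd).returncode == 0


if __name__ == "__main__":
    # Stand-in jobs that exit 0 immediately.
    ok = [sys.executable, "-c", "pass"]
    print(gated_sync([ok, ok], ok))
```

Because `run_all` collects every exit code before returning, the sync step cannot start while any download is still in flight, which is the failure mode the wrapper was written to prevent.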
## Operator action
1. Open `https://tv.s8n.ru` → Podcasts library → confirm "Lex Fridman Podcast"
series shows 5 episodes (numbered 400 / 461 / 478 / 479 / 481).
2. Play any episode → confirm direct-play on a modern AV1-capable client (no
transcode line in `docker logs jellyfin-stock`). On older clients expect
CPU transcode.
3. Optional: upload custom series poster + per-episode artwork via the
Jellyfin web UI (no TMDb fallback, so artwork is manual or absent).
4. Source dir on onyx retained per cleanup policy.