init: media-acquisition pipeline scaffold

Self-hosted BitTorrent + arr-stack + catalog-update pipeline targeting
nullstone (Debian 13). Replaces the legacy onyx -> rsync -> import
round-trip.

Contents:
- README.md          headline + ASCII architecture diagram + quickstart
- CLAUDE.md          project rules (mirrors beta-flix style)
- .gitignore         secrets dirs (.env, gluetun, qbt config, ssh keys)
- .gitleaksignore    allowlist nullstone LAN addr + Tailscale CGNAT
- docs/architecture.md   the plan in detail (gluetun + qbt + arr + catalog)
- docs/migration.md  onyx-qbt -> nullstone-qbt runbook (3 phases)
- docs/trackers.md   tracker schema + IP-pinning + ratio notes (user-curated)
- compose/docker-compose.yml  gluetun v3.40 + qbt 5.0.5 (netns=gluetun) +
                              sonarr/radarr/prowlarr (hotio) + betaflix-catalog
- compose/.env.example       documented env-var template (no secrets)
- compose/traefik/arr.yml    file-provider for qbt/sonarr/radarr/prowlarr
                             .s8n.ru subdomains, LAN+TS only via
                             trusted-only@file + authentik-forwardauth@file
- catalog/catalog.py         Flask service, ~340 LoC, /sonarr + /radarr +
                             /healthz; pulls beta-flix, inserts alphabetic
                             row into MEDIA-LIST.md, writes run log, commits
                             + pushes as obsidian-ai. Idempotent via
                             payload-hash cache.
- catalog/Dockerfile         python:3.12-slim + git + tini
- catalog/requirements.txt   flask + jinja2 + requests + gitpython + pyyaml (pinned)
- catalog/templates/*.j2     run log + catalog row Jinja templates
- catalog/README.md          service docs
- scripts/migrate-onyx.sh    phase-2 helper (rsync + .torrent ship, dry-run by default)
- scripts/add-tracker.sh     Prowlarr API helper
- scripts/killswitch-test.sh gluetun kill-switch verification (3 steps)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
obsidian-ai 2026-05-20 01:15:43 +01:00
commit d300d83ce1
19 changed files with 1758 additions and 0 deletions

43
.gitignore vendored Normal file

@ -0,0 +1,43 @@
# --- Secrets ---
.env
.env.local
*.key
*.pem
*.crt
catalog/ssh/
compose/gluetun/
compose/qbittorrent/config/
compose/sonarr/
compose/radarr/
compose/prowlarr/
compose/catalog/ssh/
# --- Caches / runtime ---
__pycache__/
*.pyc
*.pyo
.pytest_cache/
.mypy_cache/
.ruff_cache/
*.sqlite
*.sqlite-journal
seen-imports.json
# --- Editor ---
.vscode/
.idea/
*.swp
*.swo
*~
.DS_Store
# --- Build artefacts ---
*.tar
*.tar.gz
dist/
build/
*.egg-info/
# --- Logs ---
*.log
logs/

27
.gitleaksignore Normal file

@ -0,0 +1,27 @@
# Allowlist false-positive LAN-IP / tailnet-IP hits in docs + compose.
# These are the documented nullstone LAN address, the LAN/CGNAT
# allowed-egress subnets baked into gluetun config, and Proton WG client
# addresses — all infrastructure facts, not credentials.
# The lan-ip-rfc1918 rule is low-confidence by design — see ~/.config/git/.gitleaks.toml
# CLAUDE.md — header references nullstone LAN IP.
CLAUDE.md:lan-ip-rfc1918:9
# docs/architecture.md — header + § "Current State" reference live nullstone host.
docs/architecture.md:lan-ip-rfc1918:3
docs/architecture.md:lan-ip-rfc1918:30
# docs/migration.md — ssh + rsync targets to nullstone.
docs/migration.md:lan-ip-rfc1918:22
docs/migration.md:lan-ip-rfc1918:32
docs/migration.md:lan-ip-rfc1918:81
# scripts/migrate-onyx.sh — default NULLSTONE_SSH and ssh target.
scripts/migrate-onyx.sh:lan-ip-rfc1918:27
scripts/migrate-onyx.sh:lan-ip-rfc1918:35
# compose/docker-compose.yml — FIREWALL_OUTBOUND_SUBNETS allows LAN +
# RFC1918 + the Tailscale CGNAT range for webui reachability from
# trusted networks. These are public, well-known subnet constants.
compose/docker-compose.yml:lan-ip-rfc1918:26
compose/docker-compose.yml:tailnet-ip:26

109
CLAUDE.md Normal file

@ -0,0 +1,109 @@
# CLAUDE.md — media-acquisition
Read this at session start. Rules for managing the nullstone media-acquisition
pipeline.
## What this repo is
The BitTorrent + arr-stack + catalog-update pipeline that feeds the ARRFLIX
library on **nullstone** (Debian 13, `192.168.0.100`). Consumed by:
- **Jellyfin** at `tv.s8n.ru` (container `jellyfin-stock`).
- **Catalog** at `git.s8n.ru/s8n/beta-flix` (`playbooks/import-media/MEDIA-LIST.md`).
## Source map
```
docs/architecture.md        Plan + reasoning. Read this BEFORE editing compose.
docs/migration.md           onyx-qbt → nullstone-qbt migration runbook.
docs/trackers.md            Tracker schema + IP-pinning risks (user-curated).
compose/docker-compose.yml  gluetun + qbt + sonarr + radarr + prowlarr + catalog.
compose/.env.example        Env template — secrets live in .env (gitignored).
compose/traefik/arr.yml     File-provider routing for arr stack.
catalog/                    betaflix-catalog Python service (Flask + webhooks).
scripts/                    migrate-onyx.sh, add-tracker.sh, killswitch-test.sh.
```
## Deploy lifecycle
1. **Edit** files locally under `/home/admin/projects/media-acquisition/`.
2. **Push to Forgejo** (this repo's authoritative remote is
`git.s8n.ru/s8n/media-acquisition.git`).
3. **On nullstone**: `cd /opt/docker/media-acquisition && git pull && docker compose up -d`.
4. **CRITICAL — verify kill-switch after every gluetun change**:
`bash scripts/killswitch-test.sh`. If the second curl succeeds, you have a leak;
tear down before re-trying.
## Rules paid for in blood (mirrored from beta-flix where applicable)
### Rule 1 — SSH user
`user@nullstone`. **NOT** `admin@nullstone`. AllowUsers was tightened
2026-05-03; uid 1000 only. Memory: `[[feedback_nullstone_ssh_user]]`.
### Rule 2 — Commit + push to **my git**
Authoritative remote is `git.s8n.ru/s8n/media-acquisition.git` (Forgejo).
No GitHub mirror. Always `git remote -v` before push. Memory:
`[[feedback_always_commit_to_my_git]]`, `[[feedback_check_remote_before_push]]`,
`[[feedback_my_git_is_forgejo]]`.
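The `git remote -v` habit could be automated with a small pre-push check. This is a minimal sketch, not part of this repo: `remote_is_forgejo` and `check_current_repo` are hypothetical helpers, and the match assumes an HTTPS remote URL ending in the repo path given above.

```python
import subprocess

EXPECTED_REMOTE = "git.s8n.ru/s8n/media-acquisition.git"


def remote_is_forgejo(remote_v_output: str, expected: str = EXPECTED_REMOTE) -> bool:
    """Return True if every push URL in `git remote -v` output ends with `expected`."""
    push_urls = [
        line.split()[1]
        for line in remote_v_output.splitlines()
        if line.strip().endswith("(push)") and len(line.split()) >= 3
    ]
    # A repo with no push remote at all counts as a failure, not a pass.
    return bool(push_urls) and all(url.endswith(expected) for url in push_urls)


def check_current_repo() -> bool:
    """Run `git remote -v` in the current directory and apply the check."""
    out = subprocess.run(
        ["git", "remote", "-v"], check=True, capture_output=True, text=True
    ).stdout
    return remote_is_forgejo(out)
```

An SSH-style remote (`git@host:path`) would need a looser comparison than `endswith`; the sketch deliberately fails closed in that case.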
### Rule 3 — Commit identity
- Human commits: `s8n <admin@s8n.ru>`.
- Bot/automation commits (e.g. catalog service, scripted edits): `obsidian-ai <obsidian-ai@s8n.ru>`.
Memory: `[[user_git_identity]]`.
### Rule 4 — Kill-switch is non-negotiable
Every change to `gluetun` service or VPN env vars MUST be followed by
`scripts/killswitch-test.sh`. A torrent client leaking outside the VPN is the
single failure mode that defines this project — do not "trust" the firewall
based on config inspection. Run the test.
### Rule 5 — No secrets in repo
`.env`, WireGuard keys, Forgejo PATs, deploy keys: all gitignored. Use
`.env.example` to document variable names with placeholders. If you commit a
secret by accident, rotate it (Proton WG: regenerate key, update gluetun;
Forgejo PAT: revoke at `git.s8n.ru/-/user/settings/applications`).
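The placeholder convention also makes it possible to catch a half-filled `.env` before a deploy. The sketch below is illustrative only: `unfilled_vars` is a hypothetical helper, and it assumes placeholders keep the `REPLACE_WITH_` prefix used in `compose/.env.example`.

```python
PLACEHOLDER_PREFIX = "REPLACE_WITH_"  # prefix used in compose/.env.example


def unfilled_vars(env_text: str) -> list[str]:
    """Return names of variables that are empty or still hold a placeholder."""
    missing = []
    for raw in env_text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        value = value.strip().strip("'\"")
        if not value or value.startswith(PLACEHOLDER_PREFIX):
            missing.append(name.strip())
    return missing
```

Calling `unfilled_vars(Path("compose/.env").read_text())` before `docker compose up -d` would list any variable that still needs a real value, without printing the values themselves.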
### Rule 6 — Tracker IP pinning
Private trackers may pin sessions to a single source IP. Switching from
onyx public IP → Proton exit IP will trip them. Before adding a new tracker
or migrating an old torrent, check `docs/trackers.md` for the per-tracker
policy. Update `docs/trackers.md` whenever a new tracker is on-boarded.
### Rule 7 — XFS reflinks / hardlinks
`/home/user/media` is XFS, single device. Sonarr/Radarr "Use Hardlinks
instead of Copy" = ON. Catalog service may use `cp --reflink=always` for
divergent-perm scenarios (free inodes, zero block cost). Never `cp` plain;
that doubles disk usage and breaks seeding atomicity.
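To confirm that an import really was hardlinked rather than copied, the inode numbers of the two paths can be compared. This is a minimal sketch using only the standard library; `hardlink_into_library` and `shares_inode` are hypothetical names, and the demo runs in a temporary directory standing in for `/home/user/media`. Reflinks are not covered here, since the standard library has no portable call for them.

```python
import os
import tempfile
from pathlib import Path


def hardlink_into_library(src: Path, dst: Path) -> None:
    """Create `dst` as a hardlink to `src` (both must be on one filesystem)."""
    dst.parent.mkdir(parents=True, exist_ok=True)
    os.link(src, dst)


def shares_inode(a: Path, b: Path) -> bool:
    """True when both paths resolve to the same inode on the same device."""
    sa, sb = a.stat(), b.stat()
    return (sa.st_dev, sa.st_ino) == (sb.st_dev, sb.st_ino)


if __name__ == "__main__":
    # Throwaway directory standing in for the media root.
    with tempfile.TemporaryDirectory() as root:
        src = Path(root) / "_downloads" / "complete" / "example.mkv"
        src.parent.mkdir(parents=True)
        src.write_bytes(b"payload")
        dst = Path(root) / "movies" / "Example (2020)" / "example.mkv"
        hardlink_into_library(src, dst)
        print(shares_inode(src, dst), src.stat().st_nlink)
```

A plain copy would give the two paths different inodes and leave the link count at 1, which is the doubling of disk usage this rule warns about.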
## Canonical naming
Catalog rows pushed to `beta-flix/playbooks/import-media/MEDIA-LIST.md` follow
the ARRFLIX house style:
- TV: `Series Title (Year)` — alphabetic by title, year tiebreaker.
- Movies: `Movie Title (Year)` — alphabetic by title.
- "Source / Version" column = raw Sonarr/Radarr `sourceTitle` (release name).
Human edits to "Why on arrflix" stay; bot never overwrites that column.
The catalog service is **append + merge only** — never overwrites human-authored
notes.
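The ordering rule (alphabetic by title, year as tiebreaker) can be expressed as a sort key. The sketch below is one possible reading of that rule, not the comparison `catalog.py` performs: `sort_key` is a hypothetical helper, it treats a missing year as 0, and it compares titles case-insensitively.

```python
import re

# Optional trailing "(YYYY)"; everything before it is the title.
_ROW_KEY = re.compile(r"^(?P<title>.*?)(?:\s+\((?P<year>\d{4})\))?$")


def sort_key(label: str) -> tuple[str, int]:
    """Split 'Title (Year)' into a case-insensitive (title, year) sort key."""
    m = _ROW_KEY.match(label.strip())
    title = m.group("title")
    year = int(m.group("year")) if m.group("year") else 0
    return (title.casefold(), year)


if __name__ == "__main__":
    rows = ["Dune (2021)", "alien (1979)", "Dune (1984)", "Blade Runner"]
    for row in sorted(rows, key=sort_key):
        print(row)
```

Sorting on the tuple keeps remakes with the same title adjacent and in release order, which a plain string comparison only does by accident.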
## How to start a session
1. Read this file.
2. Read `docs/architecture.md` if working on compose or catalog code.
3. Check `git status` and `git remote -v` (must show
`git.s8n.ru/s8n/media-acquisition.git`).
4. Owner says what they want; ship + verify kill-switch + commit to Forgejo.
5. End every turn: commit + push to `git.s8n.ru/s8n/media-acquisition.git`.
## Glossary
| Term | Means |
|-----------------------|----------------------------------------------------------------------------------|
| **ship** / **deploy** | git push to Forgejo → on nullstone, `git pull && docker compose up -d`. Kill-switch test on any gluetun change. |
| **migrate** | Phase-2 onyx→nullstone runbook in `docs/migration.md`. Read `scripts/migrate-onyx.sh` first; dry-run mode mandatory. |
| **add tracker** | `scripts/add-tracker.sh <name> <url>`; then update `docs/trackers.md` with IP-pinning policy + ratio requirements. |
| **killswitch test** | `bash scripts/killswitch-test.sh`. NEVER claim "VPN works" without running this. |
| **owner** | P (xynki.dev@gmail.com). Final say. Executive-override pattern from `[[feedback_s8n_executive_override]]` applies. |

107
README.md Normal file

@ -0,0 +1,107 @@
# media-acquisition
Self-hosted BitTorrent + arr-stack + canonical-import pipeline that lands torrents
directly on **nullstone**, through a Proton WireGuard VPN with verified kill-switch,
hardlinks files into the existing ARRFLIX library, and auto-commits catalog rows
to `git.s8n.ru/s8n/beta-flix`.
Replaces the legacy `onyx → rsync → nullstone` round-trip.
## Architecture
```
                                    +-----------------+
                                    |   Proton VPN    |
                                    |   (WireGuard)   |
                                    +--------+--------+
                                             | wg0
                                             v
+------------+   indexer queries   +-------------------+   torrent traffic
|  Prowlarr  |-------------------->|      gluetun      |<---------------------+
+-----+------+     (via netns)     |  kill-switch fw   |                      |
      |                            +-------------------+                      |
      | search                         ^     ^     ^                          |
      v                                |     |     |                          |
+------------+   grabs    +------------+-----+-----+--------------------------+
|  Sonarr/   |----------->|  qBittorrent (netns=gluetun)                      |
|  Radarr    |            | /home/user/media/_downloads/{incomplete,complete} |
+-----+------+            +---------------------------------------------------+
      |
      | OnImport webhook (POST /sonarr or /radarr)
      v
+--------------------+
|  betaflix-catalog  |--+  XFS reflink/hardlink into /home/user/media/{movies,tv}
|  (Flask, Python)   |  |
+--------+-----------+  +--> Jellyfin (tv.s8n.ru) picks up new items
         |
         | git commit + push (obsidian-ai)
         v
+-----------------------------+
|  git.s8n.ru/s8n/beta-flix   |
|  playbooks/import-media/    |
|    MEDIA-LIST.md (updated)  |
|    runs/<slug>.md (new)     |
+-----------------------------+
```
Single XFS filesystem at `/home/user/media` → hardlinks / reflinks are free.
## Quickstart
```bash
# Clone on nullstone
ssh user@nullstone
git clone https://git.s8n.ru/s8n/media-acquisition.git /opt/docker/media-acquisition
cd /opt/docker/media-acquisition/compose
# Configure
cp .env.example .env
${EDITOR:-vi} .env # fill in PVPN_WG_PRIVKEY, PVPN_WG_ADDRESSES, FORGEJO_PUSH_TOKEN, etc.
# Bring up
docker compose up -d
# Verify VPN kill-switch (CRITICAL — do not skip)
bash ../scripts/killswitch-test.sh
# Sanity: pick a sacrificial legal torrent in qbt UI, confirm it lands in
# /home/user/media/_downloads/complete/ and arr stack hardlinks it.
```
## Layout
```
README.md                 This file.
CLAUDE.md                 Project rules for Claude Code.
docs/
  architecture.md         The plan in detail. Decision log + reasoning.
  migration.md            onyx-qbt → nullstone-qbt migration runbook.
  trackers.md             Tracker schema + IP-pinning notes (user fills in).
compose/
  docker-compose.yml      Full stack: gluetun + qbt + sonarr + radarr + prowlarr + catalog.
  .env.example            All env vars documented.
  traefik/arr.yml         Traefik file-provider for *.s8n.ru subdomains (LAN+TS only).
catalog/
  catalog.py              Flask webhook receiver → beta-flix catalog updater.
  Dockerfile              python:3.12-slim base.
  requirements.txt        Pinned versions.
  templates/              Jinja2 templates for run logs and catalog rows.
  README.md               Catalog service docs.
scripts/
  migrate-onyx.sh         Phase-2 migration: rsync + .torrent mass-add.
  add-tracker.sh          Helper: add tracker to Prowlarr via API.
  killswitch-test.sh      Verify gluetun blocks traffic when VPN drops.
```
## Related
- Plan: `docs/architecture.md`
- Catalog target: `git.s8n.ru/s8n/beta-flix` (`playbooks/import-media/MEDIA-LIST.md`)
- Jellyfin (consumer): `tv.s8n.ru` (`jellyfin-stock` container on nullstone)
- Host docs: `ai-lab/SYSTEM.md`
## Status
Scaffold. Live deploy pending VPN slot allocation + tracker IP-pinning review.
Next step: fill in `compose/.env` and bring up gluetun + qbt only (no arr yet)
to validate kill-switch.

28
catalog/Dockerfile Normal file

@ -0,0 +1,28 @@
FROM python:3.12-slim
ENV PYTHONUNBUFFERED=1 \
PIP_NO_CACHE_DIR=1 \
PIP_DISABLE_PIP_VERSION_CHECK=1
# git is required for clone/pull/push; openssh-client for ssh remotes (future).
RUN apt-get update \
&& apt-get install -y --no-install-recommends git openssh-client ca-certificates tini \
&& rm -rf /var/lib/apt/lists/*
WORKDIR /app
COPY requirements.txt /app/requirements.txt
RUN pip install -r requirements.txt
COPY catalog.py /app/catalog.py
COPY templates /app/templates
# Set the git identity globally so the bot identity persists even if env vars get dropped.
RUN git config --global user.name "obsidian-ai" \
&& git config --global user.email "obsidian-ai@s8n.ru" \
&& git config --global pull.rebase true \
&& git config --global init.defaultBranch main
EXPOSE 5055
ENTRYPOINT ["/usr/bin/tini", "--"]
CMD ["python", "/app/catalog.py"]

73
catalog/README.md Normal file

@ -0,0 +1,73 @@
# betaflix-catalog
Flask service that receives Sonarr/Radarr **OnImport** webhooks and commits
catalog updates to `git.s8n.ru/s8n/beta-flix`.
## What it does
For each `Import` event:
1. Pulls latest `main` of `beta-flix` (rebase).
2. Inserts a row into `playbooks/import-media/MEDIA-LIST.md`, alphabetic by
title. Dedupes — if the key (`Title (Year)`) already exists, it skips.
3. Writes a per-import run log to
`playbooks/import-media/runs/<slug>.md` using the Jinja template at
`templates/run.md.j2`.
4. Commits as `obsidian-ai <obsidian-ai@s8n.ru>`.
5. Pushes to Forgejo using `FORGEJO_PUSH_TOKEN` embedded in the URL.
Idempotency: payload-hash cache at `/state/seen-imports.json`. Sonarr/Radarr
retry transient failures; duplicates are no-ops.
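The effect of the cache can be shown in isolation. The sketch below is a simplified stand-in for what `catalog.py` does, under stated assumptions: `event_key` and `record_once` are hypothetical names, and the key is built only from a series id and season/episode pairs.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def event_key(series_id: int, episodes: list[tuple[int, int]]) -> str:
    """Order-insensitive key for one import: series id plus SxE pairs."""
    parts = sorted(f"{s}x{e}" for s, e in episodes)
    seed = f"sonarr:{series_id}:{','.join(parts)}"
    return hashlib.sha256(seed.encode()).hexdigest()[:16]


def record_once(cache: Path, key: str) -> bool:
    """Return True the first time `key` is seen, False on any repeat."""
    seen = set(json.loads(cache.read_text())) if cache.exists() else set()
    if key in seen:
        return False
    seen.add(key)
    cache.write_text(json.dumps(sorted(seen)))
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cache = Path(tmp) / "seen-imports.json"
        first = record_once(cache, event_key(42, [(1, 2), (1, 1)]))
        # A retry that lists the same episodes in a different order.
        retry = record_once(cache, event_key(42, [(1, 1), (1, 2)]))
        print(first, retry)
```

Sorting the episode pairs before hashing is what makes a retried delivery map to the same key even if the payload lists episodes in a different order.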
## Endpoints
- `POST /sonarr` — Sonarr Connect webhook target. Set Sonarr → Settings →
Connect → Webhook → URL `http://host.docker.internal:5055/sonarr`, method
POST, triggers: **On Import** only.
- `POST /radarr` — same shape for Radarr at `/radarr`.
- `GET /healthz` — liveness probe.
## Build
```bash
cd catalog/
docker build -t betaflix-catalog:local .
```
Or use compose: the parent `compose/docker-compose.yml` defines the
`betaflix-catalog` service with `build:` set to this directory.
## Env vars
| Variable | Required | Default | What |
|----------------------|----------|--------------------------------------------|---------------------------------------|
| `FORGEJO_REMOTE` | yes | `https://git.s8n.ru/s8n/beta-flix.git` | Push target. |
| `FORGEJO_PUSH_TOKEN` | yes | (empty) | Forgejo PAT — scopes: repository RW. |
| `GIT_AUTHOR_NAME` | no | `obsidian-ai` | Commit author. |
| `GIT_AUTHOR_EMAIL` | no | `obsidian-ai@s8n.ru` | Commit author email. |
| `REPO_PATH` | no | `/repo` | Where beta-flix gets cloned. |
| `STATE_DIR` | no | `/state` | seen-imports.json lives here. |
| `LISTEN_PORT` | no | `5055` | Flask bind port. |
## Volumes
- `/repo` — beta-flix checkout. Bind-mounted persistent volume.
- `/state` holds the `seen-imports.json` cache.
- `/root/.ssh` (optional, read-only) — for SSH deploy key (currently uses
HTTPS+PAT; SSH path reserved for future).
## Development
Run locally without Docker:
```bash
cd catalog/
python -m venv .venv && . .venv/bin/activate
pip install -r requirements.txt
REPO_PATH=/tmp/beta-flix-test STATE_DIR=/tmp/catalog-state \
FORGEJO_PUSH_TOKEN=xxx python catalog.py
# In another shell:
curl -X POST http://localhost:5055/sonarr \
-H 'Content-Type: application/json' \
-d '{"eventType":"Test"}'
```
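The `Test` event above only exercises the acknowledgement path. To drive the import path locally, a fuller payload is needed. The sketch below builds one from the fields `catalog.py` reads (`eventType`, `series`, `episodes`, `episodeFile`). The helper names, the series id, and the episode title are made up for illustration, and real Sonarr payloads carry many more fields.

```python
import json
import urllib.request


def sonarr_import_payload(title: str, year: int, season: int, episode: int) -> dict:
    """Build a minimal Sonarr-style import event with the fields catalog.py reads."""
    return {
        "eventType": "Download",
        "series": {"id": 1, "title": title, "year": year},
        "episodes": [
            {"seasonNumber": season, "episodeNumber": episode, "title": "Pilot"}
        ],
        "episodeFile": {
            "relativePath": f"Season {season:02d}/{title} - S{season:02d}E{episode:02d}.mkv"
        },
    }


def post_event(url: str, payload: dict) -> int:
    """POST the payload as JSON and return the HTTP status code."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.status
```

A call such as `post_event("http://localhost:5055/sonarr", sonarr_import_payload("Example Show", 2020, 1, 2))` runs the full pull, commit, and push sequence, so it is best pointed at a throwaway remote rather than the real `beta-flix` repo.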

366
catalog/catalog.py Normal file

@ -0,0 +1,366 @@
"""betaflix-catalog — Sonarr/Radarr OnImport webhook receiver.
Listens for OnImport events from Sonarr and Radarr, edits
`playbooks/import-media/MEDIA-LIST.md` in the beta-flix Forgejo repo, writes
a per-import run log, and commits + pushes as `obsidian-ai`.
Endpoints:
    POST /sonarr    Sonarr Connect webhook target.
    POST /radarr    Radarr Connect webhook target.
    GET  /healthz   Liveness probe.
Idempotency: payload-hash cache at /state/seen-imports.json. Duplicates skipped.
Environment:
    FORGEJO_REMOTE      e.g. https://git.s8n.ru/s8n/beta-flix.git
    FORGEJO_PUSH_TOKEN  PAT embedded into the push URL.
    GIT_AUTHOR_NAME     obsidian-ai
    GIT_AUTHOR_EMAIL    obsidian-ai@s8n.ru
    LISTEN_PORT         5055
"""
from __future__ import annotations
import hashlib
import json
import logging
import os
import re
import subprocess
import sys
import threading
from datetime import datetime, timezone
from pathlib import Path
from typing import Any
from flask import Flask, jsonify, request
from jinja2 import Environment, FileSystemLoader, select_autoescape
# --- Config -----------------------------------------------------------------
REPO_PATH = Path(os.environ.get("REPO_PATH", "/repo"))
STATE_DIR = Path(os.environ.get("STATE_DIR", "/state"))
TEMPLATES_DIR = Path(__file__).parent / "templates"
FORGEJO_REMOTE = os.environ.get("FORGEJO_REMOTE", "https://git.s8n.ru/s8n/beta-flix.git")
FORGEJO_TOKEN = os.environ.get("FORGEJO_PUSH_TOKEN", "")
GIT_AUTHOR_NAME = os.environ.get("GIT_AUTHOR_NAME", "obsidian-ai")
GIT_AUTHOR_EMAIL = os.environ.get("GIT_AUTHOR_EMAIL", "obsidian-ai@s8n.ru")
LISTEN_PORT = int(os.environ.get("LISTEN_PORT", "5055"))
MEDIA_LIST = REPO_PATH / "playbooks" / "import-media" / "MEDIA-LIST.md"
RUNS_DIR = REPO_PATH / "playbooks" / "import-media" / "runs"
SEEN_PATH = STATE_DIR / "seen-imports.json"
# Section headers in MEDIA-LIST.md the bot will edit.
MOVIES_SECTION = "## Movies"
TV_SECTION = "## TV"
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s [%(levelname)s] %(message)s",
    stream=sys.stdout,
)
log = logging.getLogger("catalog")
app = Flask(__name__)
_lock = threading.Lock()
_jinja = Environment(
    loader=FileSystemLoader(str(TEMPLATES_DIR)),
    autoescape=select_autoescape(["html", "xml"]),
    trim_blocks=True,
    lstrip_blocks=True,
)
# --- Idempotency ------------------------------------------------------------
def _load_seen() -> set[str]:
    if not SEEN_PATH.exists():
        return set()
    try:
        return set(json.loads(SEEN_PATH.read_text()))
    except Exception:
        log.warning("seen-imports.json corrupt; resetting")
        return set()


def _save_seen(seen: set[str]) -> None:
    STATE_DIR.mkdir(parents=True, exist_ok=True)
    SEEN_PATH.write_text(json.dumps(sorted(seen)))


def _payload_hash(kind: str, payload: dict[str, Any]) -> str:
    """Stable hash for the import event: series id + SxE list, or movie id."""
    if kind == "sonarr":
        sid = payload.get("series", {}).get("id", "?")
        eps = payload.get("episodes", []) or [{}]
        # Sort so the hash does not depend on episode order in the payload.
        keys = sorted(f"{e.get('seasonNumber', '?')}x{e.get('episodeNumber', '?')}" for e in eps)
        seed = f"sonarr:{sid}:{','.join(keys)}"
    elif kind == "radarr":
        mid = payload.get("movie", {}).get("id", "?")
        seed = f"radarr:{mid}"
    else:
        seed = f"unknown:{json.dumps(payload, sort_keys=True)}"
    return hashlib.sha256(seed.encode()).hexdigest()[:16]
# --- Git helpers ------------------------------------------------------------
def _git(*args: str, cwd: Path = REPO_PATH) -> subprocess.CompletedProcess:
    env = os.environ.copy()
    env.setdefault("GIT_AUTHOR_NAME", GIT_AUTHOR_NAME)
    env.setdefault("GIT_AUTHOR_EMAIL", GIT_AUTHOR_EMAIL)
    env.setdefault("GIT_COMMITTER_NAME", GIT_AUTHOR_NAME)
    env.setdefault("GIT_COMMITTER_EMAIL", GIT_AUTHOR_EMAIL)
    return subprocess.run(
        ["git", *args],
        cwd=cwd,
        env=env,
        check=True,
        capture_output=True,
        text=True,
    )


def _ensure_repo() -> None:
    """Clone the repo if /repo is empty."""
    if (REPO_PATH / ".git").is_dir():
        return
    REPO_PATH.mkdir(parents=True, exist_ok=True)
    clone_url = _push_url()
    subprocess.run(
        ["git", "clone", clone_url, str(REPO_PATH)],
        check=True,
        capture_output=True,
        text=True,
    )


def _push_url() -> str:
    if FORGEJO_TOKEN and FORGEJO_REMOTE.startswith("https://"):
        return FORGEJO_REMOTE.replace("https://", f"https://{FORGEJO_TOKEN}@", 1)
    return FORGEJO_REMOTE


def _pull_rebase() -> None:
    _git("fetch", "origin")
    _git("rebase", "origin/main")


def _commit_and_push(title: str) -> str:
    _git("add", "playbooks/import-media/MEDIA-LIST.md", "playbooks/import-media/runs/")
    status = _git("status", "--porcelain")
    if not status.stdout.strip():
        log.info("no changes to commit (%s)", title)
        return ""
    msg = f"catalog: add {title}"
    _git("commit", "-m", msg, f"--author={GIT_AUTHOR_NAME} <{GIT_AUTHOR_EMAIL}>")
    _git("push", _push_url(), "HEAD:main")
    sha = _git("rev-parse", "HEAD").stdout.strip()
    return sha
# --- MEDIA-LIST.md editing --------------------------------------------------
def _normalise_title(title: str) -> str:
    return re.sub(r"\s+", " ", title.strip())


def _slugify(s: str) -> str:
    s = re.sub(r"[^a-zA-Z0-9]+", "-", s.lower()).strip("-")
    return s[:80] or "untitled"


def _row(kind: str, title: str, year: int | None, source: str) -> str:
    # Build the label first so a missing year leaves no stray double space.
    label = f"{title} ({year})" if year else title
    kind_label = "TV" if kind == "tv" else "Movie"
    return f"| {label} | {kind_label} | {source} | _todo_ |"


def _first_cell(line: str) -> str:
    """Return the first cell of a Markdown table row, stripped."""
    parts = line.split("|")
    return parts[1].strip() if len(parts) > 1 else ""


def _insert_alphabetic(section_header: str, row: str, key: str) -> bool:
    """Insert `row` into the section under `section_header`, alphabetic by key.

    Returns True if a new row was added, False if the key already existed
    (caller handles merge/dedup separately).
    """
    if not MEDIA_LIST.exists():
        log.warning("MEDIA-LIST.md missing at %s — skipping insert", MEDIA_LIST)
        return False
    lines = MEDIA_LIST.read_text().splitlines()
    try:
        start = next(i for i, line in enumerate(lines) if line.strip() == section_header)
    except StopIteration:
        log.warning("section %r not found in MEDIA-LIST.md", section_header)
        return False
    # Find table boundaries.
    i = start + 1
    while i < len(lines) and not lines[i].lstrip().startswith("|"):
        i += 1
    if i >= len(lines):
        log.warning("no table found under section %r", section_header)
        return False
    header_idx = i
    # Skip header + separator rows.
    i = header_idx + 2
    section_rows_start = i
    while i < len(lines) and lines[i].lstrip().startswith("|"):
        # Compare the title cell only: a substring match on the whole row
        # would treat "It (2017)" as present when "Split (2017)" is listed.
        if _first_cell(lines[i]).lower() == key.lower():
            log.info("row already present for key=%r — skipping", key)
            return False
        i += 1
    section_rows_end = i
    # Alphabetic insert by first column.
    insert_at = section_rows_end
    for j in range(section_rows_start, section_rows_end):
        if key.lower() < _first_cell(lines[j]).lower():
            insert_at = j
            break
    lines.insert(insert_at, row)
    MEDIA_LIST.write_text("\n".join(lines) + "\n")
    return True


def _write_run_log(slug: str, ctx: dict[str, Any]) -> Path:
    RUNS_DIR.mkdir(parents=True, exist_ok=True)
    template = _jinja.get_template("run.md.j2")
    out = RUNS_DIR / f"{slug}.md"
    out.write_text(template.render(**ctx))
    return out
# --- Webhook handlers -------------------------------------------------------
def _handle_sonarr(payload: dict[str, Any]) -> tuple[str, bool]:
    series = payload.get("series", {}) or {}
    title = _normalise_title(series.get("title", "Unknown Series"))
    year = series.get("year") or None
    eps = payload.get("episodes", []) or []
    files = payload.get("episodeFiles") or payload.get("episodeFile") or []
    if isinstance(files, dict):
        files = [files]
    source = (files[0].get("sceneName") or files[0].get("relativePath") or "?") if files else "?"
    key = f"{title} ({year})" if year else title
    with _lock:
        _ensure_repo()
        _pull_rebase()
        added = _insert_alphabetic(TV_SECTION, _row("tv", title, year, source), key)
        if eps:
            first = eps[0]
            slug = _slugify(f"{key}-S{first.get('seasonNumber', '?')}E{first.get('episodeNumber', '?')}")
        else:
            slug = _slugify(key)
        _write_run_log(slug, {
            "kind": "tv",
            "title": title,
            "year": year,
            "source": source,
            "episodes": eps,
            "ts": datetime.now(timezone.utc).isoformat(timespec="seconds"),
            "row_added": added,
        })
        sha = _commit_and_push(key)
    return sha, added


def _handle_radarr(payload: dict[str, Any]) -> tuple[str, bool]:
    movie = payload.get("movie", {}) or {}
    title = _normalise_title(movie.get("title", "Unknown Movie"))
    year = movie.get("year") or None
    mfile = payload.get("movieFile") or {}
    source = mfile.get("sceneName") or mfile.get("relativePath") or "?"
    key = f"{title} ({year})" if year else title
    with _lock:
        _ensure_repo()
        _pull_rebase()
        added = _insert_alphabetic(MOVIES_SECTION, _row("movie", title, year, source), key)
        slug = _slugify(key)
        _write_run_log(slug, {
            "kind": "movie",
            "title": title,
            "year": year,
            "source": source,
            "ts": datetime.now(timezone.utc).isoformat(timespec="seconds"),
            "row_added": added,
        })
        sha = _commit_and_push(key)
    return sha, added
# --- Flask routes -----------------------------------------------------------
@app.get("/healthz")
def healthz():
    return jsonify(ok=True), 200


@app.post("/sonarr")
def sonarr():
    payload = request.get_json(silent=True) or {}
    event = payload.get("eventType", "")
    if event in ("Test", "ApplicationUpdate"):
        log.info("sonarr probe event=%s — ack", event)
        return jsonify(ok=True, ignored=event), 200
    if event not in ("Import", "Download"):
        log.info("sonarr event=%s — ignored", event)
        return jsonify(ok=True, ignored=event), 200
    h = _payload_hash("sonarr", payload)
    seen = _load_seen()
    if h in seen:
        log.info("sonarr duplicate event hash=%s — skipping", h)
        return jsonify(ok=True, duplicate=h), 200
    try:
        sha, added = _handle_sonarr(payload)
    except subprocess.CalledProcessError as e:
        log.exception("git failed: %s", e.stderr)
        return jsonify(ok=False, error=e.stderr), 500
    except Exception as e:  # noqa: BLE001
        log.exception("sonarr handler crashed")
        return jsonify(ok=False, error=str(e)), 500
    seen.add(h)
    _save_seen(seen)
    return jsonify(ok=True, sha=sha, row_added=added), 200


@app.post("/radarr")
def radarr():
    payload = request.get_json(silent=True) or {}
    event = payload.get("eventType", "")
    if event in ("Test", "ApplicationUpdate"):
        log.info("radarr probe event=%s — ack", event)
        return jsonify(ok=True, ignored=event), 200
    if event not in ("Import", "Download"):
        log.info("radarr event=%s — ignored", event)
        return jsonify(ok=True, ignored=event), 200
    h = _payload_hash("radarr", payload)
    seen = _load_seen()
    if h in seen:
        log.info("radarr duplicate event hash=%s — skipping", h)
        return jsonify(ok=True, duplicate=h), 200
    try:
        sha, added = _handle_radarr(payload)
    except subprocess.CalledProcessError as e:
        log.exception("git failed: %s", e.stderr)
        return jsonify(ok=False, error=e.stderr), 500
    except Exception as e:  # noqa: BLE001
        log.exception("radarr handler crashed")
        return jsonify(ok=False, error=str(e)), 500
    seen.add(h)
    _save_seen(seen)
    return jsonify(ok=True, sha=sha, row_added=added), 200


if __name__ == "__main__":
    log.info("betaflix-catalog listening on 0.0.0.0:%d", LISTEN_PORT)
    app.run(host="0.0.0.0", port=LISTEN_PORT)

5
catalog/requirements.txt Normal file

@ -0,0 +1,5 @@
flask==3.0.3
requests==2.32.3
jinja2==3.1.4
gitpython==3.1.43
pyyaml==6.0.2

View file

@ -0,0 +1,4 @@
{# Reference template for MEDIA-LIST.md row generation.
Pipe-delimited; matches the existing beta-flix table schema.
Columns: Title | Kind | Source / Version | Why on arrflix | (trailing pipe) #}
| {{ title }}{% if year %} ({{ year }}){% endif %} | {{ kind }} | {{ source }} | {{ why | default("_todo_") }} |

View file

@ -0,0 +1,24 @@
# {{ title }}{% if year %} ({{ year }}){% endif %} — import run
- **Date:** {{ ts }}
- **Kind:** {{ kind }}
- **Source / release name:** `{{ source }}`
- **Catalog row added:** {{ "yes" if row_added else "no (already present)" }}
{% if kind == "tv" and episodes %}
## Episodes imported
| Season | Episode | Title |
|--------|---------|-------|
{% for e in episodes %}
| {{ e.seasonNumber }} | {{ e.episodeNumber }} | {{ e.title | default("?") }} |
{% endfor %}
{% endif %}
## Notes
_(human-authored)_
---
_Auto-generated by `betaflix-catalog` on Sonarr/Radarr OnImport webhook._

38
compose/.env.example Normal file

@ -0,0 +1,38 @@
# compose/.env.example
#
# Copy to .env (gitignored) and fill in real values.
#
# Never commit .env. Forgejo PAT + Proton WG key + arr API keys = secrets.
# --- Timezone (logs + scheduling) ---
TZ=Europe/London
# --- Proton VPN (gluetun) ---
# Generate a dedicated WireGuard key in the Proton dashboard:
# Account → WireGuard → New Configuration → name it "nullstone-gluetun-arr"
# Do NOT reuse the host's wg-pvpn-A/B keys.
PVPN_WG_PRIVKEY=REPLACE_WITH_PROTON_WG_PRIVATE_KEY
# The address Proton assigns to the new key (e.g. 10.2.0.3/32).
PVPN_WG_ADDRESSES=10.2.0.3/32
# Country (P2P-permitted). Comma-separated to let gluetun pick from a pool.
PVPN_SERVER_COUNTRIES=Netherlands
# --- Catalog service: Forgejo push ---
# https://git.s8n.ru → Settings → Applications → Generate New Token
# Scopes required: repository (read+write), user (read).
# Token is embedded in the remote URL inside the catalog container.
FORGEJO_PUSH_TOKEN=REPLACE_WITH_FORGEJO_PAT
# Remote URL — leave default unless beta-flix is moved.
FORGEJO_REMOTE=https://git.s8n.ru/s8n/beta-flix.git
# --- arr API keys ---
# Fetch from each app's Settings → General → Security after first launch.
# Used by catalog service to enrich the webhook payload via API calls.
SONARR_API_KEY=REPLACE_WITH_SONARR_API_KEY
RADARR_API_KEY=REPLACE_WITH_RADARR_API_KEY
PROWLARR_API_KEY=REPLACE_WITH_PROWLARR_API_KEY
# --- Optional: forwarded-port sync helper ---
# If you add caillef/qbittorrent-port-sync later for ratio-critical seeding,
# the qbt webui password goes here (used by that helper, not qbt itself).
QBT_WEBUI_PASSWORD=REPLACE_WITH_QBT_WEBUI_PASSWORD

141
compose/docker-compose.yml Normal file

@ -0,0 +1,141 @@
# nullstone media-acquisition stack
#
# Compose file for: gluetun (VPN + kill-switch) + qBittorrent + Sonarr +
# Radarr + Prowlarr + betaflix-catalog (Forgejo committer).
#
# Place this directory under /opt/docker/media-acquisition/ on nullstone.
# Run: docker compose up -d
# Verify kill-switch: bash ../scripts/killswitch-test.sh
services:
  gluetun:
    image: qmcgaw/gluetun:v3.40
    container_name: gluetun
    cap_add:
      - NET_ADMIN
    devices:
      - /dev/net/tun:/dev/net/tun
    environment:
      - VPN_SERVICE_PROVIDER=protonvpn
      - VPN_TYPE=wireguard
      - WIREGUARD_PRIVATE_KEY=${PVPN_WG_PRIVKEY}
      - WIREGUARD_ADDRESSES=${PVPN_WG_ADDRESSES}
      - SERVER_COUNTRIES=${PVPN_SERVER_COUNTRIES:-Netherlands}
      - VPN_PORT_FORWARDING=on
      - VPN_PORT_FORWARDING_PROVIDER=protonvpn
      - FIREWALL_OUTBOUND_SUBNETS=192.168.0.0/24,172.16.0.0/12,100.64.0.0/10
      - DOT=off
      - TZ=${TZ:-Europe/London}
    ports:
      # All published on 127.0.0.1 — Traefik file-provider picks them up.
      - "127.0.0.1:8080:8080" # qBittorrent WebUI
      - "127.0.0.1:9696:9696" # Prowlarr
      - "127.0.0.1:8989:8989" # Sonarr
      - "127.0.0.1:7878:7878" # Radarr
    volumes:
      - ./gluetun:/gluetun
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "wget", "-qO-", "--tries=1", "--timeout=10", "https://ipinfo.io"]
      interval: 30s
      timeout: 15s
      retries: 3
      start_period: 30s
  qbittorrent:
    image: qbittorrentofficial/qbittorrent-nox:5.0.5
    container_name: qbittorrent
    depends_on:
      gluetun:
        condition: service_healthy
    network_mode: "service:gluetun"
    user: "1000:1000"
    environment:
      - QBT_LEGAL_NOTICE=confirm
      - QBT_WEBUI_PORT=8080
      - UMASK=022
      - TZ=${TZ:-Europe/London}
    volumes:
      - ./qbittorrent/config:/config
      - /home/user/media/_downloads:/downloads
      - /home/user/media:/media
    restart: unless-stopped
  prowlarr:
    image: ghcr.io/hotio/prowlarr:release
    container_name: prowlarr
    depends_on:
      gluetun:
        condition: service_healthy
    network_mode: "service:gluetun"
    environment:
      - PUID=1000
      - PGID=1000
      - UMASK=022
      - TZ=${TZ:-Europe/London}
    volumes:
      - ./prowlarr:/config
    restart: unless-stopped
  sonarr:
    image: ghcr.io/hotio/sonarr:release
    container_name: sonarr
    depends_on:
      gluetun:
        condition: service_healthy
    network_mode: "service:gluetun"
    environment:
      - PUID=1000
      - PGID=1000
      - UMASK=022
      - TZ=${TZ:-Europe/London}
    volumes:
      - ./sonarr:/config
      - /home/user/media:/media
    restart: unless-stopped
  radarr:
    image: ghcr.io/hotio/radarr:release
    container_name: radarr
    depends_on:
      gluetun:
        condition: service_healthy
    network_mode: "service:gluetun"
    environment:
      - PUID=1000
      - PGID=1000
      - UMASK=022
      - TZ=${TZ:-Europe/London}
    volumes:
      - ./radarr:/config
      - /home/user/media:/media
    restart: unless-stopped
  betaflix-catalog:
    image: betaflix-catalog:local
    container_name: betaflix-catalog
    build:
      context: ../catalog
      dockerfile: Dockerfile
    # NOT bound to gluetun — needs to reach Forgejo + Sonarr/Radarr
    network_mode: bridge
    extra_hosts:
      - "host.docker.internal:host-gateway"
    environment:
      - FORGEJO_REMOTE=${FORGEJO_REMOTE:-https://git.s8n.ru/s8n/beta-flix.git}
      - FORGEJO_PUSH_TOKEN=${FORGEJO_PUSH_TOKEN}
      - GIT_AUTHOR_NAME=obsidian-ai
      - GIT_AUTHOR_EMAIL=obsidian-ai@s8n.ru
      - GIT_COMMITTER_NAME=obsidian-ai
      - GIT_COMMITTER_EMAIL=obsidian-ai@s8n.ru
      - SONARR_API_KEY=${SONARR_API_KEY}
      - RADARR_API_KEY=${RADARR_API_KEY}
      - TZ=${TZ:-Europe/London}
      - LISTEN_PORT=5055
    ports:
      - "127.0.0.1:5055:5055"
    volumes:
      - ./catalog/repo:/repo
      - ./catalog/ssh:/root/.ssh:ro
      - ./catalog/state:/state
    restart: unless-stopped

77
compose/traefik/arr.yml Normal file

@ -0,0 +1,77 @@
# Traefik file-provider snippet for the media-acquisition stack.
#
# Symlink (or cp) this file into /opt/docker/traefik/config/arr.yml on
# nullstone. Traefik picks up file-provider configs without restart.
#
# All routes are LAN+Tailscale-only (trusted-only@file middleware) AND
# require Authentik forward-auth. Add the arr-stack Authentik group as
# needed.
#
# Backends are 127.0.0.1:<port> because gluetun publishes the qbt/prowlarr/
# sonarr/radarr ports on host loopback (network_mode: service:gluetun).
http:
  routers:
    qbt:
      rule: "Host(`qbt.s8n.ru`)"
      entryPoints: [websecure]
      service: qbt
      tls:
        certResolver: gandi
      middlewares:
        - trusted-only@file
        - authentik-forwardauth@file
    prowlarr:
      rule: "Host(`prowlarr.s8n.ru`)"
      entryPoints: [websecure]
      service: prowlarr
      tls:
        certResolver: gandi
      middlewares:
        - trusted-only@file
        - authentik-forwardauth@file
    sonarr:
      rule: "Host(`sonarr.s8n.ru`)"
      entryPoints: [websecure]
      service: sonarr
      tls:
        certResolver: gandi
      middlewares:
        - trusted-only@file
        - authentik-forwardauth@file
    radarr:
      rule: "Host(`radarr.s8n.ru`)"
      entryPoints: [websecure]
      service: radarr
      tls:
        certResolver: gandi
      middlewares:
        - trusted-only@file
        - authentik-forwardauth@file
  # Catalog service has no public route: Sonarr/Radarr hit it via
  # host.docker.internal:5055 from inside their gluetun netns.
  services:
    qbt:
      loadBalancer:
        servers:
          - url: "http://127.0.0.1:8080"
    prowlarr:
      loadBalancer:
        servers:
          - url: "http://127.0.0.1:9696"
    sonarr:
      loadBalancer:
        servers:
          - url: "http://127.0.0.1:8989"
    radarr:
      loadBalancer:
        servers:
          - url: "http://127.0.0.1:7878"

242
docs/architecture.md Normal file

@ -0,0 +1,242 @@
# Architecture — nullstone BitTorrent + Import Pipeline
Last reviewed: 2026-05-20 against live state of `user@192.168.0.100`.
**Goal:** kill the `download-on-onyx → rsync → import` round-trip. Land torrents
directly on nullstone, through VPN, hardlink into the canonical ARRFLIX library,
auto-update the catalog in `git.s8n.ru/s8n/beta-flix`.
---
## TL;DR Decisions
| Question | Decision |
|-------------------------|---------------------------------------------------------------------------------------------------------------------------------------|
| Client | `qbittorrentofficial/qbittorrent-nox:5.0.5` (single container, official build, slim) |
| VPN binding | **gluetun sidecar + `network_mode: service:gluetun`** (WireGuard, kill-switch built-in). Reuses Proton WG. Replaces `socks-pvpn` for BT only. |
| Folder layout | `/home/user/media/_downloads/{incomplete,complete}` (NOT scanned by JF) + hardlinks into existing `movies/tv/...` |
| Arr stack? | **Yes for Sonarr/Radarr/Prowlarr, NO for Bazarr/cross-seed (yet)**. Mature rename engine beats bespoke; manual selection still works. |
| FS atomic-import        | XFS reflinks (`cp --reflink=always`): share data blocks like hardlinks but get their own inode, so path/perm can diverge freely. "Use Hardlinks" toggle works. |
| Catalog auto-update | sidecar Python service (`betaflix-catalog`) on Sonarr/Radarr webhooks → patches `MEDIA-LIST.md` → git commit+push to Forgejo. |
| GPU | untouched — qbt doesn't need it; Jellyfin keeps its existing passthrough (CPU-only post-driver-issue, separate concern). |
Override any of this if your gut disagrees — but record an ADR under
`docs/decisions/` first.
---
## Current State (verified live)
- `socks-pvpn` container: `serjs/go-socks5-proxy` on `socks-vpn` (172.31.0.0/24).
Already provides `socks5://socks-pvpn:1080` with `qbt` user. Egress via host's
`wg-pvpn-A` / `wg-pvpn-B` (policy-routed by fwmark `0x51820` / `0x51821`).
Proton WG is **host-side**, not in a container.
- `jellyfin-stock`: mounts `/home/user/media → /media` bind, in `proxy` network.
- xfs at `/dev/sda1 → /home/user/media`, 5.5T total / 3.9T free. Reflink-capable.
- No existing Sonarr / Radarr / Prowlarr / gluetun / qbt containers.
- Traefik on networks `proxy`, `socket-proxy-net`, `misskey-frontend`.
Two viable VPN strategies: keep using `socks-pvpn` SOCKS5 with qbt proxy, or
drop in a dedicated `gluetun` for the BT stack only. See § b.
---
## a) qBittorrent image
**Pick:** `qbittorrentofficial/qbittorrent-nox:5.0.5`.
- Official upstream build, signed, no LSIO PUID/PGID overhead.
- 5.0.x ships native WebUI v2 and modern logging.
- Two ports: 8080 (WebUI) plus the chosen BT listen port (e.g. 51820+random).
- Run as uid `1000:1000` (matches `user:user` on host) so anything qbt writes
to `/home/user/media/_downloads` already matches library ownership.
**Skip** `linuxserver/qbittorrent` — extra init scripts, slower updates, PUID
drift when paired with userns-remap.
**Skip** `qbittorrent-nox` bare on host — Docker buys VPN-namespace binding +
restart isolation. Cheap.
---
## b) VPN binding — pick `gluetun`
Three patterns considered:
| Pattern | Pro | Con |
|-------------------------------------------|---------------------------------------------|---------------------------------------------------------------------------------------------------------------------------|
| qbt SOCKS5 → `socks-pvpn` | Zero new infra | qbt SOCKS support has historical leaks (UDP, trackers, DHT). Not a kill-switch — if SOCKS dies, qbt uses default route → clear-net leak |
| WireGuard inside qbt container | Tight blast radius | Bake wg into image; restart re-attach pain; upgrades painful |
| **`gluetun` sidecar, qbt joins its netns**| Mature kill-switch (iptables-enforced), port-forward helper, qbt unchanged | Adds one container; eats a Proton WG slot |
**Decision: gluetun.** It's a kill-switch by design: if WG drops, gluetun's
firewall blackholes traffic. It is a common choice in /r/selfhosted torrent
setups for that reason. Add a third Proton WG endpoint specifically for it (so
it doesn't collide with the existing wg-pvpn-A/B host-level policy routes).
Keep `socks-pvpn` running for other clients.
### Kill-switch verification
```bash
docker exec qbittorrent curl -sf --max-time 5 https://api.ipify.org # should return Proton exit IP
docker stop gluetun && docker exec qbittorrent curl -sf --max-time 5 https://api.ipify.org # MUST hang/fail
docker start gluetun && docker restart qbittorrent # qbt must re-join gluetun's new network namespace
```
If the second command succeeds, you have a leak — **do not proceed**.
The wrapper script is at `scripts/killswitch-test.sh`.
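The pass/fail reasoning behind those three commands can be written down as a small pure function. This is only an illustrative sketch: `classify_killswitch` and its argument names are hypothetical and do not exist in the repo, and the IPs would come from the `curl` calls shown above.

```python
from typing import Optional


def classify_killswitch(host_wan_ip: Optional[str],
                        vpn_up_ip: Optional[str],
                        vpn_down_ip: Optional[str]) -> str:
    """Classify one kill-switch test run.

    host_wan_ip -- IP seen from the host itself (None if it could not be fetched)
    vpn_up_ip   -- IP seen from inside qbt while gluetun is running
    vpn_down_ip -- IP seen from inside qbt after gluetun was stopped
                   (None means the request failed, which is the desired outcome)
    """
    if not vpn_up_ip:
        return "no-vpn"          # qbt had no connectivity even with the VPN up
    if host_wan_ip and vpn_up_ip == host_wan_ip:
        return "not-tunnelled"   # traffic left via the host's own WAN address
    if vpn_down_ip:
        return "leak"            # any answer with gluetun stopped is a leak
    return "pass"
```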
---
## c) Folder layout (xfs, single device)
```
/home/user/media/
├── _downloads/ # NOT in any JF library → JF can't see it
│ ├── incomplete/ # qbt's "temp path" — half-written files
│ ├── complete/ # qbt's "save path" — completed, still seeding
│ └── watch/ # drop .torrent files here for auto-add
├── movies/ # canonical, JF scans (existing)
├── tv/ # canonical, JF scans (existing)
├── education/ # YouTube creator, JF scans (existing)
├── music/ # (existing)
└── podcasts/ # (existing)
```
JF library paths are configured in the dashboard and only point at the
canonical roots listed above. `_downloads/` is on the same xfs filesystem, so
hardlinks / reflinks are free (zero extra blocks consumed) when sonarr/radarr import.
Sonarr/Radarr setting: **Use Hardlinks instead of Copy = yes**.
Permissions: qbt runs as `1000:1000`, files land 644/dirs 755 (image sets
`umask 022`). If defaults drift, force with `UMASK=022` in container env
(qbt 5 honors it).
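To confirm that an import really produced a hardlink rather than a copy, comparing device and inode numbers is enough. The sketch below is illustrative only: it recreates a miniature version of the layout in a temporary directory, and `same_file` / `demo` are hypothetical helper names, not part of this repo.

```python
import os
import tempfile


def same_file(a: str, b: str) -> bool:
    """True when both paths point at the same inode on the same device."""
    sa, sb = os.stat(a), os.stat(b)
    return (sa.st_dev, sa.st_ino) == (sb.st_dev, sb.st_ino)


def demo() -> tuple:
    """Hardlink a fake download into a fake library and inspect the result."""
    with tempfile.TemporaryDirectory() as root:
        downloads = os.path.join(root, "_downloads", "complete")
        library = os.path.join(root, "movies")
        os.makedirs(downloads)
        os.makedirs(library)
        src = os.path.join(downloads, "film.mkv")
        dst = os.path.join(library, "Film (2020).mkv")
        with open(src, "wb") as fh:
            fh.write(b"x" * 1024)
        os.link(src, dst)  # fails with OSError if src and dst are on different filesystems
        return same_file(src, dst), os.stat(src).st_nlink


if __name__ == "__main__":
    print(demo())
```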
---
## d, f) Arr stack vs custom watcher — pick Arr
For 20-40 items in pipeline the bespoke watcher is *tempting*. Pick
Sonarr/Radarr anyway:
**For Sonarr/Radarr (mature rename + import):**
- Their rename engine handles 100+ edge cases your bash will eventually trip:
multi-episode files, anime absolute numbering, special seasons,
daily-broadcast dates, year-disambiguated titles. You will hit these.
- "Interactive Search" gives manual selection — not forced into RSS auto-grab.
- Hardlink-on-import is a checkbox, not a function to debug.
- Webhook on import → ready-made trigger for catalog-update.
- Library "Scan after import" is built-in. Skip the cargo-cult JF scan task ID
dance (keep as manual escape hatch).
**For Prowlarr:**
- One-place indexer config. Even if you only use 3 trackers, having them
managed in Prowlarr and pushed to Sonarr+Radarr is less duplication.
- Categories + capabilities matter when manual-search returns results — you
want season-pack vs single-episode discrimination on the search UI.
**Against (kept honest):**
- Five extra containers (gluetun, qbt, sonarr, radarr, prowlarr). ~600 MB RAM
combined idle. nullstone has 31 G; rounding error.
- Sonarr database in SQLite — back up in `./backup.sh`.
- More UI surface. Two evenings.
**Hard NO for now:**
- **Bazarr** — subtitle pipeline is the WhisperX v4 build, not OpenSubtitles.
- **cross-seed** — only useful when seriously seeding to ratio. Defer.
- **Lidarr / Readarr** — out of scope (music + books not in this pipeline).
If after 2 weeks Sonarr's metadata picker is fighting you, **then** swap to
bespoke — files on disk are the same shape either way.
---
## e) Catalog-update service (mandatory regardless)
Even with Sonarr/Radarr, neither tool knows about
`/home/admin/projects/beta-flix/playbooks/import-media/MEDIA-LIST.md`. So:
`betaflix-catalog` (Python 3.12, Flask, ~340 LoC, in `catalog/`). Listens for
Sonarr/Radarr **"On Import"** webhooks. For each event:
1. Pull metadata from webhook payload (`series.title`, `series.year`,
`episodeFile.path`, or `movie.title` + `movie.year` + `movieFile.path`).
2. `git -C /repo pull --rebase origin main`.
3. Edit `playbooks/import-media/MEDIA-LIST.md`:
- Movies: insert into Movies table, alphabetic on title.
- TV: if series row exists, merge seasons into the `Seasons` column;
else insert new row.
- "Source / Version" column = parsed from filename release-group tokens
**before** Sonarr stripped them. The webhook gives `sourceTitle`
(original release name) — log it raw, you can edit later.
- "Why on arrflix" column stays blank — that's human-authored.
4. Write run log to `playbooks/import-media/runs/<slug>.md` using a Jinja
template (date, source path, target path, item count, ffprobe summary
from a `docker exec jellyfin-stock ffprobe` call — optional, deferred).
5. `git commit -m "catalog: add <title> (<year>)" --author "obsidian-ai <obsidian-ai@s8n.ru>"`.
6. `git push origin main`. Forgejo deploy key in `compose/catalog/ssh/`
(gitignored — placed by operator at deploy time).
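The alphabetic insert in step 3 can be sketched as follows. This is a minimal illustration, not the code in `catalog/catalog.py`: the function name `insert_row_sorted` is hypothetical, and the two-column table in the usage example is an assumption, since the real `MEDIA-LIST.md` columns are not reproduced here.

```python
def insert_row_sorted(table_lines, new_row):
    """Insert new_row into a Markdown table, ordered by first column.

    table_lines: header line, separator line, then body rows.
    Comparison is case-insensitive; a row whose title already exists is skipped.
    """
    def key(row):
        # First cell of a "| a | b |" row, normalised for comparison.
        return row.strip().strip("|").split("|")[0].strip().casefold()

    header, body = list(table_lines[:2]), list(table_lines[2:])
    if any(key(row) == key(new_row) for row in body):
        return header + body  # title already present: nothing to insert
    idx = 0
    while idx < len(body) and key(body[idx]) < key(new_row):
        idx += 1
    body.insert(idx, new_row)
    return header + body
```

For example, inserting `| Brazil | 1985 |` into a table whose body rows are `Alien` and `Heat` places the new row between them, and running the same insert a second time leaves the table unchanged.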
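Webhook config in Sonarr: Settings → Connect → Webhook → POST to
`http://host.docker.internal:5055/sonarr` on the `OnImport` event only. Radarr
is configured the same way against `/radarr`. Sonarr/Radarr share gluetun's
netns, so `host.docker.internal` must resolve there (e.g. via `extra_hosts` on
the gluetun service) and port 5055 must be reachable on the host-gateway
address, not only on `127.0.0.1`.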
Idempotency: hash the payload (`{series_id}:{season}:{episode}`); skip if
seen in the last hour (Sonarr retries on transient failure). Cache lives at
`/tmp/seen-imports.json` (ephemeral; that's fine — duplicate commits are
benign-but-noisy, not destructive).
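A minimal sketch of that idempotency check, assuming a JSON file that maps each key to the time it was last seen. The names `import_key`, `seen_recently` and `TTL_SECONDS` are hypothetical and the real service may structure its cache differently.

```python
import hashlib
import json
import time
from pathlib import Path
from typing import Optional

TTL_SECONDS = 3600  # "seen in the last hour"


def import_key(series_id, season, episode) -> str:
    """Hash the {series_id}:{season}:{episode} triple into a stable cache key."""
    raw = f"{series_id}:{season}:{episode}"
    return hashlib.sha256(raw.encode()).hexdigest()


def seen_recently(cache_path: Path, key: str, now: Optional[float] = None) -> bool:
    """Return True if key was recorded within the TTL; otherwise record it and return False."""
    now = time.time() if now is None else now
    try:
        cache = json.loads(cache_path.read_text())
    except (FileNotFoundError, json.JSONDecodeError):
        cache = {}  # missing or corrupt cache: start over, worst case is a duplicate commit
    # Drop entries older than the TTL so the file does not grow without bound.
    cache = {k: t for k, t in cache.items() if now - t < TTL_SECONDS}
    hit = key in cache
    if not hit:
        cache[key] = now
    cache_path.write_text(json.dumps(cache))
    return hit
```

A webhook handler would call `seen_recently(Path("/tmp/seen-imports.json"), import_key(...))` and return early on `True`.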
Skeleton lives at `catalog/catalog.py` in this repo. ~30 minutes to draft,
~2 hours to harden. The piece that bridges "files on nullstone" to "facts
in Forgejo".
---
## g) Migration from onyx-qbt → nullstone-qbt
State: 60+ active torrents on onyx, with download dirs on onyx local disk.
**Goal:** keep seeding (don't burn ratios) while shifting future downloads to
nullstone. Three phases, no big-bang.
Full runbook: `docs/migration.md` (and the script `scripts/migrate-onyx.sh`).
---
## What this doesn't solve (be aware)
- **Tracker IP allowlists.** Some private trackers pin sessions to a single
IP. Switching from onyx public IP → Proton exit IP will trip them. Check
each tracker's rules before migrating — you may need an IP-update request
per private tracker. See `docs/trackers.md`.
- **Port forwarding via Proton.** gluetun's `VPN_PORT_FORWARDING=on` handles
this for Proton, but the forwarded port rotates. gluetun exposes the
current port through its control server and also writes it to
`/tmp/gluetun/forwarded_port`; qbt reads neither on its own, so a wrapper
script or helper container has to push the port into qbt's listen-port
setting whenever it changes. Known helper image:
`caillef/qbittorrent-port-sync`, which would run as one more container if
seeding ratio matters. Deferred until tracker ratio becomes a real concern.
- **Backup.** Add `/opt/docker/media-acquisition/compose/{sonarr,radarr,prowlarr,qbittorrent/config}`
to nullstone's `/opt/docker/backup.sh`. The arr configs hold SQLite DBs, so
stop the containers briefly or use `sqlite3 .backup` semantics.
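For the port-forwarding item, the core of a port-sync helper is small. The sketch below is an assumption-laden illustration, not a tested integration: it presumes the helper can read gluetun's `forwarded_port` file (for example through a shared mount), and it only builds the form body for qBittorrent's WebUI API call `POST /api/v2/app/setPreferences`; authentication and the HTTP request itself are left out. The function names are hypothetical.

```python
import json
from pathlib import Path
from urllib.parse import urlencode


def read_forwarded_port(path: str = "/tmp/gluetun/forwarded_port") -> int:
    """Read the forwarded port gluetun wrote to disk and validate its range."""
    port = int(Path(path).read_text().strip())
    if not 1 <= port <= 65535:
        raise ValueError(f"forwarded port out of range: {port}")
    return port


def build_set_preferences_body(port: int) -> bytes:
    """Form-encoded body that sets qBittorrent's listen_port preference."""
    return urlencode({"json": json.dumps({"listen_port": port})}).encode()
```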
---
## Open decisions to confirm before implementing
1. Proton plan slot count — gluetun needs its own WG key. Free slot?
2. Which private trackers do you actually use? IP-pinning check.
3. Public hostnames for the arr-stack: confirm `qbt/sonarr/radarr/prowlarr.s8n.ru`
or pick a sub-zone (`arr.s8n.ru/qbt/`).
4. Authentik group for arr-stack access (LAN-only? or also from gravel via
Tailscale?).
5. Forgejo deploy key — generate now or reuse `obsidian-ai`'s existing key?
Answer those five and the implementation is ~1 evening of compose + ~2 hours
on the catalog service. Migration is a separate weekend.

136
docs/migration.md Normal file

@ -0,0 +1,136 @@
# Migration — onyx-qbt → nullstone-qbt
State at time of writing: 60+ active torrents on onyx with download dirs on
onyx local disk. **Goal:** keep seeding (don't burn ratios) while shifting
future downloads to nullstone. Three phases, no big-bang.
---
## Phase 1 — Stand up nullstone stack (no migration yet)
1. **Prep directory tree** on nullstone:
```bash
ssh user@nullstone
sudo mkdir -p /home/user/media/_downloads/{incomplete,complete,watch}
sudo chown -R user:user /home/user/media/_downloads
```
2. **Generate new Proton WG key + provisioning for gluetun.** Don't reuse
`wg-pvpn-A` keys (they're host-routed; conflict risk). Log into Proton
account → WireGuard → new key → name it `nullstone-gluetun-arr` → save
the privkey + assigned address (e.g. `10.2.0.3/32`).
3. **Drop the privkey + address into `compose/.env`:**
```bash
cd /opt/docker/media-acquisition/compose
cp .env.example .env
${EDITOR:-vi} .env
# Set:
# PVPN_WG_PRIVKEY=<the new privkey>
# PVPN_WG_ADDRESSES=10.2.0.3/32
# PVPN_SERVER_COUNTRIES=Netherlands
```
4. **Bring up the stack.** Start gluetun + qbt only first:
```bash
docker compose up -d gluetun qbittorrent
```
5. **Kill-switch test (NON-NEGOTIABLE):**
```bash
bash ../scripts/killswitch-test.sh
```
If second curl succeeds → leak. Tear down and debug. Do not proceed.
6. **Sacrificial torrent.** Pick something legal + big you don't care about
(e.g. a Linux distro ISO). Add it via qbt webui, watch it land in
`/home/user/media/_downloads/complete/`. Confirm it **does not** appear in JF.
7. **Bring up the rest of the stack.**
```bash
docker compose up -d
```
Configure Prowlarr → Sonarr → Radarr (in that order — Prowlarr pushes
indexers downstream). Set "Use Hardlinks instead of Copy = yes" in
Sonarr/Radarr Media Management.
8. **Test arr → import path.** Sonarr Interactive Search → manual grab → import
into `/media/tv/...`. Verify catalog service commits to Forgejo.
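Before step 4 it can help to check that `.env` actually carries the two values gluetun cannot start without. The following is a hedged sketch: `parse_env` and `missing_keys` are hypothetical helpers, and the parser only handles plain `KEY=value` lines, not quoting or variable interpolation.

```python
REQUIRED = ("PVPN_WG_PRIVKEY", "PVPN_WG_ADDRESSES")


def parse_env(text: str) -> dict:
    """Parse simple KEY=value lines, ignoring blanks and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        env[key.strip()] = value.strip()
    return env


def missing_keys(text: str, required=REQUIRED) -> list:
    """Return the required keys that are absent or empty in the .env text."""
    env = parse_env(text)
    return [key for key in required if not env.get(key)]
```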
---
## Phase 2 — Migrate onyx torrents
For each active torrent on onyx that you want to keep seeding:
```bash
# On onyx — export .torrent files + qbt's fastresume state
mkdir -p /tmp/qbt-migrate
cp ~/.local/share/qBittorrent/BT_backup/*.torrent /tmp/qbt-migrate/
cp ~/.local/share/qBittorrent/BT_backup/*.fastresume /tmp/qbt-migrate/
# rsync the actual data files to nullstone first (LAN gigabit)
rsync -av --info=progress2 ~/Downloads/qbt/ \
user@192.168.0.100:/home/user/media/_downloads/complete/
```
Then on nullstone qbt webui:
1. Add `.torrent` files in bulk via webui ("Add torrent files…"), save path =
`/downloads/complete/`, **uncheck "Start torrent"**.
2. Force-recheck each added torrent. qbt matches local files → `100%` → seeding.
3. Verify trackers respond. Private trackers may need source-IP rotation —
gluetun exit IP differs from onyx public IP. See `docs/trackers.md`.
4. On onyx: pause torrents one-by-one as nullstone takes over. Don't stop
onyx-qbt entirely until every torrent shows seeding on nullstone for 24h
with no tracker errors.
**Catalog backfill:** for torrents that correspond to already-imported
library items, **don't** trigger arr-import — they're already in canonical
locations. Just seed from `_downloads/complete/`. Catalog stays accurate.
For torrents that were mid-download on onyx but never made it into the
library: re-add on nullstone, let them complete via VPN, then sonarr/radarr
picks them up via the normal path.
**Estimated migration window:** 1 weekend. ~250 GB rsync over LAN gigabit is
roughly 35-40 min wall clock for the data move (about 33 min at full line
rate), then a manual-but-tedious add-and-recheck loop in qbt.
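The transfer-time estimate is simple arithmetic. In the sketch below the `efficiency` factor is an assumption standing in for protocol and disk overhead, and `transfer_minutes` is a hypothetical helper name.

```python
def transfer_minutes(size_gb: float, link_mbit: float, efficiency: float = 1.0) -> float:
    """Minutes needed to move size_gb (decimal GB) over a link of link_mbit Mbit/s."""
    bits = size_gb * 1e9 * 8
    seconds = bits / (link_mbit * 1e6 * efficiency)
    return seconds / 60
```

At full gigabit line rate 250 GB takes a little over 33 minutes; at an assumed 90% efficiency it is about 37 minutes.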
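The wrapper script for the export + rsync block above is at
`scripts/migrate-onyx.sh`. It runs the rsync and ships the `.torrent` +
`.fastresume` files to `/tmp/qbt-migrate/` on nullstone for a follow-up
bulk-add (dry-run by default; set `DRY_RUN=0` to copy). The
fastresume-rewrite step is documented inline in the script.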
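One caveat on rewriting fastresume paths: `.fastresume` files are bencoded, so every string carries a length prefix, and swapping a path prefix for one of a different length needs the prefix recomputed. The sketch below shows one way to do that with a tiny stdlib-only bencode round-trip. It is an illustration under assumptions, not the script's method: the function names are hypothetical, and it rewrites any byte-string value that starts with the old prefix rather than targeting specific keys.

```python
def bdecode(data: bytes, i: int = 0):
    """Decode one bencoded value at offset i; return (value, next_offset)."""
    c = data[i:i + 1]
    if c == b"i":                                  # integer: i<digits>e
        end = data.index(b"e", i)
        return int(data[i + 1:end]), end + 1
    if c == b"l":                                  # list: l<items>e
        i, items = i + 1, []
        while data[i:i + 1] != b"e":
            item, i = bdecode(data, i)
            items.append(item)
        return items, i + 1
    if c == b"d":                                  # dict: d<key><value>...e
        i, out = i + 1, {}
        while data[i:i + 1] != b"e":
            key, i = bdecode(data, i)
            out[key], i = bdecode(data, i)
        return out, i + 1
    colon = data.index(b":", i)                    # string: <len>:<bytes>
    start = colon + 1
    length = int(data[i:colon])
    return data[start:start + length], start + length


def bencode(value) -> bytes:
    """Encode ints, bytes, lists and dicts back into bencoding."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are written in sorted byte order.
        return b"d" + b"".join(bencode(k) + bencode(value[k]) for k in sorted(value)) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def rewrite_prefix(value, src: bytes, dst: bytes):
    """Copy value, replacing src with dst at the start of every byte-string value."""
    if isinstance(value, bytes):
        return dst + value[len(src):] if value.startswith(src) else value
    if isinstance(value, list):
        return [rewrite_prefix(v, src, dst) for v in value]
    if isinstance(value, dict):
        return {k: rewrite_prefix(v, src, dst) for k, v in value.items()}
    return value


def rewrite_fastresume(raw: bytes, src: bytes, dst: bytes) -> bytes:
    """Decode, rewrite path prefixes, and re-encode with correct length prefixes."""
    decoded, _ = bdecode(raw)
    return bencode(rewrite_prefix(decoded, src, dst))
```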
---
## Phase 3 — Decommission onyx-qbt
After 7 days clean on nullstone:
1. Stop qbt service on onyx (`systemctl --user stop qbittorrent-nox` or kill
the GUI; depends on how it was launched).
2. Delete `~/Downloads/qbt/` on onyx (only after confirming no in-flight
torrents reference it).
3. Update `ai-lab/CLAUDE.md` device registry note if onyx had a
"downloads role" annotation. (As of 2026-05-20 it does not — onyx has been
the staging host but is not formally documented as such.)
4. Optional: keep the `.torrent` files archive on onyx for 30 days as a
safety net.
---
## Rollback
If nullstone stack starts failing during phase 2:
- `docker compose down` on nullstone.
- Re-enable onyx qbt (Phase 1's stack is non-destructive — onyx torrents still
have their data + fastresume).
- File an issue + revisit phase 1 step 5 (kill-switch test).

59
docs/trackers.md Normal file

@ -0,0 +1,59 @@
# Trackers — schema, IP-pinning, ratio notes
Single source of truth for what trackers feed this pipeline, and what their
quirks are. Per-tracker entries get added by the operator; the schema is
below.
## IP-pinning risk
Many private trackers **pin sessions to a single source IP**. Switching
from onyx public IP → Proton exit IP (via gluetun) will trip them: tracker
returns `unauthorized: source IP mismatch` on announce, the torrent stops
announcing → seeding stats halt → ratio decays.
Mitigations, ordered cheapest → most invasive:
1. **Read the tracker's FAQ first.** Most private trackers have a documented
policy: "1 IP, change requires staff" / "rolling IP allowed, contact us
after change" / "IP locked to account, no exceptions".
2. **Request an IP update** from staff before migrating that torrent.
Provide the new Proton exit IP (gluetun reports current exit via
`docker exec gluetun cat /tmp/gluetun/ip`).
3. **Hot-swap manually:** announce on onyx, immediately re-add on nullstone,
force-announce. Some trackers' anti-abuse is rate-limited and won't catch
the swap.
4. **Multiple exit profiles.** Run two gluetun containers with different
Proton server selections (one for tracker A, one for tracker B). Heavy.
If a tracker rejects all of the above, **leave that torrent on onyx**. The
migration is not all-or-nothing; some torrents will live forever on the
old host. Document the exception in the table below.
## Per-tracker schema
Use this table format in this file. **Sort alphabetically by tracker name.**
| Tracker | URL | Type | IP-Pinning | Ratio Required | Notes |
|--------------------|------------------------------|---------|-----------------------|----------------|--------------------------------|
| _example.tracker_ | https://_example.tracker_/ | private | locked, request swap | 1.0 over 30d | Staff respond on IRC in < 24h. |
| _public.example_ | http://_public.example_/ | public | n/a | n/a | No account, no ratio. |
(Replace the example rows with real trackers as they are onboarded.)
## Onboarding a new tracker
When adding a new private tracker:
1. Read the tracker's FAQ / rules. Record IP-pinning + ratio policy in the
table above.
2. Run `PROWLARR_API_KEY=xxx scripts/add-tracker.sh <indexer-name> <indexer-id>`
to push it into Prowlarr. The script does not prompt for credentials; fill in
cookies / passkey / API key in the Prowlarr webui afterwards.
3. Add a row to the per-tracker table above. Commit.
4. Monitor first 24h: check Prowlarr → Indexer → Stats for failed-query rate.
> 10% failures → recheck the IP-pinning column.
## Public trackers
Public trackers (e.g. open BitTorrent indexers) have no IP-pinning concerns
but generally offer worse quality and slower speeds. List them sparingly;
prefer private trackers for the long tail of niche media.

74
scripts/add-tracker.sh Executable file

@ -0,0 +1,74 @@
#!/usr/bin/env bash
# scripts/add-tracker.sh — register a tracker with Prowlarr via API.
#
# Usage:
# PROWLARR_API_KEY=xxx ./add-tracker.sh <indexer-name> <indexer-id>
#
# Where <indexer-id> is Prowlarr's internal ID for the indexer type (look it
# up via `curl /api/v1/indexer/schema` — see "Discovering indexer IDs" below).
#
# This script POSTs a minimal indexer config to Prowlarr. For trackers that
# need cookies / passkeys / 2FA, finish the setup in the Prowlarr webui
# afterwards.
#
# Pre-reqs:
# - Prowlarr container up.
# - PROWLARR_API_KEY exported (Prowlarr → Settings → General → Security).
# - PROWLARR_URL defaults to http://127.0.0.1:9696.
set -euo pipefail
NAME="${1:-}"
INDEXER_ID="${2:-}"
PROWLARR_URL="${PROWLARR_URL:-http://127.0.0.1:9696}"
PROWLARR_API_KEY="${PROWLARR_API_KEY:-}"
if [ -z "$NAME" ] || [ -z "$INDEXER_ID" ] || [ -z "$PROWLARR_API_KEY" ]; then
cat <<EOF >&2
Usage: PROWLARR_API_KEY=xxx $0 <indexer-name> <indexer-id>
Env:
PROWLARR_URL default: http://127.0.0.1:9696
PROWLARR_API_KEY required — Prowlarr → Settings → General → Security.
Discovering indexer IDs:
curl -s "\$PROWLARR_URL/api/v1/indexer/schema" \\
-H "X-Api-Key: \$PROWLARR_API_KEY" | \\
jq -r '.[] | "\\(.implementation)\\t\\(.implementationName)"' | sort -u
Find the row matching your tracker, then look up its integer id via the
full schema entry. Many private trackers use the "Cardigann" implementation
with a YAML config — see Prowlarr docs for the full attribute list.
EOF
exit 2
fi
echo "Querying schema for '$INDEXER_ID'..."
SCHEMA_JSON="$(curl -fsS \
-H "X-Api-Key: $PROWLARR_API_KEY" \
"$PROWLARR_URL/api/v1/indexer/schema" \
| jq --arg id "$INDEXER_ID" '.[] | select(.id == ($id | tonumber))')"
if [ -z "$SCHEMA_JSON" ]; then
echo "No schema entry for indexer id=$INDEXER_ID" >&2
exit 1
fi
# Override name with the user-provided one and keep all other fields as-is.
PAYLOAD="$(jq --arg name "$NAME" '. + {name: $name, enable: true}' <<<"$SCHEMA_JSON")"
echo "POSTing indexer config..."
RESPONSE="$(curl -fsS -X POST \
-H "X-Api-Key: $PROWLARR_API_KEY" \
-H "Content-Type: application/json" \
-d "$PAYLOAD" \
"$PROWLARR_URL/api/v1/indexer")"
NEW_ID="$(jq -r '.id' <<<"$RESPONSE")"
echo "OK — indexer '$NAME' added with id=$NEW_ID"
echo
echo "Next steps:"
echo " 1. Open Prowlarr → Indexers → $NAME → fill in cookies/passkey/API key."
echo " 2. Test indexer (Settings → Indexers → Test)."
echo " 3. Add a row to docs/trackers.md with IP-pinning + ratio notes."
echo " 4. Push to Sonarr/Radarr via Prowlarr's Apps → Sync."

96
scripts/killswitch-test.sh Executable file

@ -0,0 +1,96 @@
#!/usr/bin/env bash
# scripts/killswitch-test.sh — verify gluetun blocks traffic when VPN drops.
#
# Test plan (per docs/architecture.md §b):
#
# 1. With gluetun UP, qbt MUST resolve api.ipify.org and return an IP
# that is NOT nullstone's WAN IP (i.e. Proton exit).
# 2. Stop gluetun. qbt's container MUST NOT be able to reach the internet
# (curl hangs / fails fast). If it succeeds → kill-switch leak → ABORT.
# 3. Restart gluetun, re-verify step 1.
#
# Run from anywhere with `docker` access on the host. Idempotent.
set -euo pipefail
GLUETUN="${GLUETUN_CONTAINER:-gluetun}"
QBT="${QBT_CONTAINER:-qbittorrent}"
TIMEOUT="${TIMEOUT:-8}"
red() { printf '\033[31m%s\033[0m\n' "$*"; }
green() { printf '\033[32m%s\033[0m\n' "$*"; }
yellow(){ printf '\033[33m%s\033[0m\n' "$*"; }
require_container() {
if ! docker inspect "$1" >/dev/null 2>&1; then
red "FAIL: container '$1' not found"; exit 1
fi
}
require_container "$GLUETUN"
require_container "$QBT"
# --- Step 0: discover nullstone's WAN IP (so we can detect leaks).
WAN_IP="$(curl -sf --max-time "$TIMEOUT" https://api.ipify.org || true)"
if [ -z "$WAN_IP" ]; then
yellow "WARN: could not fetch host WAN IP — leak detection will be best-effort"
else
echo "Host WAN IP: $WAN_IP"
fi
# --- Step 1: VPN-up check
echo
echo "Step 1: VPN-up — qbt should exit via Proton."
if ! docker inspect -f '{{.State.Running}}' "$GLUETUN" | grep -q true; then
yellow "gluetun not running — starting…"
docker start "$GLUETUN" >/dev/null
sleep 5
fi
VPN_IP="$(docker exec "$QBT" curl -sf --max-time "$TIMEOUT" https://api.ipify.org || true)"
if [ -z "$VPN_IP" ]; then
red "FAIL: qbt could not reach api.ipify.org even with gluetun up. Check VPN config."
exit 1
fi
echo "qbt sees IP: $VPN_IP"
if [ -n "$WAN_IP" ] && [ "$VPN_IP" = "$WAN_IP" ]; then
red "FAIL: qbt's IP == host WAN IP. Traffic is NOT going through VPN."
exit 1
fi
green "OK: qbt egressing via VPN."
# --- Step 2: kill-switch check
echo
echo "Step 2: kill-switch — stop gluetun, qbt MUST fail to reach internet."
docker stop "$GLUETUN" >/dev/null
sleep 2
set +e
LEAK_IP="$(docker exec "$QBT" curl -sf --max-time "$TIMEOUT" https://api.ipify.org 2>/dev/null)"
RC=$?
set -e
docker start "$GLUETUN" >/dev/null
if [ $RC -eq 0 ] && [ -n "$LEAK_IP" ]; then
red "FAIL: kill-switch broken — qbt reached the internet with VPN down."
red " Leaked IP: $LEAK_IP"
red " Tear down the stack and investigate before adding any torrents."
exit 1
fi
green "OK: qbt could not reach internet with gluetun stopped."
# --- Step 3: re-verify VPN comes back
echo
echo "Step 3: VPN restart, wait for gluetun to be healthy again."
for _ in $(seq 1 30); do
    if [ "$(docker inspect -f '{{.State.Health.Status}}' "$GLUETUN" 2>/dev/null || echo none)" = "healthy" ]; then
        break
    fi
    sleep 2
done
# qbt joined gluetun's old network namespace, which went away when gluetun
# stopped; restart qbt so it re-attaches to the new one before re-testing.
docker restart "$QBT" >/dev/null
sleep 5
RECOVERY_IP="$(docker exec "$QBT" curl -sf --max-time "$TIMEOUT" https://api.ipify.org || true)"
if [ -z "$RECOVERY_IP" ]; then
    red "FAIL: qbt has no connectivity after the gluetun restart. Check 'docker logs $GLUETUN'."
    exit 1
fi
green "OK: VPN recovered. qbt sees IP $RECOVERY_IP."
echo
green "Kill-switch test PASSED. Safe to seed."

109
scripts/migrate-onyx.sh Executable file

@ -0,0 +1,109 @@
#!/usr/bin/env bash
# scripts/migrate-onyx.sh — Phase 2 migration helper.
#
# Usage:
# ./migrate-onyx.sh <source-dir-on-onyx> <target-dir-on-nullstone>
#
# Example:
# ./migrate-onyx.sh "$HOME/Downloads/qbt/" \
# /home/user/media/_downloads/complete/
#
# What it does:
# 1. rsync <source-dir> → user@nullstone:<target-dir> over LAN.
# 2. Copies onyx's .torrent + .fastresume files to /tmp/qbt-migrate/
# on nullstone (you mass-add them via qbt webui afterwards).
# 3. Prints a checklist of remaining manual steps.
#
# Pre-reqs:
# - run from onyx (the SOURCE machine).
# - ssh user@nullstone reachable on LAN.
# - nullstone qbt stack already up (Phase 1 complete); run
# `scripts/killswitch-test.sh` on nullstone first.
set -euo pipefail
SRC="${1:-}"
DST="${2:-}"
NULLSTONE="${NULLSTONE_SSH:-user@192.168.0.100}"
DRY_RUN="${DRY_RUN:-1}"
usage() {
cat <<EOF
Usage: $0 <source-dir> <target-dir-on-nullstone>
Env:
NULLSTONE_SSH default: user@192.168.0.100
DRY_RUN default: 1 (rsync --dry-run). Set DRY_RUN=0 to actually copy.
Example (dry-run):
$0 "\$HOME/Downloads/qbt/" /home/user/media/_downloads/complete/
Example (real):
DRY_RUN=0 $0 "\$HOME/Downloads/qbt/" /home/user/media/_downloads/complete/
EOF
exit 2
}
# Explicit if-block: avoids relying on || / && precedence under `set -e`.
if [ -z "$SRC" ] || [ -z "$DST" ]; then
    usage
fi
[ -d "$SRC" ] || { echo "Source dir not found: $SRC" >&2; exit 1; }
QBT_BACKUP="$HOME/.local/share/qBittorrent/BT_backup"
[ -d "$QBT_BACKUP" ] || { echo "qBittorrent BT_backup dir missing: $QBT_BACKUP" >&2; exit 1; }
RSYNC_FLAGS=(-av --info=progress2 --partial --human-readable)
if [ "$DRY_RUN" = "1" ]; then
RSYNC_FLAGS+=(--dry-run)
echo "=== DRY-RUN — no data will be copied. Set DRY_RUN=0 to run for real. ==="
fi
echo "=== Step 1/3: rsync data files to nullstone ==="
rsync "${RSYNC_FLAGS[@]}" "$SRC" "$NULLSTONE:$DST"
echo
echo "=== Step 2/3: ship .torrent + .fastresume to nullstone /tmp/qbt-migrate/ ==="
if [ "$DRY_RUN" = "1" ]; then
    echo "(dry-run) would scp $QBT_BACKUP/*.torrent $QBT_BACKUP/*.fastresume → $NULLSTONE:/tmp/qbt-migrate/"
else
    # Only touch nullstone when this is a real run.
    ssh "$NULLSTONE" "mkdir -p /tmp/qbt-migrate"
    scp -q "$QBT_BACKUP"/*.torrent "$QBT_BACKUP"/*.fastresume "$NULLSTONE:/tmp/qbt-migrate/"
fi
echo
echo "=== Step 3/3: remaining manual steps ==="
cat <<EOF
Manual steps on the nullstone qbt webui (https://qbt.s8n.ru):
1. "Add Torrent" → multi-select all files in /tmp/qbt-migrate/*.torrent
Save path: $DST
[ ] UNCHECK "Start torrent" — we want them queued, not auto-resumed.
2. Select all newly-added torrents → right-click → "Force recheck"
qbt will hash-match files in $DST → mark each at 100% → start seeding.
3. Check tracker status per torrent. Private trackers may reject the new
source IP (Proton exit). See docs/trackers.md for the per-tracker
mitigation playbook.
4. On onyx (THIS machine): pause torrents one-by-one as nullstone takes
over each. Do NOT stop onyx-qbt entirely until every torrent shows
seeding on nullstone for 24h with zero tracker errors.
Fastresume path-rewrite (optional, only if save_path drift breaks recheck):
ssh $NULLSTONE 'python3 - <<PYEOF
from pathlib import Path
src_prefix = b"/home/admin/Downloads/qbt"  # onyx path
dst_prefix = b"/downloads/complete"        # nullstone path inside qbt container
for fr in Path("/tmp/qbt-migrate").glob("*.fastresume"):
    data = fr.read_bytes()
    if src_prefix in data:
        fr.write_bytes(data.replace(src_prefix, dst_prefix))
        print("rewrote:", fr.name)
PYEOF'
(Only run if force-recheck fails; qbt 5 usually handles re-pathing via
the UI's "Save path" field on add. Caution: .fastresume files are bencoded,
so strings carry a length prefix. This byte-level replace is only safe when
both prefixes have the same length; otherwise use a bencode-aware rewrite.)
EOF