From 4301877f33761749d82f2761c33ac7fe7ec7d574 Mon Sep 17 00:00:00 2001
From: Alvin Wang
Date: Wed, 22 Apr 2026 14:59:34 -0700
Subject: [PATCH] Update docs to match live cluster state

Audited the running cluster and fixed all .md files:
- Node info: Fedora 43, Lima (not OrbStack), worker IP 10.0.1.58
- Networking: fixed public/internal hostname tables, all *.dog internals
- Storage: removed Longhorn refs (not deployed), documented hostPath/local-path
- Services: moved Seerr to media chart, utils is Zerobyte only
- Bootstrap: reordered steps, MetalLB/traefik-internal as manual pre-deploy
- Headlamp.md/MetalLB.md: added context and explanations

Made-with: Cursor
---
 AGENTS.md   |  17 +++---
 Headlamp.md |  20 +++++++-
 MetalLB.md  |  16 ++++--
 README.md   | 141 +++++++++++++++++++++++++++++-----------------------
 4 files changed, 117 insertions(+), 77 deletions(-)

diff --git a/AGENTS.md b/AGENTS.md
index d8dcff3..66c6800 100755
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -1,10 +1,9 @@
-NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
-dogbox Ready control-plane 3h31m v1.34.6+k3s1 10.0.1.2 Fedora Linux 40 (Server Edition) 6.9.6-200.fc40.x86_64 containerd://2.2.2-bd1.34
-mac-worker Ready 3h13m v1.34.6+k3s1 192.168.139.12 Ubuntu 25.10 6.17.8-orbstack-00308-g8f9c941121b1 containerd://2.2.2-bd1.34
+NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
+localhost.localdomain Ready control-plane v1.34.6+k3s1 10.0.1.2 Fedora Linux 43 (Server Edition) 6.17.1-300.fc43.x86_64 containerd://2.2.2-bd1.34
+lima-mac-worker Ready v1.34.6+k3s1 10.0.1.58 Ubuntu 25.10 6.17.0-22-generic containerd://2.2.2-bd1.34

-The mac-worker is running inside orbstack linux VM if that matters.
-
+The mac-worker is running inside a Lima VM on macOS.

 I have a DNS rewrite pointing *.internal to 10.0.1.250 which is traefik-internal.

@@ -23,10 +22,10 @@ separated by `loadBalancerClass` so they don't conflict.
 `autoAssign: false`, so it only assigns IPs to services that explicitly request a pool via the `metallb.io/address-pool` annotation.

-| Service          | loadBalancerClass | LB       | External IPs            |
-|------------------|-------------------|----------|-------------------------|
-| traefik          | (none)            | klipper  | node IPs (10.0.1.2 etc) |
-| traefik-internal | metallb           | MetalLB  | 10.0.1.250              |
+| Service          | loadBalancerClass | LB       | External IPs              |
+|------------------|-------------------|----------|---------------------------|
+| traefik          | (none)            | klipper  | 10.0.1.2, 10.0.1.58       |
+| traefik-internal | metallb           | MetalLB  | 10.0.1.250                |

 `loadBalancerClass` is immutable on k8s Services. Changing it requires deleting the Service first, then redeploying (`kubectl delete svc … && helm upgrade`).
diff --git a/Headlamp.md b/Headlamp.md
index 272d9b1..f940536 100644
--- a/Headlamp.md
+++ b/Headlamp.md
@@ -1,6 +1,22 @@
+# Headlamp — Manual Token Access
+
+The `charts/headlamp` Helm chart deploys Headlamp with its own in-cluster
+ServiceAccount (`headlamp`) and a `cluster-admin` ClusterRoleBinding. That
+SA is used by the running pod and does not require manual setup.
+
+To generate a **bearer token** for logging in to the Headlamp UI (e.g. from
+a browser), create a separate short-lived token:
+
+```bash
+kubectl -n apps create token headlamp --duration=48h
+```
+
+If you need a dedicated SA for external/long-lived access instead:
+
+```bash
 kubectl -n apps create serviceaccount headlamp-admin
 kubectl create clusterrolebinding headlamp-admin \
   --serviceaccount=apps:headlamp-admin \
   --clusterrole=cluster-admin
-
- kubectl -n apps create token headlamp-admin
+kubectl -n apps create token headlamp-admin
+```
diff --git a/MetalLB.md b/MetalLB.md
index 04fa49d..c17f47b 100644
--- a/MetalLB.md
+++ b/MetalLB.md
@@ -1,7 +1,17 @@
+# MetalLB — Manual Setup
+
+MetalLB is **not** included in `bootstrap.sh deploy`. It must be installed
+manually before deploying `traefik-internal` (which depends on the MetalLB
+`loadBalancerClass`).
+
+```bash
 helm repo add metallb https://metallb.github.io/metallb
 helm repo update
-helm search repo metallb
 helm dependency build charts/metallb
-
-
 helm upgrade --install metallb charts/metallb -n kube-system --wait
+```
+
+The chart wraps the upstream MetalLB subchart and adds a custom
+`IPAddressPool` + `L2Advertisement` (defined in `charts/metallb/templates/pool.yaml`).
+The pool assigns a single IP (`10.0.1.250`) with `autoAssign: false`, so only
+services that explicitly request the `internal` pool via annotation get that IP.
diff --git a/README.md b/README.md
index 9c1a058..5087e20 100755
--- a/README.md
+++ b/README.md
@@ -1,6 +1,6 @@
 # Homelab — k3s Cluster

-2-node k3s cluster (1 manager, 1 worker) running a self-hosted homelab stack on `ratboo.me`.
+2-node k3s cluster (1 control-plane, 1 worker) running a self-hosted homelab stack on `ratboo.me`.

 ## Architecture

@@ -8,8 +8,8 @@

 | Node | Role | OS | IP | Runtime |
 |------|------|----|----|---------|
-| **dogbox** | control-plane | Fedora 40 Server | `10.0.1.2` | k3s server + containerd |
-| **mac-worker** | worker | Ubuntu 25.10 (OrbStack VM) | `192.168.139.12` | k3s agent + containerd |
+| **localhost.localdomain** (dogbox) | control-plane | Fedora Linux 43 (Server Edition) | `10.0.1.2` | k3s v1.34.6 + containerd |
+| **lima-mac-worker** | worker | Ubuntu 25.10 (Lima VM on macOS) | `10.0.1.58` | k3s v1.34.6 + containerd |

 ### Overview

@@ -20,62 +20,70 @@
                           *.ratboo.me
                                │
     ┌──────────────────────────┼──────────────────────────┐
-    │                   dogbox (manager)                   │
-    │                 Fedora 40 · 10.0.1.2                 │
-    │                                                      │
-    │  ┌─────────────────┐    ┌──────────────────────┐     │
-    │  │ k3s server      │    │ Traefik (k3s)        │     │
-    │  │ control-plane   │    │ :443 websecure       │     │
-    │  └─────────────────┘    │ Let's Encrypt + CF   │     │
-    │                         └──────────┬───────────┘     │
-    │  ┌─────────────────┐               │                 │
-    │  │ traefik-internal │ Routes to pods across          │
-    │  │ :80 LB 10.0.1.250│   both nodes via CNI           │
-    │  │ (MetalLB L2)     │              │                 │
-    │  └─────────────────┘               │                 │
-    │       Longhorn                     │                 │
-    └──────────────┬─────────────────────┼─────────────────┘
-                   │                     │
-             NFS /dogstore          k3s cluster
-                   │                     │
-    ┌──────────────┴─────────────────────┼─────────────────┐
-    │                 mac-worker (worker)                  │
-    │              Ubuntu 25.10 · OrbStack VM              │
-    │                    192.168.139.12                    │
+    │            localhost.localdomain (dogbox)            │
+    │                 Fedora 43 · 10.0.1.2                 │
     │                                                      │
-    │               Longhorn · workload pods               │
-    └──────────────────────────────────────────────────────┘
+    │  ┌─────────────────┐    ┌──────────────────────┐     │
+    │  │ k3s server      │    │ Traefik (k3s)        │     │
+    │  │ control-plane   │    │ :443 websecure       │     │
+    │  └─────────────────┘    │ Let's Encrypt + CF   │     │
+    │                         └──────────┬───────────┘     │
+    │  ┌──────────────────┐              │                 │
+    │  │ traefik-internal │ Routes to pods across          │
+    │  │ :80/:443 MetalLB │   both nodes via CNI           │
+    │  │ LB 10.0.1.250    │              │                 │
+    │  └──────────────────┘              │                 │
+    └───────────────┬───────────────────┼──────────────────┘
+                    │                   │
+              NFS /dogstore        k3s cluster
+                    │                   │
+    ┌───────────────┴───────────────────┼──────────────────┐
+    │               lima-mac-worker (worker)               │
+    │           Ubuntu 25.10 · Lima VM on macOS            │
+    │                      10.0.1.58                       │
+    │                                                      │
+    │                    workload pods                     │
+    └───────────────────────────────────────────────────────┘
 ```

 ### Networking

-**Public ingress** — k3s bundles Traefik, configured via `HelmChartConfig` in `traefik-config`. TLS terminates at Traefik using Let's Encrypt with Cloudflare DNS-01 challenge. HTTP automatically redirects to HTTPS.
+**Public ingress** — k3s bundles Traefik, configured via `HelmChartConfig` in `traefik-config`. TLS terminates at Traefik using Let's Encrypt with Cloudflare DNS-01 challenge. HTTP automatically redirects to HTTPS. klipper (servicelb) exposes the public Traefik on every node IP.

 | Public hostname | Service |
 |-----------------|---------|
 | `plex.ratboo.me` | Plex |
-| `sonarr.ratboo.me` | Sonarr |
-| `radarr.ratboo.me` | Radarr |
+| `watch.ratboo.me` | Seerr |
 | `paperless.ratboo.me` | Paperless-ngx |
 | `mealie.ratboo.me` | Mealie |
-| `watch.ratboo.me` | Seerr |

-**Internal ingress** — A separate Traefik instance (`traefik-internal`) listens on `10.0.1.250:80`, served by MetalLB L2. A DNS rewrite points `*.internal` to that IP. Internal services use Traefik `IngressRoute` CRDs with `ingressClass: traefik-internal`.
+**Internal ingress** — A separate Traefik instance (`traefik-internal`) listens on `10.0.1.250` (ports 80 and 443), served by MetalLB L2. A DNS rewrite points `*.internal` to that IP. Internal services use Traefik `IngressRoute` CRDs with `ingressClass: traefik-internal`. Every service with a `*-ingressroute.yaml` template gets an `*.dog` hostname on this Traefik.

 | Internal hostname | Service |
 |-------------------|---------|
-| `homepage.rat` | Homepage |
-| `glance.rat` | Glance |
+| `plex.dog` | Plex |
+| `sonarr.dog` | Sonarr |
+| `radarr.dog` | Radarr |
+| `bazarr.dog` | Bazarr |
+| `prowlarr.dog` | Prowlarr |
+| `qbittorrent.dog` | qBittorrent |
+| `seerr.dog` | Seerr |
+| `paperless.dog` | Paperless-ngx |
+| `mealie.dog` | Mealie |
+| `homepage.dog` | Homepage |
+| `glance.dog` | Glance |
 | `headlamp.dog` | Headlamp |
+| `zerobyte.dog` | Zerobyte |

-**Cluster-only (no ingress):** Prowlarr, Bazarr, qBittorrent, Zerobyte.
+**No ingress:** unpackerr (background download-extraction daemon, no web UI).

 ### Storage

 | Mechanism | Use |
 |-----------|-----|
-| **Longhorn** (`storageClass: longhorn`, replica count 2) | Small config/state PVCs — Traefik ACME (128Mi), app configs (1–20Gi), Paperless Postgres/Redis, Mealie data, Seerr, Zerobyte |
-| **NFS via hostPath `/dogstore`** | Large/shared data — Plex media + transcode, Sonarr/Radarr/qBittorrent/unpackerr data trees, Paperless documents, Homepage/Glance configs |
+| **NFS via hostPath `/dogstore`** | Large/shared data — Plex media + transcode, Sonarr/Radarr/qBittorrent/unpackerr data trees, Paperless documents, Homepage/Glance configs, ACME cert storage |
+| **hostPath `/home/alvin/service-data`** | App config directories on dogbox (Seerr, etc.) |
+| **local-path (default StorageClass)** | k3s built-in provisioner for any PVCs (rancher.io/local-path) |

 ### Secrets

@@ -85,24 +93,23 @@ SOPS + age encryption. All secrets live in `secrets/secrets.enc.yaml`, encrypted

 | Namespace | Contents |
 |-----------|----------|
-| `kube-system` | k3s Traefik, `traefik-config` (HelmChartConfig + redirect middleware) |
-| `longhorn-system` | Longhorn storage |
-| `media` | Plex, Sonarr, Radarr, Bazarr, Prowlarr, qBittorrent, unpackerr |
-| `apps` | Mealie, Homepage, Glance, Headlamp, Seerr, Zerobyte, Paperless-ngx + Postgres + Redis |
+| `kube-system` | k3s Traefik + `traefik-config` (HelmChartConfig + redirect middleware), `traefik-internal`, MetalLB controller + speakers, CoreDNS, metrics-server |
+| `media` | Plex, Sonarr, Radarr, Bazarr, Prowlarr, qBittorrent, unpackerr, Seerr |
+| `apps` | Paperless-ngx + Postgres + Redis, Mealie, Homepage, Glance, Headlamp, Zerobyte |

 ## Services

 | Chart | Namespace | Services | Notes |
 |-------|-----------|----------|-------|
-| traefik-config | kube-system | Traefik HelmChartConfig overlay | Cloudflare DNS-01, ACME on Longhorn |
-| traefik-internal | — | Internal Traefik instance | LB via MetalLB at `10.0.1.250` |
-| metallb | — | MetalLB L2 pool | Single-IP pool for internal LB |
-| media | media | Plex, Sonarr, Radarr, Bazarr, Prowlarr, qBittorrent, unpackerr | Media stack with `/dogstore` data paths |
+| traefik-config | kube-system | Traefik HelmChartConfig overlay | Cloudflare DNS-01, ACME on hostPath `/dogstore` |
+| traefik-internal | kube-system | Internal Traefik instance | LB via MetalLB at `10.0.1.250`, ports 80/443/9095 |
+| metallb | kube-system | MetalLB L2 pool | Single-IP pool (`10.0.1.250`) for internal LB |
+| media | media | Plex, Sonarr, Radarr, Bazarr, Prowlarr, qBittorrent, unpackerr, Seerr | Media stack with `/dogstore` data paths |
 | paperless | apps | Paperless-ngx, Redis, PostgreSQL | Postgres 15, Redis 7 |
-| mealie | apps | Mealie (v3.14.0) | Gemini API integration for recipes |
+| mealie | apps | Mealie (v3.16.0) | Gemini API integration for recipes |
 | dashboards | apps | Homepage, Glance | Internal-only via `traefik-internal` |
 | headlamp | apps | Headlamp | K8s dashboard, internal-only via `traefik-internal` |
-| utils | apps | Seerr, Zerobyte | Seerr public, Zerobyte cluster-only |
+| utils | apps | Zerobyte | Backup service, internal-only via `traefik-internal` |

 ## Prerequisites

@@ -126,13 +133,7 @@ This prints the worker join command at the end.
 K3S_URL="https://:6443" K3S_TOKEN="" ./scripts/bootstrap.sh agent
 ```

-### 3. Install Longhorn
-
-```bash
-./scripts/bootstrap.sh longhorn
-```
-
-### 4. Set up SOPS encryption
+### 3. Set up SOPS encryption

 Generate an age keypair (run on each node):

@@ -147,39 +148,53 @@ Copy the public key into `.sops.yaml`, replacing the placeholder. Then encrypt y
 sops -e -i secrets/secrets.enc.yaml
 ```

-### 5. Apply secrets
+### 4. Apply secrets

 ```bash
 ./scripts/bootstrap.sh apply-secrets
 ```

-### 6. Deploy all charts
+### 5. Deploy MetalLB and internal Traefik (manual)
+
+These are deployed separately before the main charts because other services depend on them:
+
+```bash
+helm dependency build charts/metallb
+helm upgrade --install metallb charts/metallb -n kube-system --wait
+helm upgrade --install traefik-internal charts/traefik-internal -n kube-system --wait
+```
+
+### 6. Deploy all application charts

 ```bash
 ./scripts/bootstrap.sh deploy
 ```

+This installs (in order): `traefik-config`, `media`, `paperless`, `mealie`, `dashboards`, `utils`, `headlamp`.
+
 Or deploy individually:

 ```bash
-helm upgrade --install metallb charts/metallb -n kube-system --wait
-helm upgrade --install traefik-internal charts/traefik-internal -n kube-system --wait
 # Traefik config goes in kube-system (managed by k3s)
 helm upgrade --install traefik-config charts/traefik-config -n kube-system

 kubectl create namespace apps
 helm upgrade --install headlamp charts/headlamp -n apps
-
 helm upgrade --install dashboards charts/dashboards -n apps
 helm upgrade --install paperless charts/paperless -n apps
 helm upgrade --install mealie charts/mealie -n apps
+helm upgrade --install utils charts/utils -n apps

 kubectl create namespace media
 helm upgrade --install media charts/media -n media
+```

-helm upgrade --install utils charts/utils -n apps
+### Optional: Install Longhorn

+The bootstrap script includes a Longhorn install command, but it is not currently deployed:
+```bash
+./scripts/bootstrap.sh longhorn
 ```

 ## Verifying
@@ -190,6 +205,7 @@ kubectl get pods -A

 # Check ingress routes
 kubectl get ingress -A
+kubectl get ingressroute -A

 # Test a specific service
 curl -I https://mealie.ratboo.me
@@ -216,13 +232,12 @@ homelab/
 │   ├── traefik-config/     # k3s Traefik overrides (HelmChartConfig)
 │   ├── traefik-internal/   # Separate internal Traefik instance
 │   ├── metallb/            # MetalLB L2 for internal LB IP
-│   ├── media/              # Plex, *arr stack, qBittorrent, unpackerr
+│   ├── media/              # Plex, Sonarr, Radarr, Bazarr, Prowlarr, qBittorrent, unpackerr, Seerr
 │   ├── paperless/          # Paperless-ngx + Postgres + Redis
 │   ├── mealie/             # Mealie recipe manager
 │   ├── dashboards/         # Homepage + Glance (internal only)
 │   ├── headlamp/           # Headlamp K8s dashboard (internal only)
-│   └── utils/              # Seerr + Zerobyte
+│   └── utils/              # Zerobyte backup
 └── secrets/
     └── secrets.enc.yaml
 ```
-