# Homelab — k3s Cluster
2-node k3s cluster (1 control-plane, 1 worker) running a self-hosted homelab stack on `ratboo.me`.
## Architecture
### Nodes
| Node | Role | OS | IP | Runtime |
|------|------|----|----|---------|
| **localhost.localdomain** (dogbox) | control-plane | Fedora Linux 43 (Server Edition) | `10.0.1.2` | k3s v1.34.6 + containerd |
| **lima-mac-worker** | worker | Ubuntu 25.10 (Lima VM on macOS) | `10.0.1.58` | k3s v1.34.6 + containerd |
### Overview
```
                       Internet
                    Cloudflare DNS
                      *.ratboo.me
┌──────────────────────────┼──────────────────────────┐
│           localhost.localdomain (dogbox)            │
│                Fedora 43 · 10.0.1.2                 │
│                                                     │
│  ┌─────────────────┐    ┌──────────────────────┐    │
│  │ k3s server      │    │ Traefik (k3s)        │    │
│  │ control-plane   │    │ :443 websecure       │    │
│  └─────────────────┘    │ Let's Encrypt + CF   │    │
│                         └──────────┬───────────┘    │
│  ┌──────────────────┐              │                │
│  │ traefik-internal │    Routes to pods across      │
│  │ :80/:443 MetalLB │     both nodes via CNI        │
│  │ LB 10.0.1.250    │              │                │
│  └──────────────────┘              │                │
└───────────────┬────────────────────┼────────────────┘
                │                    │
          NFS /dogstore         k3s cluster
                │                    │
┌───────────────┴────────────────────┼────────────────┐
│              lima-mac-worker (worker)               │
│           Ubuntu 25.10 · Lima VM on macOS           │
│                      10.0.1.58                      │
│                                                     │
│                    workload pods                    │
└─────────────────────────────────────────────────────┘
```
### Networking
**Public ingress:** k3s bundles Traefik, configured via a `HelmChartConfig` in the `traefik-config` chart. TLS terminates at Traefik using Let's Encrypt certificates issued through the Cloudflare DNS-01 challenge, and HTTP automatically redirects to HTTPS. ServiceLB (the Klipper load balancer built into k3s) exposes the public Traefik on every node IP.

| Public hostname | Service |
|-----------------|---------|
| `plex.ratboo.me` | Plex |
| `watch.ratboo.me` | Seerr |
| `paperless.ratboo.me` | Paperless-ngx |
| `mealie.ratboo.me` | Mealie |
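
The HTTP-to-HTTPS redirect described above can be spot-checked from any machine with `curl`. The following is a minimal sketch: `redirect_target` is a hypothetical helper name that is not part of this repo. It reads raw response headers on stdin and prints the `Location` value only when the status line is a 3xx redirect.

```bash
# Hypothetical helper: read HTTP response headers on stdin and print the
# Location header, but only if the first line carries a 3xx status.
redirect_target() {
  awk '
    { sub(/\r$/, "") }                                   # strip CR from CRLF endings
    NR == 1 && $2 !~ /^3[0-9][0-9]$/ { bad = 1; exit }   # first line must be 3xx
    tolower($1) == "location:" { print $2; found = 1; exit }
    END { exit (bad || !found) }
  '
}
```

Typical use would be `curl -sI http://mealie.ratboo.me | redirect_target`, which should print an `https://` URL if the redirect middleware is active.
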

**Internal ingress:** A separate Traefik instance (`traefik-internal`) listens on `10.0.1.250` (ports 80 and 443), an address announced by MetalLB in L2 mode. A DNS rewrite points `*.dog` to that IP. Internal services use Traefik `IngressRoute` CRDs with `ingressClass: traefik-internal`. Every service with a `*-ingressroute.yaml` template gets a `*.dog` hostname on this Traefik.

| Internal hostname | Service |
|-------------------|---------|
| `plex.dog` | Plex |
| `sonarr.dog` | Sonarr |
| `radarr.dog` | Radarr |
| `bazarr.dog` | Bazarr |
| `prowlarr.dog` | Prowlarr |
| `qbittorrent.dog` | qBittorrent |
| `seerr.dog` | Seerr |
| `paperless.dog` | Paperless-ngx |
| `mealie.dog` | Mealie |
| `homepage.dog` | Homepage |
| `glance.dog` | Glance |
| `headlamp.dog` | Headlamp |
| `zerobyte.dog` | Zerobyte |

**No ingress:** unpackerr (background download-extraction daemon, no web UI).
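
To make the internal routing concrete, here is a rough sketch of what one `*-ingressroute.yaml` template might render to. The chart templates themselves are not shown in this README, so several details are assumptions: the `traefik.io/v1alpha1` API group (older Traefik releases use `traefik.containo.us/v1alpha1`), the `websecure` entry point name, the class being selected through the `kubernetes.io/ingress.class` annotation, and the Service sharing the app's name. `render_ingressroute` is a hypothetical helper.

```bash
# Hypothetical helper: print an IngressRoute that routes <name>.dog to the
# Service <name> on <port> in <namespace>. Field values are assumptions.
render_ingressroute() (
  name=$1 namespace=$2 port=$3
  cat <<EOF
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: ${name}-internal
  namespace: ${namespace}
  annotations:
    kubernetes.io/ingress.class: traefik-internal
spec:
  entryPoints:
    - websecure
  routes:
    - match: Host(\`${name}.dog\`)
      kind: Rule
      services:
        - name: ${name}
          port: ${port}
EOF
)
```

For example, `render_ingressroute sonarr media 8989 | kubectl apply -f -` would apply the manifest; the port here is only illustrative.
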
### Storage
| Mechanism | Use |
|-----------|-----|
| **NFS via hostPath `/dogstore`** | Large/shared data — Plex media + transcode, Sonarr/Radarr/qBittorrent/unpackerr data trees, Paperless documents, Homepage/Glance configs, ACME cert storage |
| **hostPath `/home/alvin/service-data`** | App config directories on dogbox (Seerr, etc.) |
| **local-path (default StorageClass)** | k3s built-in provisioner for any PVCs (rancher.io/local-path) |
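
Because pods reach `/dogstore` through `hostPath`, Kubernetes does not verify what is behind that path. If the NFS mount is missing on a node that is expected to mount it, a pod would likely read or write the node's local directory instead. A small check along these lines can catch that; `is_nfs_mount` is a hypothetical helper that reads mount-table text in the `/proc/mounts` column layout.

```bash
# Hypothetical helper: succeed if <path> is listed as an NFS mount point in
# mounts-format input (columns: device, mount point, fs type, options, ...).
is_nfs_mount() {
  awk -v mp="$1" '$2 == mp && $3 ~ /^nfs4?$/ { found = 1 } END { exit !found }'
}
```

On a node that mounts the share over NFS, `is_nfs_mount /dogstore < /proc/mounts` should succeed.
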
### Secrets
SOPS + age encryption. All secrets live in `secrets/secrets.enc.yaml`, encrypted at rest. The age key lives at `/etc/sops/age/keys.txt` on each node. Referenced secrets include Cloudflare API tokens, database passwords, Plex claim tokens, and application API keys.
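
Since the same file path holds plaintext while it is being edited and ciphertext afterwards, a guard that confirms the file is encrypted can be useful before committing. The sketch below relies on the top-level `sops:` metadata block that SOPS adds to encrypted YAML; `is_sops_encrypted` is a hypothetical helper, not something the repo provides.

```bash
# Hypothetical helper: succeed if the YAML file contains the top-level
# "sops:" metadata block that SOPS writes when it encrypts a file.
is_sops_encrypted() {
  grep -q '^sops:' "$1"
}
```

A possible use is `is_sops_encrypted secrets/secrets.enc.yaml || echo "secrets file is not encrypted" >&2`.
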
## Namespaces
| Namespace | Contents |
|-----------|----------|
| `kube-system` | k3s Traefik + `traefik-config` (HelmChartConfig + redirect middleware), `traefik-internal`, MetalLB controller + speakers, CoreDNS, metrics-server |
| `media` | Plex, Sonarr, Radarr, Bazarr, Prowlarr, qBittorrent, unpackerr, Seerr |
| `apps` | Paperless-ngx + Postgres + Redis, Mealie, Homepage, Glance, Headlamp, Zerobyte |
## Services
| Chart | Namespace | Services | Notes |
|-------|-----------|----------|-------|
| traefik-config | kube-system | Traefik HelmChartConfig overlay | Cloudflare DNS-01, ACME on hostPath `/dogstore` |
| traefik-internal | kube-system | Internal Traefik instance | LB via MetalLB at `10.0.1.250`, ports 80/443/9095 |
| metallb | kube-system | MetalLB L2 pool | Single-IP pool (`10.0.1.250`) for internal LB |
| media | media | Plex, Sonarr, Radarr, Bazarr, Prowlarr, qBittorrent, unpackerr, Seerr | Media stack with `/dogstore` data paths |
| paperless | apps | Paperless-ngx, Redis, PostgreSQL | Postgres 15, Redis 7 |
| mealie | apps | Mealie (v3.16.0) | Gemini API integration for recipes |
| dashboards | apps | Homepage, Glance | Internal-only via `traefik-internal` |
| headlamp | apps | Headlamp | K8s dashboard, internal-only via `traefik-internal` |
| utils | apps | Zerobyte | Backup service, internal-only via `traefik-internal` |
## Prerequisites
- Two Linux machines with NFS `/dogstore` mounted on both
- `curl`, `helm`, `kubectl`, `sops`, `age` installed
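
The tool list above can be checked in one pass before starting the bootstrap. This is a minimal sketch; `missing_tools` is a hypothetical helper name.

```bash
# Hypothetical helper: print each named tool that is not on PATH.
# Exits non-zero if at least one tool is missing.
missing_tools() (
  rc=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      rc=1
    fi
  done
  exit "$rc"
)
```

Running `missing_tools curl helm kubectl sops age` prints nothing and succeeds when everything is installed.
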
## Bootstrap
### 1. Install k3s server (manager node)
```bash
./scripts/bootstrap.sh server
```
This prints the worker join command at the end.
### 2. Install k3s agent (worker node)
```bash
K3S_URL="https://<manager-ip>:6443" K3S_TOKEN="<token>" ./scripts/bootstrap.sh agent
```
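
If the join command printed in step 1 has been lost, it can be rebuilt on the server. The internals of `bootstrap.sh` are not shown here, so this is only a sketch: it assumes the token sits in the default k3s location, `/var/lib/rancher/k3s/server/node-token`, and `print_join_command` is a hypothetical helper.

```bash
# Hypothetical helper: print the agent join command for a server IP.
# The token file defaults to the standard k3s server path (assumption).
print_join_command() (
  server_ip=$1
  token_file=${2:-/var/lib/rancher/k3s/server/node-token}
  token=$(cat "$token_file") || exit 1
  printf 'K3S_URL="https://%s:6443" K3S_TOKEN="%s" ./scripts/bootstrap.sh agent\n' \
    "$server_ip" "$token"
)
```

On the control-plane node this would be run as `print_join_command 10.0.1.2`, usually with root privileges because the token file is not world-readable.
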
### 3. Set up SOPS encryption
Generate an age keypair (run on each node):
```bash
./scripts/bootstrap.sh sops-keygen
```
Copy the public key into `.sops.yaml`, replacing the placeholder. Then encrypt your secrets:
```bash
# Edit secrets/secrets.enc.yaml — replace REPLACE_WITH_* placeholders with real values
sops -e -i secrets/secrets.enc.yaml
```
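
The public key needed for `.sops.yaml` can be read back from the key file, because `age-keygen` records it in a `# public key:` comment. The sketch below also prints what a minimal `.sops.yaml` might look like; the repo's actual file and its `path_regex` may differ, and both helper names are hypothetical.

```bash
# Hypothetical helper: extract the public key from an age key file, where
# age-keygen stores it in a "# public key: age1..." comment line.
age_public_key() {
  sed -n 's/^# public key: //p' "$1"
}

# Hypothetical helper: print a minimal .sops.yaml that encrypts files under
# secrets/ to the given age recipient(s), comma-separated if several.
render_sops_config() {
  cat <<EOF
creation_rules:
  - path_regex: secrets/.*\.enc\.yaml\$
    age: $1
EOF
}
```

For comparison with the checked-in file, `render_sops_config "$(age_public_key /etc/sops/age/keys.txt)"` prints the rule for the local key.
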
### 4. Apply secrets
```bash
./scripts/bootstrap.sh apply-secrets
```
### 5. Deploy MetalLB and internal Traefik (manual)
These are deployed separately before the main charts because other services depend on them:
```bash
helm dependency build charts/metallb
helm upgrade --install metallb charts/metallb -n kube-system --wait
helm upgrade --install traefik-internal charts/traefik-internal -n kube-system --wait
```
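
For orientation, a single-address L2 pool in MetalLB is usually expressed as an `IPAddressPool` plus an `L2Advertisement`. The contents of `charts/metallb` are not shown in this README, so the sketch below is an assumption about its general shape; the resource names are placeholders and `render_metallb_pool` is a hypothetical helper.

```bash
# Hypothetical helper: print a one-address MetalLB pool and the
# L2Advertisement that announces it. Resource names are placeholders.
render_metallb_pool() (
  ip=$1 namespace=${2:-kube-system}
  cat <<EOF
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: internal-pool
  namespace: ${namespace}
spec:
  addresses:
    - ${ip}/32
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: internal-l2
  namespace: ${namespace}
spec:
  ipAddressPools:
    - internal-pool
EOF
)
```

`render_metallb_pool 10.0.1.250` prints the pair for the internal load balancer address, which can be compared against `kubectl get ipaddresspools -A -o yaml`.
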
### 6. Deploy all application charts
```bash
./scripts/bootstrap.sh deploy
```
This installs (in order): `traefik-config`, `media`, `paperless`, `mealie`, `dashboards`, `utils`, `headlamp`.
Or deploy individually:
```bash
# Traefik config goes in kube-system (managed by k3s)
helm upgrade --install traefik-config charts/traefik-config -n kube-system
kubectl create namespace apps
helm upgrade --install headlamp charts/headlamp -n apps
helm upgrade --install dashboards charts/dashboards -n apps
helm upgrade --install paperless charts/paperless -n apps
helm upgrade --install mealie charts/mealie -n apps
helm upgrade --install utils charts/utils -n apps
kubectl create namespace media
helm upgrade --install media charts/media -n media
```
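
The documented install order and the chart-to-namespace mapping can be captured in a few lines. This sketch prints the commands instead of running them, and it uses Helm's `--create-namespace` flag in place of the separate `kubectl create namespace` calls. Whether `bootstrap.sh deploy` works this way is not confirmed here; `chart_namespace` and `deploy_plan` are hypothetical helpers.

```bash
# Hypothetical helper: map each chart to the namespace it is installed into.
chart_namespace() {
  case $1 in
    traefik-config) echo kube-system ;;
    media)          echo media ;;
    paperless|mealie|dashboards|utils|headlamp) echo apps ;;
    *) echo "unknown chart: $1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: print (not run) the helm commands in the documented order.
deploy_plan() {
  for chart in traefik-config media paperless mealie dashboards utils headlamp; do
    ns=$(chart_namespace "$chart") || return 1
    echo "helm upgrade --install $chart charts/$chart -n $ns --create-namespace"
  done
}
```

Reviewing the output of `deploy_plan` first and then piping it to `sh -e` keeps the run reproducible and stops at the first failure.
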
### Optional: Install Longhorn
The bootstrap script includes a Longhorn install command, but Longhorn is not currently deployed on this cluster:
```bash
./scripts/bootstrap.sh longhorn
```
## Verifying
```bash
# Check all pods
kubectl get pods -A
# Check ingress routes
kubectl get ingress -A
kubectl get ingressroute -A
# Test a specific service
curl -I https://mealie.ratboo.me
```
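
On a cluster with many pods, the full listing is easier to read once healthy pods are filtered out. The sketch below assumes the default column layout of `kubectl get pods -A --no-headers` (NAMESPACE, NAME, READY, STATUS, ...); `unhealthy_pods` is a hypothetical helper.

```bash
# Hypothetical helper: from `kubectl get pods -A --no-headers` output, keep
# pods that are not Completed and are either not Running or not fully ready.
unhealthy_pods() {
  awk '{
    split($3, ready, "/")              # READY column looks like "1/1"
    if ($4 == "Completed") next
    if ($4 != "Running" || ready[1] != ready[2])
      print $1 "/" $2 ": " $4 " (" $3 ")"
  }'
}
```

Usage would be `kubectl get pods -A --no-headers | unhealthy_pods`; empty output means every pod is either Completed or Running with all containers ready.
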
## Secret Rotation
1. Decrypt: `sops secrets/secrets.enc.yaml` (opens in `$EDITOR`)
2. Change the values
3. Save and close (SOPS re-encrypts automatically)
4. Apply: `./scripts/bootstrap.sh apply-secrets`
5. Restart affected pods: `kubectl rollout restart deployment/<name> -n <namespace>`
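
Step 5 can be wrapped so that each restart is followed by a wait for the rollout to complete. This is a minimal sketch: `restart_and_wait` is a hypothetical helper, the 120-second timeout is an arbitrary choice, and the `KUBECTL` override exists only to allow a dry run.

```bash
# Hypothetical helper: restart each named deployment in <namespace> and wait
# for its rollout. Set KUBECTL=echo to print the commands instead.
restart_and_wait() (
  namespace=$1
  shift
  for deploy in "$@"; do
    ${KUBECTL:-kubectl} rollout restart "deployment/$deploy" -n "$namespace" || exit 1
    ${KUBECTL:-kubectl} rollout status "deployment/$deploy" -n "$namespace" --timeout=120s || exit 1
  done
)
```

For example, `restart_and_wait apps mealie` restarts one deployment, and `KUBECTL=echo restart_and_wait apps mealie` shows what would run.
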
## Repo Structure
```
homelab/
├── README.md
├── AGENTS.md
├── .sops.yaml
├── scripts/
│ └── bootstrap.sh
├── charts/
│ ├── traefik-config/ # k3s Traefik overrides (HelmChartConfig)
│ ├── traefik-internal/ # Separate internal Traefik instance
│ ├── metallb/ # MetalLB L2 for internal LB IP
│ ├── media/ # Plex, Sonarr, Radarr, Bazarr, Prowlarr, qBittorrent, unpackerr, Seerr
│ ├── paperless/ # Paperless-ngx + Postgres + Redis
│ ├── mealie/ # Mealie recipe manager
│ ├── dashboards/ # Homepage + Glance (internal only)
│ ├── headlamp/ # Headlamp K8s dashboard (internal only)
│ └── utils/ # Zerobyte backup
└── secrets/
└── secrets.enc.yaml
```