feat(infra): k3s foundation — cert-manager, Longhorn config, in-cluster data tier
ci / typecheck (map[dir:apps/website name:website]) (push) Failing after 10m58s
ci / typecheck (map[dir:apps/portal name:portal]) (push) Failing after 11m56s
ci / typecheck (map[dir:apps/booking name:booking]) (push) Failing after 14m0s
ci / typecheck (map[dir:services/platform-api name:platform-api]) (push) Has been cancelled
ci / test (push) Has been cancelled
Adds the production cluster foundation (authored + applied live on node1):

- cert-manager via the k3s HelmChart controller + letsencrypt staging/prod
  ClusterIssuers (HTTP-01 / Traefik).
- Longhorn config for single-node (values: replica=1, default StorageClass,
  Retain) + backup-to-Hetzner-Object-Storage credential template.
- In-cluster data tier (dezky-data): Postgres 16 (with Authentik+OCIS DB init),
  MongoDB 7, Redis 7 as StatefulSets on Longhorn, + secret template.
- bootstrap.sh: install open-iscsi/nfs-common + enable iscsid (Longhorn prereq).
- RUNBOOK.md: full reproducible node1 build order.

Real secrets are generated on-box and kept in Bitwarden, never in git.
@@ -0,0 +1,29 @@
# fleet/cert-manager: TLS for the cluster

cert-manager + ACME ClusterIssuers. Installs via the **k3s built-in Helm
controller** (no Helm CLI needed), then defines `letsencrypt-staging` and
`letsencrypt-prod` (HTTP-01 through the bundled Traefik).

## Apply order (matters: issuers need the CRDs first)

```bash
# 1) Install cert-manager
kubectl apply -f cert-manager.yaml

# 2) Wait until it's up (CRDs + webhook ready)
kubectl -n cert-manager rollout status deploy/cert-manager-webhook --timeout=180s
kubectl -n cert-manager get pods

# 3) Create the issuers
kubectl apply -f cluster-issuer.yaml
kubectl get clusterissuer   # both should report READY=True
```
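
Applying `cert-manager.yaml` only creates the HelmChart resource; the install itself runs asynchronously, so `rollout status` may report NotFound if it runs before the webhook Deployment exists. The steps above could be made more patient along these lines. This is a minimal sketch: `retry` and `install_cert_manager` are hypothetical helper names, and the attempt counts and timeouts are assumptions to tune.

```bash
# retry ATTEMPTS DELAY CMD...: run CMD until it succeeds, at most ATTEMPTS times.
retry() {
  attempts="$1"; delay="$2"; shift 2
  i=1
  until "$@" >/dev/null 2>&1; do
    if [ "$i" -ge "$attempts" ]; then
      echo "gave up after $attempts attempts: $*" >&2
      return 1
    fi
    i=$((i + 1))
    sleep "$delay"
  done
}

# Hypothetical wrapper around the three steps of the apply order.
install_cert_manager() {
  kubectl apply -f cert-manager.yaml || return 1
  # Poll until the Helm install has created the webhook Deployment.
  retry 60 5 kubectl -n cert-manager get deploy/cert-manager-webhook || return 1
  kubectl -n cert-manager rollout status deploy/cert-manager-webhook --timeout=180s || return 1
  # The webhook can still reject requests briefly after it reports ready,
  # so the issuer apply is retried a few times as well.
  retry 10 5 kubectl apply -f cluster-issuer.yaml || return 1
  # Block until both issuers report the Ready condition.
  kubectl wait --for=condition=Ready clusterissuer/letsencrypt-staging --timeout=120s || return 1
  kubectl wait --for=condition=Ready clusterissuer/letsencrypt-prod --timeout=120s
}
```

Sourcing the file and running `install_cert_manager` from the directory that holds both manifests performs the same sequence without manual waiting.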

## Notes
- ACME email is `info@dezky.eu`; change it in `cluster-issuer.yaml` if needed.
- **Test with `letsencrypt-staging` first** (set the Ingress annotation
  `cert-manager.io/cluster-issuer: letsencrypt-staging`) to avoid burning the
  strict prod rate limits, then switch the apps to `letsencrypt-prod`.
- HTTP-01 requires each hostname's DNS A record to point to `46.4.78.187` and
  port 80 to be open (already true). A cert won't issue until DNS resolves.
- The app Ingresses (`fleet/apps/`) already reference `letsencrypt-prod`.
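
Because the app Ingresses already point at `letsencrypt-prod`, a staging test means temporarily overriding the annotation on a live Ingress. One way to do that is sketched below. The helper names `issuer_annotation` and `set_issuer` are hypothetical; only the two issuer names and the annotation key come from this directory.

```bash
# Print the annotation for one of the two issuers defined in cluster-issuer.yaml.
issuer_annotation() {
  case "$1" in
    letsencrypt-staging|letsencrypt-prod)
      printf 'cert-manager.io/cluster-issuer=%s\n' "$1" ;;
    *)
      echo "unknown issuer: $1" >&2
      return 1 ;;
  esac
}

# set_issuer NAMESPACE INGRESS ISSUER: point an existing Ingress at an issuer.
set_issuer() {
  annotation="$(issuer_annotation "$3")" || return 1
  kubectl -n "$1" annotate ingress "$2" "$annotation" --overwrite
}
```

For example, `set_issuer my-namespace my-app letsencrypt-staging` (placeholder namespace and Ingress name), then watch `kubectl -n my-namespace get certificate` until it reports READY, then run the same command with `letsencrypt-prod`. Re-applying the manifests from `fleet/apps/` would also restore the prod annotation.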
@@ -0,0 +1,37 @@
# cert-manager, installed via the k3s built-in Helm controller
# (helm.cattle.io/v1). k3s watches HelmChart resources in any namespace and
# runs a `helm install` Job for them; no Helm CLI needed on your laptop.
#
# The chart installs its own CRDs (crds.enabled=true). Apply this first and
# wait for the cert-manager pods to be Running/Ready before applying the
# ClusterIssuers (cluster-issuer.yaml); the issuers need the CRDs + webhook.
apiVersion: helm.cattle.io/v1
kind: HelmChart
metadata:
  name: cert-manager
  namespace: kube-system
spec:
  repo: https://charts.jetstack.io
  chart: cert-manager
  # Pin a version; bump to the latest stable when you upgrade.
  version: v1.16.2
  targetNamespace: cert-manager
  createNamespace: true
  valuesContent: |-
    crds:
      enabled: true
    # Single-node box: keep the footprint modest.
    resources:
      requests:
        cpu: 10m
        memory: 64Mi
    webhook:
      resources:
        requests:
          cpu: 10m
          memory: 32Mi
    cainjector:
      resources:
        requests:
          cpu: 10m
          memory: 64Mi
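
Since the chart version is pinned, it can be useful to confirm after an upgrade that the running controller matches the pin. The following is a rough sketch under two assumptions: the controller Deployment is named `cert-manager` (the chart's default when the release is called `cert-manager`), and its image is referenced by tag rather than by digest. `image_tag` and `check_cert_manager_version` are hypothetical helper names.

```bash
# Print the tag portion of an image reference (everything after the last ':').
image_tag() {
  printf '%s\n' "${1##*:}"
}

# check_cert_manager_version EXPECTED: compare the running controller's image
# tag with the version pinned in the HelmChart (for example v1.16.2).
check_cert_manager_version() {
  image="$(kubectl -n cert-manager get deploy cert-manager \
    -o jsonpath='{.spec.template.spec.containers[0].image}')" || return 1
  tag="$(image_tag "$image")"
  if [ "$tag" = "$1" ]; then
    echo "cert-manager is running $tag"
  else
    echo "expected $1 but found $tag" >&2
    return 1
  fi
}
```

Running `check_cert_manager_version v1.16.2` after the install Job finishes gives a quick confirmation that the pin took effect.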
@@ -0,0 +1,43 @@
# ACME ClusterIssuers (HTTP-01 via the k3s-bundled Traefik ingress).
#
# Apply ONLY after cert-manager is Running:
#   kubectl -n cert-manager rollout status deploy/cert-manager-webhook
#
# Two issuers:
#   - letsencrypt-staging : use while testing (generous rate limits, UNTRUSTED
#     certs). Point an Ingress at this first to prove the HTTP-01 flow works.
#   - letsencrypt-prod    : the real one the app Ingresses reference. Switch to
#     it once staging issues cleanly, to avoid burning Let's Encrypt's strict
#     prod rate limits on misconfigurations.
#
# HTTP-01 needs the hostname to resolve to this box (DNS A record -> 46.4.78.187)
# and port 80 reachable; both are already true (firewall opens 80 to the world).
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-staging
spec:
  acme:
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    email: info@dezky.eu
    privateKeySecretRef:
      name: letsencrypt-staging-account-key
    solvers:
      - http01:
          ingress:
            class: traefik
---
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: info@dezky.eu
    privateKeySecretRef:
      name: letsencrypt-prod-account-key
    solvers:
      - http01:
          ingress:
            class: traefik
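
The DNS requirement in the header comment can be checked before an Ingress is pointed at either issuer. The sketch below assumes `dig` is installed (it is not part of this commit) and uses hypothetical helper names `resolves_to` and `check_host`; the IP is the one given in the comment.

```bash
# resolves_to EXPECTED_IP ANSWERS: succeed when any line of ANSWERS is exactly
# EXPECTED_IP. ANSWERS is newline-separated resolver output.
resolves_to() {
  printf '%s\n' "$2" | grep -Fxq "$1"
}

# check_host HOSTNAME: look up the A records and compare them with the node IP.
check_host() {
  # `dig +short A` prints one answer per line.
  answers="$(dig +short A "$1")"
  if resolves_to "46.4.78.187" "$answers"; then
    echo "$1 resolves to the node; HTTP-01 can reach it"
  else
    echo "$1 does not resolve to 46.4.78.187 yet" >&2
    return 1
  fi
}
```

`check_host app.example.com` (a placeholder hostname) returns non-zero until the A record has propagated, which makes it usable as a guard in a deploy script.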