Target environment¶
This chapter describes the exact stack we deploy onto and the reasoning behind each piece. The constraints these choices impose are covered in the next chapter.
The cluster: k3s on Hetzner via Terraform¶
We provision the cluster with the terraform-hcloud-kube-hetzner module. It is a well-maintained, opinionated way to run Kubernetes cheaply on Hetzner Cloud. Key facts:
- k3s: a certified, lightweight Kubernetes distribution. It tracks upstream Kubernetes closely; the module currently targets a recent release (k3s ~v1.35, i.e. Kubernetes 1.35), which is comfortably new enough for every feature in this guide. Verify yours with kubectl version.
- openSUSE MicroOS: an immutable, auto-updating container OS. The root filesystem is read-only and updates are transactional with automatic rollback.
- Auto-upgrades: node OS reboots are coordinated by Kured, and k3s upgrades by the system-upgrade-controller. This is a feature, but for a stateful database it has real consequences (covered in the next chapter).
flowchart TB
tf["kube.tf (Terraform/OpenTofu)"] -->|terraform apply| hc["Hetzner Cloud API"]
hc --> n1["Node 1 (amd64, MicroOS)"]
hc --> n2["Node 2 (amd64, MicroOS)"]
hc --> n3["Node 3 (amd64, MicroOS)"]
n1 & n2 & n3 --> k3s["k3s cluster"]
k3s --> addons["cert-manager + Longhorn<br/>(installed by the module)"]
Our cluster is three amd64 nodes in a single Hetzner location. Running the same architecture everywhere (no x86/ARM mix) and staying in one location keep storage simple; both reasons are explained in Constraints & decisions.
Storage: Longhorn on node storage¶
The module can give you either Hetzner's block-storage CSI driver or Longhorn, a distributed block storage system for Kubernetes. We chose Longhorn, configured to use node-local storage (fast) rather than attached Hetzner volumes (slower). The module's own example file recommends exactly this for databases.
Why Longhorn and not the Hetzner CSI?
- The Hetzner CSI driver does not support volume snapshots, removing a useful recovery option.
- Longhorn does support CSI volume snapshots and backups to S3-compatible object storage (including R2), giving us an independent safety net.
We deliberately keep Longhorn's redundancy low (1 replica) for the database, because PostgreSQL already keeps its own copies. The reasoning — and the trade-off — is in the next chapter.
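As a rough illustration of the single-replica choice, a dedicated StorageClass for the database could look like the sketch below. The name longhorn-postgres matches the one used in the layer diagram later in this chapter; the reclaim policy, data-locality setting, and filesystem are assumptions for illustration, not settings this guide has confirmed.

```yaml
# Sketch only: a Longhorn StorageClass for PostgreSQL volumes.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-postgres
provisioner: driver.longhorn.io
allowVolumeExpansion: true
reclaimPolicy: Retain            # assumption: keep the volume if the PVC is deleted
parameters:
  numberOfReplicas: "1"          # PostgreSQL replication provides the extra copies
  dataLocality: "best-effort"    # assumption: try to keep the replica on the pod's node
  staleReplicaTimeout: "30"      # minutes; value is an assumption
  fsType: "ext4"                 # assumption
```

With one Longhorn replica per volume, losing a node loses that volume's data, so redundancy rests on the PostgreSQL replicas and the object-store backups.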
TLS plumbing: cert-manager¶
cert-manager is the standard Kubernetes tool for issuing and renewing TLS certificates. We need it for one specific reason: the Barman Cloud Plugin requires cert-manager to secure the TLS channel between the plugin and the operator. The kube-hetzner module can install cert-manager for us, so it becomes part of Layer 1 rather than a manual step.
Backups: Cloudflare R2¶
Cloudflare R2 is S3-compatible object storage with no egress fees. CloudNativePG (via the Barman Cloud Plugin) writes base backups and archived WAL to it.
flowchart LR
pg["PostgreSQL Cluster"] -->|base backup + WAL| r2[("Cloudflare R2 bucket")]
r2 -->|restore / PITR| new["A fresh Cluster<br/>(bootstrap from backup)"]
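The base-backup arrow in the diagram is typically driven by a ScheduledBackup resource. The following is a minimal sketch, assuming a Cluster named pg-main (a hypothetical name) that already has the Barman Cloud Plugin configured; the schedule is also an assumption.

```yaml
# Sketch only: nightly base backup through the Barman Cloud Plugin.
apiVersion: postgresql.cnpg.io/v1
kind: ScheduledBackup
metadata:
  name: pg-main-nightly          # hypothetical name
spec:
  schedule: "0 0 2 * * *"        # six-field cron (seconds first): 02:00 every day
  cluster:
    name: pg-main                # hypothetical Cluster name
  method: plugin
  pluginConfiguration:
    name: barman-cloud.cloudnative-pg.io
```

WAL archiving is continuous and is configured on the Cluster itself, so this resource only covers the periodic base backups.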
R2 caveat, stated up front
R2 works with the Barman Cloud Plugin but needs a known workaround for an S3 checksum incompatibility, and there is a reported failure restoring from R2 with the plugin. We use it, but we prove restore works before relying on it, and we keep Longhorn's own R2 backups as a fallback. Details in Disaster recovery.
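To make the workaround concrete, here is a sketch of an ObjectStore pointing at R2. It assumes the checksum incompatibility is the one introduced by newer AWS SDK defaults, which is commonly relaxed with the two environment variables shown; the resource name, Secret name, and Secret keys are hypothetical, and the bucket and account placeholders must be filled in.

```yaml
# Sketch only: ObjectStore for Cloudflare R2 with relaxed S3 checksum behaviour.
apiVersion: barmancloud.cnpg.io/v1
kind: ObjectStore
metadata:
  name: r2-store                              # hypothetical name
spec:
  configuration:
    destinationPath: s3://<bucket-name>/      # placeholder
    endpointURL: https://<account-id>.r2.cloudflarestorage.com   # placeholder
    s3Credentials:
      accessKeyId:
        name: r2-credentials                  # hypothetical Secret
        key: ACCESS_KEY_ID
      secretAccessKey:
        name: r2-credentials
        key: SECRET_ACCESS_KEY
  instanceSidecarConfiguration:
    env:
      # Assumed workaround: only send and validate checksums when required.
      - name: AWS_REQUEST_CHECKSUM_CALCULATION
        value: when_required
      - name: AWS_RESPONSE_CHECKSUM_VALIDATION
        value: when_required
```

Treat this as a starting point rather than a verified configuration: a test restore is still the only proof that backups written this way are usable.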
The three-layer model (how it all fits)¶
flowchart TB
subgraph L1["Layer 1 — Platform (Terraform / kube.tf)"]
nodes["3 nodes + k3s"]
cm["cert-manager"]
lh["Longhorn + longhorn-postgres StorageClass"]
end
subgraph L2["Layer 2 — Postgres stack (kubectl, then manifests)"]
oper["CNPG operator"]
bcp["Barman Cloud Plugin"]
icat["ImageCatalog"]
os["ObjectStore (R2)"]
clu["Cluster + Pooler + ScheduledBackup + NetworkPolicy"]
end
subgraph L3["Layer 3 — Full IaC (Kustomize / GitOps)"]
fold["Layer 2 folded into code"]
boot["Bootstrap-from-R2 on fresh clusters"]
end
L1 --> L2 --> L3
- Layer 1 is declared in kube.tf from day one, so every fresh cluster comes up with storage and cert-manager ready.
- Layer 2 we do by hand with kubectl while learning, one resource at a time, so each piece is understood.
- Layer 3 folds Layer 2 into the module's extra-manifests (or a GitOps tool) so the whole stack, including a restore from R2, comes up from nothing.
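One way to picture Layer 3 is a Kustomize entry point that lists the Layer 2 resources as files. The sketch below assumes one manifest per resource; all file names are hypothetical, and the operator and plugin are assumed to be installed beforehand so that their CRDs exist when these manifests are applied.

```yaml
# Sketch only: kustomization.yaml collecting the Layer 2 manifests.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - imagecatalog.yaml       # hypothetical file names throughout
  - objectstore.yaml
  - cluster.yaml
  - pooler.yaml
  - scheduledbackup.yaml
  - networkpolicy.yaml
```

Keeping one file per resource mirrors the order in which Layer 2 is learned by hand, which makes the step from kubectl to code a mostly mechanical one.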
Your daily workflow (and why the layers help)¶
You plan to create the cluster each morning and destroy it each night.
The layer split is what makes that bearable: Layer 1 is one terraform apply,
and once Layer 2 becomes Layer 3, the entire database returns — data included —
without manual steps. Until then, the guide's Layer 2 chapters are the
repeatable "muscle memory" part.
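The "data included" part relies on bootstrapping the new Cluster from the backups in R2. A minimal sketch of such a Cluster follows, assuming an ObjectStore named r2-store and a previous cluster that archived under the server name pg-main; both names, the instance count, and the volume size are hypothetical.

```yaml
# Sketch only: a fresh Cluster that restores from R2, then archives under its own name.
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: pg-main-restored             # hypothetical; differs from the source server name
spec:
  instances: 3                       # assumption
  storage:
    storageClass: longhorn-postgres
    size: 10Gi                       # assumption
  bootstrap:
    recovery:
      source: origin                 # refers to the externalClusters entry below
  externalClusters:
    - name: origin
      plugin:
        name: barman-cloud.cloudnative-pg.io
        parameters:
          barmanObjectName: r2-store # hypothetical ObjectStore name
          serverName: pg-main        # hypothetical: name the old cluster archived under
  plugins:
    - name: barman-cloud.cloudnative-pg.io
      isWALArchiver: true
      parameters:
        barmanObjectName: r2-store
```

The new cluster uses a different name from the source so that its WAL archive lands in a separate path in the bucket instead of colliding with the archive it is restoring from.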
Where to go deeper¶
- kube-hetzner README & docs
- Longhorn documentation
- Cloudflare R2 S3 API compatibility
- cert-manager installation
Next: Constraints & decisions — the sharp edges of this environment and how they shaped the build.