Swarm Cluster
Highly Available Docker Swarm Design
In the design described below, our "private cloud" platform is:
- Highly-available (can tolerate the failure of a single component)
- Scalable (can add resource or capacity as required)
- Portable (run it on your garage server today, run it in AWS tomorrow)
- Secure (access protected with LetsEncrypt certificates and optional OIDC with 2FA)
- Automated (requires minimal care and feeding)
Design Decisions
Where possible, services will be highly available.** This means that:
- At least 3 docker swarm manager nodes are required, to provide fault-tolerance of a single failure.
- Ceph is employed for share storage, because it too can be made tolerant of a single failure.
note
An exception to the 3-nodes decision is running a single-node configuration. If you only have one node, then obviously your swarm is only as resilient as that node. It's still a perfectly valid swarm configuration, ideal for starting your self-hosting journey. In fact, under the single-node configuration, you don't need ceph either, and you can simply use the local volume on your host for storage. You'll be able to migrate to ceph/more nodes if/when you expand.
Where multiple solutions to a requirement exist, preference will be given to the most portable solution.
This means that:
- Services are defined using docker-compose v3 YAML syntax
- Services are portable, meaning a particular stack could be shut down and moved to a new provider with minimal effort.
Network Flows
- HTTP (TCP 80) : Redirects to https
- HTTPS (TCP 443) : Serves individual docker containers via SSL-encrypted reverse proxy