Terraform script to set up a Docker Swarm on AWS
EPL-2.0 License
This is a Terraform configuration that sets up a Docker Swarm on an existing VPC with a configurable number of manager and worker nodes. The swarm is configured with SSH daemon access enabled by default, along with EC2 instance monitoring.
The module creates 2 x (number of availability zones in the region) subnets in the VPC. Each EC2 instance is placed in a subnet in a round-robin fashion.
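As a rough illustration of round-robin placement, the sketch below maps instance indexes onto subnet indexes with a modulo. The helper name `subnet_index` and the example counts are hypothetical, and how the module divides its subnets between managers and workers is not covered here, so treat `subnet_count` as "the number of subnets available to one group of instances".

```shell
#!/bin/sh
# Hypothetical helper, not taken from the module's source:
# prints the subnet index for an instance under round-robin placement.
subnet_index() {
  # $1 = instance index (0-based), $2 = number of subnets to cycle through
  echo $(( $1 % $2 ))
}

# Example values (assumptions): 3 availability zones in the region.
az_count=3
total_subnets=$(( 2 * az_count ))
echo "subnets created: $total_subnets"

# Cycle 5 instances through 3 subnets.
for i in 0 1 2 3 4; do
  echo "instance $i -> subnet $(subnet_index "$i" 3)"
done
```

With more instances than subnets, the placement simply wraps around to the first subnet again.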
There are no elastic IPs allocated in the module in order to prevent using up the elastic IP allocation for the VPC. It is up to the caller to set that up.
The aws provider is configured in your TF file.
AWS permissions to do the following:
The examples/iam-policies folder shows the policy JSONs that are used.
For earlier versions of the module, S3 Create and Access was required to store the tokens. Tags are used in the current releases to save on S3 costs. The S3 method was deprecated and has been removed as of v6.0.0.
/16.
The examples/simple folder shows an example of how to use this module.
The default merge rules of cloud-config are used, which may yield unexpected results (see cloudconfig merge behaviours) if you are changing existing keys. To bring back the merge behaviour from 1.2, add
merge_how: "list(append)+dict(recurse_array)+str()"
Though yum update can update the installed software, it may also be necessary to update things outside of it, such as the module itself, the cloud_config_extra information, or the AMI. For this to work, you need at least 3 managers; otherwise you would lose raft consensus and have to rebuild the swarm from scratch.
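The 3-manager minimum follows from Raft requiring a strict majority of managers to be reachable. The sketch below works through that arithmetic; the function names `quorum` and `survives_one_down` are illustrative only and are not part of the module.

```shell
#!/bin/sh
# Majority needed for Raft consensus among N managers: floor(N/2) + 1.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Succeeds (exit status 0) if N managers still hold a majority
# after one of them is taken out for rebuilding.
survives_one_down() {
  [ $(( $1 - 1 )) -ge "$(quorum "$1")" ]
}

for n in 1 2 3 5; do
  if survives_one_down "$n"; then
    echo "$n managers: quorum $(quorum "$n"), one can be replaced at a time"
  else
    echo "$n managers: quorum $(quorum "$n"), consensus is lost if one leaves"
  fi
done
```

With 3 managers the quorum is 2, so exactly one manager can be out at any moment, which is why the managers are rebuilt one by one.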
Upgrading a 3-manager swarm must be done one manager at a time to prevent loss of raft consensus.
1. Have manager0 leave the swarm by executing ssh <username>@<manager0> sudo /root/bin/leave-swarm.sh
2. Taint manager0 from the command line: terraform taint module.docker-swarm.aws_instance.managers[0]
3. Rebuild manager0 from the command line: terraform apply
4. Wait until manager0 rejoins the swarm by checking docker node ls
5. Have manager1 leave the swarm by executing ssh <username>@<manager1> sudo /root/bin/leave-swarm.sh
6. Taint manager1 from the command line: terraform taint module.docker-swarm.aws_instance.managers[1]
7. Rebuild manager1 from the command line: terraform apply
8. Wait until manager1 rejoins the swarm by checking docker node ls
9. Have manager2 leave the swarm by executing ssh <username>@<manager2> sudo /root/bin/leave-swarm.sh
10. Taint manager2 from the command line: terraform taint module.docker-swarm.aws_instance.managers[2]
11. Rebuild manager2 from the command line: terraform apply
12. Wait until manager2 rejoins the swarm by checking docker node ls
ssh <username>@<manager0> sudo /root/bin/prune-nodes.sh
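The manager procedure above, followed by the prune command, could be wrapped in a small script along these lines. This is a sketch, not something shipped with the module: the function names, the `DRY_RUN` and `SSH_USER` variables, and the `ec2-user` default are all assumptions. By default it only prints the commands; it also assumes that `docker node ls` is run from a shell whose Docker CLI already points at a manager.

```shell
#!/bin/sh
# Assumptions: DRY_RUN=1 (default) prints commands instead of running them;
# SSH_USER stands in for <username>.
DRY_RUN=${DRY_RUN:-1}
SSH_USER=${SSH_USER:-ec2-user}

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# upgrade_manager INDEX HOST: leave, taint, rebuild, then check one manager.
upgrade_manager() {
  m_idx=$1
  m_host=$2
  run ssh "$SSH_USER@$m_host" sudo /root/bin/leave-swarm.sh
  run terraform taint "module.docker-swarm.aws_instance.managers[$m_idx]"
  run terraform apply
  run docker node ls
  if [ "$DRY_RUN" != "1" ]; then
    # Do not move on until the rebuilt manager shows up again.
    printf 'Press Enter once manager%s is listed again: ' "$m_idx"
    read -r _reply
  fi
}

# rolling_upgrade HOST0 HOST1 ...: one manager at a time, then prune.
rolling_upgrade() {
  first_host=$1
  i=0
  for h in "$@"; do
    upgrade_manager "$i" "$h"
    i=$(( i + 1 ))
  done
  run ssh "$SSH_USER@$first_host" sudo /root/bin/prune-nodes.sh
}

if [ "$#" -gt 0 ]; then
  rolling_upgrade "$@"
fi
```

Run it first with the default dry-run setting and the three manager host names as arguments to review the command sequence, then set `DRY_RUN=0` once the output matches what you would type by hand.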
A future release may utilize auto-scaling; for now, this needs to be done manually.
ssh <username>@<manager0> sudo /root/bin/rm-workers.sh <nodename[s]>
terraform taint module.docker-swarm.aws_instance.workers[#]
terraform apply
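The three worker commands above can be grouped into one helper so that a single worker is removed, tainted, and rebuilt in order. This is only a sketch: `replace_worker`, `DRY_RUN`, and the `ec2-user` default are hypothetical, and the caller has to supply the matching node name and worker index, since the mapping between them is not described here.

```shell
#!/bin/sh
# Assumption: DRY_RUN=1 (default) prints commands instead of running them.
DRY_RUN=${DRY_RUN:-1}

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# replace_worker MANAGER_HOST WORKER_INDEX NODE_NAME
replace_worker() {
  # Remove the worker from the swarm via a manager, then rebuild it.
  run ssh "${SSH_USER:-ec2-user}@$1" sudo /root/bin/rm-workers.sh "$3"
  run terraform taint "module.docker-swarm.aws_instance.workers[$2]"
  run terraform apply
}
```

For example, `replace_worker manager0.example.com 1 worker1` prints the three commands for the worker at index 1, using placeholder host and node names.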
Don't use Terraform to provision your containers, just let it build the infrastructure and add the hooks to connect it to your build system.
To use a different version of Docker, create a custom cloud config with
packages:
  - [docker, 18.03.1ce-2.amzn2]
  - haveged
  - python2-boto3
  - yum-cron
  - ec2-instance-connect
  - perl-Switch
  - perl-DateTime
  - perl-Sys-Syslog
  - perl-LWP-Protocol-https
  - perl-Digest-SHA.x86_64
Add additional SSH users using sudo /root/bin/add-docker-user.sh <username> <ssh key string>. Note that users created this way can only use docker context; they do not get general shell access.
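On the client side, such a user would typically reach the swarm through an SSH-based Docker context. The sketch below only prints the two `docker context` commands involved; the helper name `context_commands` and the context name, user name, and host used in the example are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: prints the client-side commands for setting up
# an SSH-based Docker context. Nothing is executed against Docker.
# $1 = context name, $2 = user created with add-docker-user.sh, $3 = manager host
context_commands() {
  echo "docker context create $1 --docker host=ssh://$2@$3"
  echo "docker context use $1"
}

# Placeholder values for illustration.
context_commands swarm alice manager0.example.com
```

After running the printed commands on a workstation that holds the matching SSH private key, regular `docker` commands are sent to the manager over SSH.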
In order to improve performance when using strong cryptography, haveged should be installed.
yum-cron and haveged can be removed from the packages in the custom cloud config if desired.
The servers are built with ElasticSearch and Redis containers in mind and the following documents specify the changes that are implemented as part of Terraform.