GitHub Repo

All the code required can be found in the GitHub repo: github.com/edrandall-dev/kubernetes-on-ec2

Introduction

Last year, when I was learning Kubernetes, I wanted to create my own cluster on AWS using EC2 instances. The idea behind this was to go through the installation of Kubernetes from start to finish, learning everything I needed to along the way.

I’ve also used Amazon’s Elastic Kubernetes Service (EKS) to deploy Neo4j using the official Neo4j Helm Charts. Info on that can be found here.

When I originally created the terraform and ansible code for this endeavour, I’ll admit that I got a bit carried away. I created a Makefile designed to handle every possible part of the process and complex shell scripts which existed alongside the ansible playbooks. If I’m honest, this extra effort did very little to support my learning and made the resulting code a lot more difficult to understand later.

As a result, I’ve decided to bite the bullet and revisit this project, listing out (and explaining) the manual steps instead of trying to abstract away the true complexity. Of course, the argument for automation has long since been fought and won, but how many times a day do we really need to spin up and tear down an entire kubernetes cluster with a single command? In the end, I was just troubleshooting automation and scripting instead of moving forward and actually doing stuff with Kubernetes. I suppose it was a bit like tidying your desk rather than getting started on that important project.

If you’re still with me, then the purpose of this post should now be pretty obvious: to break down and document the steps taken to manually create a kubernetes cluster on AWS, with a focus on learning along the way.

Later, I went on to repeat this exercise on a local workstation using vagrant VMs, which I’ve documented in this post.

Pre-requisite Steps

As you’d expect, there are a number of important pre-requisites that we need to satisfy in order to deploy resources into the cloud using terraform and configure them remotely.

  1. Clone the git repository

This git repository contains a simplified (yes, really) version of the code which you should clone to your local machine (or development environment).

git clone https://github.com/edrandall-dev/kubernetes-on-ec2
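
Once cloned, change into the directory containing the terraform and ansible code. I’m assuming here that the terraform-ansible directory mentioned later in this post sits at the top level of the repo; adjust the path if your checkout is laid out differently:

cd kubernetes-on-ec2/terraform-ansible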

 

  2. Install the AWS CLI and configure access

If needed, follow the instructions to download and install the AWS command line interface (CLI). Once the CLI tool is installed, the following files will need to be created in your local environment:

  • ~/.aws/config
[default]
region = us-east-1
  • ~/.aws/credentials
[default]
aws_access_key_id = AJKYYIUJJLKX72VON324KL
aws_secret_access_key = 5COOK762PASS3BABTRIDGEXSZ26KVkJKJ4FP

To create a new aws_access_key_id and aws_secret_access_key, log into your AWS account, go to the security credentials page and create a new access key for CLI use. The key pair shown here obviously isn’t real!

NOTE: Remember that the aws_access_key_id and aws_secret_access_key give access to your AWS account. These should not be shared, or uploaded to GitHub.

Once you have obtained your aws_access_key_id and aws_secret_access_key, the aws configure command will create the two files shown above. Alternatively, you can create the files manually, in the format shown.
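
For reference, a typical aws configure session looks something like this (using the same placeholder values as above, which obviously aren’t real):

% aws configure
AWS Access Key ID [None]: AJKYYIUJJLKX72VON324KL
AWS Secret Access Key [None]: 5COOK762PASS3BABTRIDGEXSZ26KVkJKJ4FP
Default region name [None]: us-east-1
Default output format [None]: json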

If configured correctly, a simple command like aws s3 ls should confirm that everything is working by providing a list of the S3 buckets in your AWS account (assuming, of course, that you have the IAM permissions to do that).
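
If your account doesn’t contain any S3 buckets (or you’d rather not depend on S3 permissions), aws sts get-caller-identity is another handy check; it simply returns the account ID and ARN associated with the credentials you’ve configured:

aws sts get-caller-identity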
 

  3. Create SSH key to connect to the environment

After the EC2 instances are running, we’ll need SSH keys to log into them to perform the various installation tasks.

Use the ssh-keygen command to create a new SSH keypair. In the first step, change the variable SSH_KEY_NAME to whatever you want to name your key. I use the date command to generate a timestamp, just to make it easier to see when the key was created:

SSH_KEY_NAME="key_$(date "+%Y-%m-%d_%H%M")"
ssh-keygen -N "" -q -t rsa -b 4096 -C "$SSH_KEY_NAME" -f $SSH_KEY_NAME

NOTE: We’ll also be relying on the contents of the variable SSH_KEY_NAME in later steps. Be sure to set it again manually if you end up using a different terminal session.
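
For example, if you come back to this in a fresh terminal, simply set the variable to the name of the key file you generated earlier (the name below is only an example):

SSH_KEY_NAME="key_2023-08-15_0934"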

With the key created, you’ll need to upload the public key to AWS.

aws ec2 import-key-pair --key-name "$SSH_KEY_NAME" --public-key-material fileb://$SSH_KEY_NAME.pub

Check the key has been successfully uploaded to AWS, like this:

% aws ec2 describe-key-pairs
{
    "KeyPairs": [
        {
            "KeyFingerprint": "3d:09:3d:35:b0:8f:d1:e1:qd:3d:57:8h:aq:a9:1c:a5",
            "KeyName": "key_2023-08-15_0934",
            "KeyPairId": "key-09e05c960bb3d8e97"
        }
    ]
}
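
If your AWS account already contains several key pairs, you can narrow the output down to the one you’ve just uploaded (assuming the SSH_KEY_NAME variable is still set):

aws ec2 describe-key-pairs --key-names "$SSH_KEY_NAME"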

 

  4. Install Terraform

On a Mac, we can install terraform easily using homebrew:

brew install terraform

Once installed, double-check that the command is available in your shell’s search path:

% which terraform
/opt/homebrew/bin/terraform

You can check the version using:

terraform version

If either of these commands gives an error, refer back to terraform’s installation instructions to troubleshoot.

 

  5. Install Ansible

Full installation instructions for Ansible can be found within the ansible documentation. However, if you’re using a Mac, you’ll probably find homebrew to be easiest:

brew install ansible

 

  6. Install jq

We’ll also need jq later to parse some JSON output from terraform. Again, it is easy to obtain with homebrew (on a Mac):

brew install jq
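
As with terraform, a quick version check confirms that both ansible and jq have ended up on your path:

ansible --version
jq --version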

 

Deploy the cloud resources in AWS

First, we need to initialise terraform:

terraform init

We are now ready to use terraform to deploy the cloud resources that we need for our environment. However, before starting the deployment, review the terraform.tfvars file and customise the variable values to suit you:

region          = "us-east-1"
base_cidr_block = "192.168.0.0/16"
creator         = "Ed Randall"

qty_k8s_cp_instances     = 1
qty_k8s_worker_instances = 4

instance_types = {
  "k8s_cp_instance"     = "t3.small",
  "k8s_worker_instance" = "t3.small"
}

env_prefix = "my-k8s-env"

 

NOTE: You should still be in the terraform-ansible directory and the SSH_KEY_NAME variable should still contain the name of your ssh key.
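
If you’d like to preview what terraform intends to create before committing to it, a plan run with the same variable override shows the proposed changes without applying anything:

terraform plan -var="public_key_path=$SSH_KEY_NAME.pub"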

If you are happy with the values inside the terraform.tfvars file, issue the following command to start the terraform apply process:

terraform apply -auto-approve -var="public_key_path=$SSH_KEY_NAME.pub"

If needed, the environment can be torn down with:

terraform destroy -auto-approve -var="public_key_path=$SSH_KEY_NAME.pub"

Terraform will provide a lot of information about the resources being created, along with outputs which look like the following:

Apply complete! Resources: 21 added, 0 changed, 0 destroyed.

Outputs:

control_plane_public_ips = [
  "13.53.205.52",
]
worker_public_ips = [
  "16.171.161.205",
  "13.53.41.85",
  "16.171.129.198",
  "16.170.243.115",
]

This output shows us the public IPv4 addresses of the control plane node and each of the worker nodes created by terraform.
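
If you’d like to confirm the instances are reachable before moving on, you can pull the control plane’s public address out of the terraform outputs with jq and SSH straight to it (this assumes you’re still in the directory containing your SSH key, and that the security group created by terraform permits SSH from your address):

CP_IP=$(terraform output -json control_plane_public_ips | jq -r '.[0]')
ssh -i "$SSH_KEY_NAME" ec2-user@"$CP_IP"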

Deployment diagram

The following diagram shows which cloud resources are deployed in EC2.

Kubernetes on EC2 Layout

 

Install Kubernetes on EC2 using Ansible

Now that the cloud environment has been created by terraform, we need to do 2 things before we can execute the ansible playbooks:

  • Create an ansible.cfg file
  • Create an inventory file which will contain details of the EC2 instances which have been created by terraform

 

Create ansible config file

An ansible config file can be created by copying and pasting the following lines into a new file called ansible.cfg.

[defaults]
timeout = 60
inventory=ansible_inventory.ini
private_key_file=
host_key_checking=false
deprecation_warnings=False
remote_user=ec2-user
interpreter_python=auto_silent  

[privilege_escalation]
become=True
become_method=sudo
become_user=root

You will need to paste the name of your private key file after the private_key_file= parameter. If the variable is still set, you should be able to view this with echo $SSH_KEY_NAME.

The following command will append your key name to the correct line in the config file:

sed -i '' "/private_key_file=/s/$/$SSH_KEY_NAME/" ansible.cfg
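
To confirm the substitution worked, grep for the parameter; the line should now end with your key name (the name shown below is just an example):

% grep private_key_file ansible.cfg
private_key_file=key_2023-08-15_0934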

 

Generate ansible inventory

In order for ansible to perform actions, it needs to be given an “inventory”. This is a list of the servers (listed by IP address or hostname) that are going to be managed by ansible. In order to generate this easily, I wrote a simple script called create-inventory.sh, which can be found in the scripts directory and looks like this:

#!/bin/bash

#
# Script:   create-inventory.sh
# Purpose:  This script can be used to create an inventory for ansible using
#           terraform's outputs. It can only be run after terraform has finished
#           creating the new environment in AWS.  This script also appends some
#           variables to the bottom of the file, which will be needed by ansible.
#

# Get public IPs of EC2 instances
worker_ips=($(terraform output -json worker_public_ips | jq -r '.[]'))
ctrlplane_ip=($(terraform output -json control_plane_public_ips | jq -r '.[]'))

# Get private IP of ctrl-plane instance
ctrlplane_private_ip=($(terraform output -json ctrlplane_private_ip | jq -r '.[]'))

#Get instance ids of EC2 instances
worker_ids=($(terraform output -json worker_instance_ids | jq -r '.[]'))
ctrlplane_ids=($(terraform output -json ctrl_plane_instance_ids | jq -r '.[]'))

# Output control plane instance info
echo "[ctrlplane_instances]"
echo "ctrlplane-instance ansible_host=${ctrlplane_ip} ansible_user=ec2-user"
echo

# Loop through instances and populate the inventory file
echo "[worker_instances]"
for i in "${!worker_ids[@]}"; do
  echo "worker-instance-${i} ansible_host=${worker_ips[i]} ansible_user=ec2-user"
done

echo
echo "[all:vars]"
echo ctrl_plane_private_ip=${ctrlplane_private_ip}
echo pod_network_cidr="10.244.0.0/16"

Generate the ansible inventory with this command:

./create-inventory.sh > ansible_inventory.ini

You will then have an ansible inventory file called ansible_inventory.ini which looks something like this:

[ctrlplane_instances]
ctrlplane-instance ansible_host=16.170.249.23 ansible_user=ec2-user

[worker_instances]
worker-instance-0 ansible_host=16.171.250.17 ansible_user=ec2-user
worker-instance-1 ansible_host=16.171.196.198 ansible_user=ec2-user
worker-instance-2 ansible_host=16.170.245.34 ansible_user=ec2-user
worker-instance-3 ansible_host=16.171.148.130 ansible_user=ec2-user

[all:vars]
ctrl_plane_private_ip=192.168.2.237
pod_network_cidr=10.244.0.0/16

 

NOTE: There are several other ways to do this, but I thought that this simple script was an easy way to generate the inventory and demonstrate what’s required.
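
Before running the playbooks, it’s worth confirming that ansible can reach every host in the inventory using the settings from ansible.cfg. The ansible ping module does exactly that, returning “pong” from each instance:

ansible all -m ping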

 

Run ansible playbooks

With ansible installed and configured, the playbooks can now be executed. I have divided the configuration into 3 separate ansible playbooks which are designed to be executed in order:

  • 1-k8s-pre-flight.yaml

    Some “pre-flight” checks and tasks which are common to both control plane, and worker nodes

    • Turn off SELinux
    • Upgrade operating system packages
    • Configure package repositories for kubernetes & containerd and install them
    • Configure necessary kernel modules
    • Update the /etc/hosts file with names and IP addresses of all hosts
  • 2-k8s-cp-instance-prep.yaml

    Configuration of the control plane node

    • Use kubeadm to create the kubernetes cluster
    • Start the kubelet service
    • Create a kubelet config file
    • Install the CNI (Calico)
  • 3-k8s-worker-instance-prep.yaml

    Configuration of the worker nodes

    • Generate the join command & execute it on the worker nodes
    • Fetch the kubelet config file

The playbooks should be executed in order, with the ansible-playbook command:

ansible-playbook 1-k8s-pre-flight.yaml
ansible-playbook 2-k8s-cp-instance-prep.yaml
ansible-playbook 3-k8s-worker-instance-prep.yaml

Once completed, the kubernetes cluster should be up and running on EC2. Test with:

% kubectl get nodes
NAME           STATUS   ROLES           AGE   VERSION
ctrl-plane-1   Ready    control-plane   24m   v1.28.0
worker-1       Ready    <none>          22m   v1.28.0
worker-2       Ready    <none>          22m   v1.28.0
worker-3       Ready    <none>          22m   v1.28.0
worker-4       Ready    <none>          22m   v1.28.0
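
As an optional smoke test (assuming kubectl is pointed at the new cluster, either via the fetched config file or by running it on the control plane node), you can schedule a couple of pods onto the workers and clean up afterwards; the deployment name nginx-test is just an example:

kubectl create deployment nginx-test --image=nginx --replicas=2
kubectl get pods -o wide
kubectl delete deployment nginx-test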

Conclusion

If the above steps are followed correctly, you should now have access to a kubernetes cluster which is running inside Amazon Web Services, using EC2 instances. The purpose of this exercise was to facilitate learning, and not to create a “production ready” environment. As such, there are several design choices within this setup which may not follow “best practice” architectural principles.

Outstanding items

There are a few other tweaks that I’d like to apply to this environment when I have time. They include:

  • Creating a “multi-zone” environment which has worker nodes distributed across different availability zones in AWS
  • Creating separate security groups for the control plane and worker nodes