Setting Up a Self-Managed Kubernetes Cluster: A Step-by-Step Guide
This guide covers a non-HA cluster (a single control plane node).
As Kubernetes continues to dominate the landscape of container orchestration, many developers and system administrators are opting for self-managed clusters to gain complete control over their infrastructure. In this guide, I’ll walk you through the steps required to set up a self-managed Kubernetes cluster using kubeadm.
By the end of this tutorial, you’ll have a functional Kubernetes cluster ready to handle workloads. Let’s dive in!
📝 Prerequisites
Before beginning, ensure that you have:
An Odd Number of Servers: For failure tolerance, Kubernetes (specifically its etcd datastore) works best with an odd number of members; a cluster of N members tolerates (N-1)/2 failures. For a minimal setup, you can start with 1 master node (or control plane node) and 2 worker nodes.
Ubuntu 20.04 or a similar Linux distribution on your nodes.
At least 2 CPUs and 2 GB of RAM on each node (one control plane node and at least one worker node).
At least 20 GB of storage on the control plane node and at least 10 GB on each worker node.
A reliable network connection between the nodes.
Root access or a user with sudo privileges during setup.
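As a quick sanity check before you start, you can confirm each node meets these minimums with standard Linux utilities (the thresholds in the comments mirror the list above):
```
# Run on every node
nproc       # CPU count: expect 2 or more
free -h     # memory: expect at least 2 GB total
df -h /     # root filesystem: ~20 GB on the control plane, ~10 GB on workers
```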
Step 1: Prepare the Environment
Note: Repeat Step 1 on both the control plane (master) node and each worker node. This will ensure all nodes are prepared with the necessary dependencies.
Update System Packages
Before installing Kubernetes components, let’s make sure your system packages are up to date. Run:
```
sudo apt update && sudo apt upgrade -y
```
Add Kubernetes Package Repository
Download the Kubernetes GPG key:
```
sudo curl -fsSLo /usr/share/keyrings/kubernetes-archive-keyring.gpg https://packages.cloud.google.com/apt/doc/apt-key.gpg
```
Add the Kubernetes apt repository:
```
echo "deb [signed-by=/usr/share/keyrings/kubernetes-archive-keyring.gpg] https://apt.kubernetes.io/ kubernetes-xenial main" | sudo tee /etc/apt/sources.list.d/kubernetes.list
```
Update package lists:
```
sudo apt update
```
For more details, refer to the Kubernetes Installation Guide.
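Note: apt.kubernetes.io / packages.cloud.google.com are the legacy Google-hosted repositories, which have since been deprecated and frozen in favor of the community-owned pkgs.k8s.io. If the commands above fail, a sketch of the newer repository setup looks like this (the v1.30 path is only an example; substitute the minor version you want to install):
```
# Sketch: community-owned Kubernetes apt repository (pkgs.k8s.io)
sudo mkdir -p /etc/apt/keyrings
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.30/deb/Release.key | \
  sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.30/deb/ /' | \
  sudo tee /etc/apt/sources.list.d/kubernetes.list
sudo apt update
```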
Step 2: Install Kubernetes Tools
Note: The first part of this step, where we install kubeadm, kubelet, and kubectl, should also be repeated on each worker node. These tools are required on both the control plane and worker nodes.
Install kubeadm, kubelet, and kubectl
These tools are essential for managing the Kubernetes cluster:
```
sudo apt install -y kubeadm kubelet kubectl
```
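Optionally, you can pin these packages so a routine apt upgrade doesn’t move them to an incompatible version (this mirrors the hold step in the official kubeadm install docs):
```
# Prevent unattended upgrades of the Kubernetes components
sudo apt-mark hold kubelet kubeadm kubectl
```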
Install Container Runtime (containerd)
Kubernetes requires a container runtime like containerd to run containers. Install it with:
```
sudo apt update && sudo apt install -y containerd
```
Configure CGroups for containerd and kubelet
Control groups, or cgroups, are a Linux kernel feature that manage and allocate system resources such as CPU, memory, disk I/O, and network bandwidth for groups of processes. In the context of Kubernetes, cgroups play a crucial role in ensuring that containers are efficiently managed and isolated. Here’s what cgroups do:
Resource Allocation and Limiting: Cgroups allow Kubernetes to set limits and requests on the amount of CPU and memory a container can use. For example, you can restrict a container to a maximum of 512 MB of memory or a limited percentage of CPU resources. This prevents any single container from consuming all resources on the node.
Resource Isolation: By isolating resources, cgroups ensure that containers and their processes run independently without interfering with each other. If one container crashes or spikes in resource usage, cgroups help prevent it from impacting other containers.
Resource Monitoring: Cgroups provide visibility into resource usage, allowing Kubernetes to monitor the CPU and memory consumption of containers and make decisions based on this data (like scaling or scheduling).
Resource Prioritization: Cgroups can be used to prioritize resources for certain containers over others, ensuring critical services receive the resources they need, even under high load.
Overall, cgroups help Kubernetes enforce resource limits, maintain stability, and optimize performance, which is essential for managing containerized applications in a multi-tenant environment.
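To make this concrete, here is a minimal, hypothetical pod manifest showing the requests and limits that the kubelet ultimately enforces through cgroups on the node (the pod name and image are just examples):
```
# Illustrative only: the requests/limits below become cgroup settings on the node.
kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: cgroup-demo            # hypothetical name
spec:
  containers:
  - name: app
    image: nginx:1.25          # example image
    resources:
      requests:
        cpu: "250m"            # scheduling guarantee, mapped to CPU shares/weight
        memory: "256Mi"
      limits:
        cpu: "500m"            # hard CPU cap (CFS quota)
        memory: "512Mi"        # hard memory cap; exceeding it gets the container OOM-killed
EOF
```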
Kubernetes components need to use consistent cgroup drivers. There are two main cgroup drivers:
systemd cgroup driver
Description: This driver uses systemd as the cgroup manager, which is the default init system on most modern Linux distributions (like Ubuntu and CentOS).
Usage: Preferred for Kubernetes clusters as it aligns with the native Linux systemd init system, leading to better compatibility and stability in resource management.
cgroupfs cgroup driver
Description: This driver uses cgroupfs as the cgroup manager, which directly manages cgroups within the filesystem.
Usage: Often used with container runtimes like Docker, but it may lead to issues in Kubernetes because it can conflict with systemd. It’s generally recommended to switch to systemd for Kubernetes clusters, especially in production environments.
Note: Consistency between the cgroup driver used by kubelet and the container runtime (such as containerd or Docker) is essential for stable cluster operations. Using the systemd driver is typically recommended for production environments.
Let’s set them to use systemd. Since systemd is the default on most modern distributions, all we need to do on the kubelet side is verify that the current init system is systemd; if it is, no kubelet changes are required, and we only have to configure containerd to match.
Check the current init system by running:
```
ps -p 1
```
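If the node is running systemd, the CMD column should show systemd, roughly like this (the time value will differ):
```
  PID TTY          TIME CMD
    1 ?        00:00:02 systemd
```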
Create the configuration path for containerd:
```
sudo mkdir -p /etc/containerd
```
Generate the default containerd config:
```
sudo containerd config default | sudo tee /etc/containerd/config.toml
```
Edit the config to set SystemdCgroup to true:
```
sudo sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
```
Verify the setting:
```
cat /etc/containerd/config.toml | grep -i SystemdCgroup -B 50
```
Restart containerd:
```
sudo systemctl restart containerd
```
Set the kubelet Cgroup Driver to cgroupfs (Optional)
This section can be skipped if you are using the default systemd driver. The whole point is to make sure we keep the container runtime and the kubelet using the same driver.
1. Set the kubelet Cgroup Driver to cgroupfs
Edit the kubelet configuration file, typically located at /var/lib/kubelet/config.yaml:
```
sudo nano /var/lib/kubelet/config.yaml
```
Find the cgroupDriver setting and set it to cgroupfs:
```
cgroupDriver: cgroupfs
```
Save and close the file.
Restart kubelet to apply the changes:
```
sudo systemctl restart kubelet
```
If the kubelet configuration file doesn’t already exist, you can also specify the cgroup driver directly by adding the --cgroup-driver=cgroupfs flag in the kubelet service configuration file (typically at /etc/systemd/system/kubelet.service.d/10-kubeadm.conf), then restart the kubelet service.
2. Set the Container Runtime Cgroup Driver to cgroupfs
For containerd
Edit the containerd configuration file located at /etc/containerd/config.toml:
```
sudo nano /etc/containerd/config.toml
```
Locate the SystemdCgroup option under the [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options] section, and set it to false:
```
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
  SystemdCgroup = false
```
Save and close the file.
Restart containerd to apply the changes:
```
sudo systemctl restart containerd
```
For Docker
Create or edit the Docker daemon configuration file located at /etc/docker/daemon.json:
```
sudo nano /etc/docker/daemon.json
```
Set the cgroup driver to cgroupfs in the JSON configuration:
```
{
  "exec-opts": ["native.cgroupdriver=cgroupfs"]
}
```
Save and close the file.
Restart Docker to apply the changes:
```
sudo systemctl restart docker
```
For more details, refer to the Container Runtime Configuration in the Kubernetes documentation.
Step 3: Initialize the Control Plane Node
Disable Swap
Kubernetes requires swap to be turned off because it manages memory resources for containers and relies on accurate memory allocation and usage data. If swap is enabled, it can interfere with Kubernetes’ ability to monitor and control resource usage accurately. This can lead to unexpected behavior, where containers exceed memory limits or become unstable, which affects overall cluster stability.
```
sudo swapoff -a
sudo sed -i '/ swap / s/^/#/' /etc/fstab
```
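To confirm swap is fully disabled, swapon should print nothing and free should report zero swap:
```
swapon --show   # no output means no active swap
free -h         # the Swap line should read 0B
```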
Enable IP Forwarding
IP forwarding is a feature in the Linux kernel that allows the server to forward network packets from one network interface to another. By default, many Linux distributions disable IP forwarding, which means the system will not route packets between network interfaces.
To ensure network traffic flows correctly between different network interfaces, IP forwarding is essential in Kubernetes. This will enable pod-to-pod communication between nodes.
Why IP Forwarding is Needed in Kubernetes
Pod-to-Pod Communication Across Nodes:
Kubernetes creates a virtual network where each pod can communicate with other pods, even if they’re on different nodes. For this to work, IP packets need to be routed between different interfaces (e.g., between the pod network interface and the node’s main network interface).
Enabling IP forwarding allows each node to forward packets between these interfaces, facilitating cross-node communication in the cluster.
Service and Network Routing:
Kubernetes uses various service types (like ClusterIP, NodePort, and LoadBalancer) to expose applications to internal and external traffic.
With IP forwarding enabled, packets can be routed to their destination service or pod IP, even if that IP is on a different node within the cluster. This is crucial for Kubernetes networking components, such as kube-proxy, to correctly handle traffic routing.
Network Plugins and CNI (Container Network Interface):
Many CNI plugins (like Flannel, Calico, and Weave) rely on IP forwarding to handle the complex routing required for networking in a multi-node cluster.
Enabling IP forwarding ensures that the network plugins can set up and manage pod networking, allowing pods to communicate seamlessly across the cluster.
Enable IP forwarding to allow packets to be routed between different network interfaces:
```
# permanently change it
echo "net.ipv4.ip_forward = 1" | sudo tee -a /etc/sysctl.conf

# to temporarily change it
sudo sysctl -w net.ipv4.ip_forward=1
```
Or:
```
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-ip6tables = 1
EOF
sudo sysctl --system
```
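You can verify the settings took effect:
```
sysctl net.ipv4.ip_forward                 # should print: net.ipv4.ip_forward = 1
sysctl net.bridge.bridge-nf-call-iptables  # should print: ... = 1 (requires the br_netfilter module to be loaded)
```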
Initialize the Kubernetes Cluster
Let’s initialize the cluster on the control plane node. Replace <control-plane-ip> with the actual IP of your control plane node. The 10.244.0.0/16 pod CIDR matches Flannel’s default network, which we install in Step 4:
```
sudo kubeadm init --apiserver-advertise-address=<control-plane-ip> --pod-network-cidr="10.244.0.0/16" --upload-certs
```
Refer to the kubeadm Cluster Setup Guide for more details.
Configure kubectl for Cluster Access
Now, let’s set up kubectl so you can manage the cluster:
```
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
```
Step 4: Install Pod Network (CNI)
Install Flannel (Example CNI)
Kubernetes requires a container network interface (CNI) for pod-to-pod communication. Here, we’ll use Flannel. If you encounter issues with pods not running due to br_netfilter, load the module with:
```
sudo modprobe overlay        # or: sudo modprobe bridge, depending on the module being used
sudo modprobe br_netfilter
```
Download and Configure Flannel
Follow these steps to download and configure it:
Download the Flannel YAML configuration file and save it as kube-flannel.yml:
```
curl -LO https://raw.githubusercontent.com/flannel-io/flannel/v0.20.2/Documentation/kube-flannel.yml
```
Edit the Flannel Configuration:
Open the kube-flannel.yml file using a text editor:
```
nano kube-flannel.yml
```
Locate the args section within the kube-flannel container definition. It should look like this:
```
args:
- --ip-masq
- --kube-subnet-mgr
```
Add an additional argument to specify the network interface:
```
args:
- --ip-masq
- --kube-subnet-mgr
- --iface=eth0
```
Adding --iface=eth0 ensures Flannel uses the correct network interface for pod communication (replace eth0 with your node’s actual interface name if it differs).
Save and exit the file.
Deploy Flannel:
```
kubectl apply -f kube-flannel.yml
```
For more about Flannel configuration, refer to the Flannel Documentation.
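Before checking node status, it’s worth confirming the Flannel pods come up (depending on the manifest version they land in either the kube-flannel or kube-system namespace):
```
kubectl get pods -A | grep -i flannel   # wait until every flannel pod is Running
```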
Step 5: Verify Control Plane Status
Run the following command to check if the control plane node is ready:
```
kubectl get nodes
```
You should see the control plane node in a “Ready” state if everything is set up correctly.
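The output should look roughly like this (the name, age, and version will differ on your cluster):
```
NAME            STATUS   ROLES           AGE   VERSION
control-plane   Ready    control-plane   10m   v1.28.2
```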
Step 6: Join Worker Nodes to the Cluster
Set Unique Hostnames for Worker Nodes
Each worker node should have a unique hostname. Set it with:
```
sudo hostnamectl set-hostname <worker-node-name>
```
Join Worker Nodes
After initializing the control plane, kubeadm generates a command to join worker nodes. It will look similar to this:
```
kubeadm join <control-plane-ip>:6443 --token <token> --discovery-token-ca-cert-hash <hash>
```
Run the command on each worker node.
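If you no longer have the join command handy (bootstrap tokens expire after 24 hours by default), you can print a fresh one on the control plane node:
```
sudo kubeadm token create --print-join-command
```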
Refer to the Adding Nodes with kubeadm for more information.
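Back on the control plane node, confirm each worker has registered; it should move to Ready once its Flannel pod is running:
```
kubectl get nodes -o wide
```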
Troubleshoot Common Issues
If you encounter errors related to unsupported fields in the kubeadm config file, such as caCertificateValidityPeriod or certificateValidityPeriod, open the config file:
```
sudo nano <your-kubeadm-config-file>.yaml
```
Remove the unsupported fields if using v1beta3.
🧪 Step 7: Run Kubernetes End-to-End (E2E) Tests
Kubernetes provides a set of E2E tests to ensure that all cluster components are working together as expected. These tests are especially useful for validating your setup, troubleshooting issues, and ensuring the cluster meets expected performance and reliability standards.
Prerequisites for E2E Tests
Install kubectl and have access to your Kubernetes cluster.
Make sure the kubeconfig file is correctly configured, so kubectl can interact with your cluster:
```
export KUBECONFIG=$HOME/.kube/config
```
Install Go (if it’s not already installed): Kubernetes E2E tests require Go, as they are implemented in Go code.
Clone the Kubernetes Repository: The E2E test scripts are included in the Kubernetes GitHub repository.
```
git clone https://github.com/kubernetes/kubernetes.git
cd kubernetes
```
Running Basic E2E Tests
Kubernetes includes a variety of E2E tests, from simple checks to complex scenarios. Follow these steps to run a basic test suite.
1. Build the E2E Test Binary:
```
make WHAT=test/e2e/e2e.test
```
2. Run the E2E Test Suite: The following command will run the E2E test suite against your cluster. Replace <your-cluster-ip> with your actual API server IP.
```
./_output/local/bin/linux/amd64/e2e.test --host=https://<your-cluster-ip>:6443
```
3. Specify tests to run by using the --ginkgo.focus flag if you want to run specific tests. For example, to run only the tests focused on pod scheduling, use:
```
./_output/local/bin/linux/amd64/e2e.test --ginkgo.focus="\[sig-scheduling\] Pod should"
```
4. Viewing Test Results: The E2E tests will output results to the console, indicating success or failure for each test. You can also export the output to a log file:
```
./_output/local/bin/linux/amd64/e2e.test --host=https://<your-cluster-ip>:6443 > e2e_test_results.log
```
Running Tests with Sonobuoy (Alternative Method)
Sonobuoy is a diagnostic tool designed to run Kubernetes conformance tests and can simplify the process of running E2E tests across different Kubernetes clusters.
Download Sonobuoy:
```
curl -L https://github.com/vmware-tanzu/sonobuoy/releases/latest/download/sonobuoy_$(uname -s)_$(uname -m).tar.gz | tar xvz
sudo mv sonobuoy /usr/local/bin
```
Run Sonobuoy Tests: Run Sonobuoy to start the conformance test suite:
```
sonobuoy run
```
Check the test status:
```
sonobuoy status
```
Retrieve and View Results: When tests are complete, download the results:
```
sonobuoy retrieve .
```
Extract the tarball:
```
tar -xf *.tar.gz
```
Results will be stored in a set of JSON files, providing detailed insights into which tests passed or failed.
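Sonobuoy can also summarize a retrieved tarball directly, which is usually quicker than reading the raw files (the filename below is whatever sonobuoy retrieve produced):
```
sonobuoy results <retrieved-tarball>.tar.gz   # prints a pass/fail summary per plugin
```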
Additional Testing Tips
Run Tests on a Dedicated Test Cluster: Running E2E tests can place a significant load on the cluster, so it’s advisable to run them in a testing or staging environment.
Customize Test Runs: For large clusters, it might be useful to limit test scope by using flags like --ginkgo.focus or --ginkgo.skip to focus on specific test types.
Regular Testing: For production-grade clusters, consider integrating E2E tests into a CI/CD pipeline to monitor cluster health continuously.
More on E2E testing is available in the [Kubernetes E2E Testing Guide](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-testing/e2e-tests).
🎉 Wrapping Up
Congratulations! You’ve successfully set up a self-managed Kubernetes cluster. From here, you can deploy applications, set up monitoring, and explore Kubernetes’ powerful orchestration capabilities.
For more complex configurations, like high availability clusters, consider diving deeper into the Kubernetes documentation.
Happy Clustering!



