Saturday, March 27, 2021

Multi-node K8s Cluster from Scratch (CentOS)

In this post, we will set up a multi-node Kubernetes cluster from scratch.
This means we will minimize the use of RPMs to install our software, so that
we understand the minimum pieces required to run the cluster and know what's
happening under the hood.

Here is a summary of the versions we will use in this post.

Host OS: Ubuntu 17.04 (Zesty)
 |_ Virtualization: VirtualBox 5.2.4 r119785 (Qt5.7.1)
      |_ Virtual Machine OS: CentOS Linux release 7.3.1611 (Core)
           |_ Kubernetes: 1.9.2
           |_ ETCD: 3.2.6
           |_ Docker: 1.12.6

Requirements
============

1. Disable swap on the master and nodes. Kubernetes doesn't want it for
performance reasons; in fact, kubelets from 1.8 onward refuse to start by
default while swap is enabled.
[root@master ~]# swapoff -a
[root@master ~]# cp /etc/fstab /etc/fstab.orig
[root@master ~]# sed -i 's/.*swap.*//g' /etc/fstab
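
To double-check that no swap device is still active, the commands below are a
quick sanity check (my addition, not from the original session):
[root@master ~]# swapon --summary        # should print nothing
[root@master ~]# free -h | grep -i swap  # should report 0B total/used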

2. Make sure master and nodes can ping and resolve each other's hostnames.
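
If you are not running DNS for these machines, static /etc/hosts entries on
every host are enough. A sketch (my addition; only the master IP 192.168.1.111
is confirmed by the API server output later in this post, the node IPs are
made-up examples):
[root@node1 ~]# cat << EOF >> /etc/hosts
192.168.1.111 master
192.168.1.112 node1
192.168.1.113 node2
EOF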

3. Disable the firewall on the master and nodes. Without this, pod IPs will
not be reachable from other nodes even after setting up flanneld.
[root@master ~]# systemctl disable --now firewalld
Removed symlink /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service.
Removed symlink /etc/systemd/system/basic.target.wants/firewalld.service.
[root@master ~]# systemctl mask firewalld
Created symlink from /etc/systemd/system/firewalld.service to /dev/null.
[root@master ~]#

Set up the master
=================

1. Download the kubernetes and etcd tarballs.
[root@master ~]# wget https://dl.k8s.io/v1.9.2/kubernetes-server-linux-amd64.tar.gz
[...]
[root@master ~]# wget https://github.com/coreos/etcd/releases/download/v3.2.6/etcd-v3.2.6-linux-amd64.tar.gz
[...]
[root@master ~]#

2. Set up ETCD - this will store all information about our cluster.
[root@master ~]# tar xvf etcd-v3.2.6-linux-amd64.tar.gz
[...]
[root@master ~]# cp etcd-v3.2.6-linux-amd64/etcd* /usr/local/bin/
[root@master ~]# # create the systemd file below
[root@master ~]# cat /etc/systemd/system/etcd.service
[Unit]
Description=ETCD server
After=network.target

[Service]
Type=notify
ExecStart=/usr/local/bin/etcd \
  --data-dir /var/lib/etcd \
  --listen-client-urls http://0.0.0.0:2379 \
  --advertise-client-urls http://master:2379

[Install]
WantedBy=multi-user.target
[root@master ~]# systemctl daemon-reload
[root@master ~]# systemctl enable --now etcd
Created symlink from /etc/systemd/system/multi-user.target.wants/etcd.service to /etc/systemd/system/etcd.service.
[root@master ~]# etcdctl cluster-health
member 8e9e05c52164694d is healthy: got healthy result from http://master:2379
cluster is healthy
[root@master ~]#
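
As an extra sanity check (my addition), you can write and read back a test key
using the v2 API that this version of etcdctl defaults to:
[root@master ~]# etcdctl set /test hello   # prints the stored value
[root@master ~]# etcdctl get /test         # should print: hello
[root@master ~]# etcdctl rm /test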

3. Set up the apiserver - we will use this to communicate with ETCD and the
rest of the cluster.
[root@master ~]# tar xvf kubernetes-server-linux-amd64.tar.gz
[...]
[root@master ~]# cp kubernetes/server/bin/* /usr/local/bin/
[root@master ~]# # create the systemd file below
[root@master ~]# cat /etc/systemd/system/kube-apiserver.service
[Unit]
Description=Kube API Server
After=network.target

[Service]
Type=notify
ExecStart=/usr/local/bin/kube-apiserver --etcd-servers=http://localhost:2379 \
  --service-cluster-ip-range=10.0.0.0/16 \
  --bind-address=0.0.0.0 \
  --insecure-bind-address=0.0.0.0

[Install]
WantedBy=multi-user.target
[root@master ~]#
[root@master ~]# systemctl daemon-reload
[root@master ~]# systemctl enable --now kube-apiserver
[root@master ~]#                                                                                     

Verify that the api server is working. You must be able to reach it from any
node.
[root@master ~]# curl http://master:8080/api
{
  "kind": "APIVersions",
  "versions": [
    "v1"
  ],
  "serverAddressByClientCIDRs": [
    {
      "clientCIDR": "0.0.0.0/0",
      "serverAddress": "192.168.1.111:6443"
    }
  ]
}[root@master ~]#
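
The health endpoint is another quick check, and works from the nodes too (my
addition; a healthy kube-apiserver answers this with a plain "ok"):
[root@node1 ~]# curl http://master:8080/healthz
ok[root@node1 ~]#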

4. Set up the scheduler - this will take care of assigning pods to nodes.
[root@master ~]# cat /etc/systemd/system/kube-scheduler.service
[Unit]
Description=Kube Scheduler
After=network.target

[Service]
Type=simple
ExecStart=/usr/local/bin/kube-scheduler --master=http://localhost:8080

[Install]
WantedBy=multi-user.target
[root@master ~]#
[root@master ~]# systemctl enable --now kube-scheduler
Created symlink from /etc/systemd/system/multi-user.target.wants/kube-scheduler.service to /etc/systemd/system/kube-scheduler.service.
[root@master ~]#

5. Set up the controller-manager - this runs the control loops behind
deployments, replica sets, and the like; without it, our deployments would
never create any pods.
[root@master ~]# # create the systemd file below
[root@master ~]# cat /etc/systemd/system/kube-controller-manager.service
[Unit]
Description=Kube Controller Manager
After=network.target

[Service]
Type=simple
ExecStart=/usr/local/bin/kube-controller-manager --master=http://localhost:8080

[Install]
WantedBy=multi-user.target
[root@master ~]# systemctl daemon-reload
[root@master ~]# systemctl enable --now kube-controller-manager
Created symlink from /etc/systemd/system/multi-user.target.wants/kube-controller-manager.service to /etc/systemd/system/kube-controller-manager.service.
[root@master ~]#
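
All control plane components are now up. As a quick verification (my addition;
kubectl talks to http://localhost:8080 by default when no kubeconfig is set):
[root@master ~]# kubectl get componentstatuses   # scheduler, controller-manager, and etcd should all report Healthy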

6. Create network settings for flannel.
[root@master ~]# etcdctl mkdir /centos/network
[root@master ~]# etcdctl mk /centos/network/config "{ \"Network\": \"172.30.0.0/16\", \"SubnetLen\": 24, \"Backend\": { \"Type\": \"vxlan\" } }"
{ "Network": "172.30.0.0/16", "SubnetLen": 24, "Backend": { "Type": "vxlan" } }
[root@master ~]# 
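
Reading the key back confirms the config is in place for the nodes' flanneld
to pick up (my addition):
[root@master ~]# etcdctl get /centos/network/config
{ "Network": "172.30.0.0/16", "SubnetLen": 24, "Backend": { "Type": "vxlan" } }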

Set up the nodes
================

Steps below must be executed on all nodes unless specified otherwise.

1. Download and unpack the kubernetes tarball.
[root@node1 ~]# wget https://dl.k8s.io/v1.9.2/kubernetes-server-linux-amd64.tar.gz
[...]
[root@node1 ~]#
[root@node1 ~]# tar xvf kubernetes-server-linux-amd64.tar.gz
kubernetes/
kubernetes/LICENSES
[...]
kubernetes/kubernetes-src.tar.gz
[root@node1 ~]#
[root@node1 ~]# cp -v kubernetes/server/bin/* /usr/local/bin/
‘kubernetes/server/bin/apiextensions-apiserver’ -> ‘/usr/local/bin/apiextensions-apiserver’
[...]
[root@node1 ~]#

2. Install and enable docker, but don't start it yet. Docker has to come up
after flanneld so it picks up the flannel network settings; the flanneld unit
takes care of the ordering.
[root@node1 ~]# yum install -y docker
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
 * base: mirror.pregi.net
 * extras: mirror.pregi.net
 * updates: mirror.pregi.net
Resolving Dependencies
--> Running transaction check
---> Package docker.x86_64 2:1.12.6-68.gitec8512b.el7.centos will be installed
[...]
Complete!
[root@node1 ~]# systemctl enable docker
Created symlink from /etc/systemd/system/multi-user.target.wants/docker.service to /usr/lib/systemd/system/docker.service.
[root@node1 ~]#

3. Set up the kubelet - this is what registers each node with the master; it
communicates with ETCD only indirectly, through the API server.
[root@node1 ~]# mkdir -p /etc/kubernetes/manifests
[root@node1 ~]# mkdir -p /var/lib/kubelet
[root@node1 ~]# cat << EOF > /etc/kubernetes/kubeconfig
apiVersion: v1
kind: Config
clusters:
- name: centos
  cluster:
    server: http://master:8080
users:
- name: kubelet
contexts:
- context:
    cluster: centos
    user: kubelet
  name: kubelet-context
current-context: kubelet-context
EOF
[root@node1 ~]# # create the systemd file below
[root@node1 ~]# cat /etc/systemd/system/kubelet.service
[Unit]
Description=Kubelet
After=network.target

[Service]
Type=simple
ExecStart=/usr/local/bin/kubelet --kubeconfig /etc/kubernetes/kubeconfig \
  --require-kubeconfig \
  --pod-manifest-path /etc/kubernetes/manifests \
  --cgroup-driver=systemd \
  --kubelet-cgroups=/systemd/system.slice \
  --runtime-cgroups=/systemd/system.slice

[Install]
WantedBy=multi-user.target
[root@node1 ~]# systemctl daemon-reload                                                                                                                                                                 
[root@node1 ~]# systemctl enable --now kubelet                                                                                                                                                           
Created symlink from /etc/systemd/system/multi-user.target.wants/kubelet.service to /etc/systemd/system/kubelet.service.                                                                             
[root@node1 ~]#
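
If the unit fails to start, the kubelet logs are the first place to look
(generic troubleshooting, not from the original session):
[root@node1 ~]# journalctl -u kubelet -f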

Verify that the nodes were successfully registered by going to the master and
getting the node list. All nodes must appear.
[root@master ~]# kubectl get nodes -o wide
NAME      STATUS    ROLES     AGE       VERSION   EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION          CONTAINER-RUNTIME
node1     Ready     <none>    2m        v1.9.2    <none>        CentOS Linux 7 (Core)   3.10.0-514.el7.x86_64   docker://1.12.6
node2     Ready     <none>    2m        v1.9.2    <none>        CentOS Linux 7 (Core)   3.10.0-514.el7.x86_64   docker://1.12.6
[root@master ~]#

4. Set up kube-proxy - this allows us to expose our pods outside the cluster
via a service.
[root@node1 ~]# # create the systemd file below
[root@node1 ~]# cat /etc/systemd/system/kube-proxy.service
[Unit]
Description=Kube proxy
After=network.target

[Service]
Type=simple
ExecStart=/usr/local/bin/kube-proxy --master=http://master:8080

[Install]
WantedBy=multi-user.target
[root@node1 ~]# systemctl daemon-reload
[root@node1 ~]# systemctl enable --now kube-proxy
Created symlink from /etc/systemd/system/multi-user.target.wants/kube-proxy.service to /etc/systemd/system/kube-proxy.service.
[root@node1 ~]#
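
kube-proxy's default iptables mode maintains a KUBE-SERVICES chain in the nat
table, so one way to confirm it is doing its job (my addition):
[root@node1 ~]# iptables -t nat -L KUBE-SERVICES -n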

5. Install flannel - this will allow inter-pod communication between the nodes.
[root@node1 ~]# yum install -y flannel
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
 * base: mirror.pregi.net
 * extras: mirror.pregi.net
 * updates: mirror.pregi.net
Resolving Dependencies
--> Running transaction check
---> Package flannel.x86_64 0:0.7.1-2.el7 will be installed
[...]
Complete!
[root@node1 ~]#
[root@node1 ~]# cp /etc/sysconfig/flanneld /etc/sysconfig/flanneld.orig
[root@node1 ~]# cat << EOF > /etc/sysconfig/flanneld
FLANNEL_ETCD_ENDPOINTS="http://master:2379"
FLANNEL_ETCD_PREFIX="/centos/network"
EOF
[root@node1 ~]#
[root@node1 ~]# systemctl enable --now flanneld
Created symlink from /etc/systemd/system/multi-user.target.wants/flanneld.service to /usr/lib/systemd/system/flanneld.service.
Created symlink from /etc/systemd/system/docker.service.requires/flanneld.service to /usr/lib/systemd/system/flanneld.service.
[root@node1 ~]#
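
Once flanneld is up, it leases a per-node subnet, writes it to
/run/flannel/subnet.env (which docker reads at startup), and creates a
flannel.1 VXLAN interface. Both are worth checking (my addition; the subnet
will differ on each node):
[root@node1 ~]# cat /run/flannel/subnet.env
[root@node1 ~]# ip -4 addr show flannel.1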

6. Manually pull the pause image from gcr.io. The kubelet uses this image for
every pod's infrastructure (sandbox) container, and for some reason pods won't
start on this setup unless the image is pulled manually first.
[root@node1 ~]# docker pull gcr.io/google_containers/pause-amd64:3.0
Trying to pull repository gcr.io/google_containers/pause-amd64 ...
3.0: Pulling from gcr.io/google_containers/pause-amd64
a3ed95caeb02: Pull complete
f11233434377: Pull complete
Digest: sha256:163ac025575b775d1c0f9bf0bdd0f086883171eb475b5068e7defa4ca9e76516
[root@node1 ~]#
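
To confirm the image is now cached locally (my addition):
[root@node1 ~]# docker images gcr.io/google_containers/pause-amd64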

Let's test our cluster
======================

1. Let's check if our nodes are ready.
[root@master ~]# kubectl get nodes -o wide
NAME      STATUS    ROLES     AGE       VERSION   EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION          CONTAINER-RUNTIME
node1     Ready     <none>    1h        v1.9.2    <none>        CentOS Linux 7 (Core)   3.10.0-514.el7.x86_64   docker://1.12.6
node2     Ready     <none>    1h        v1.9.2    <none>        CentOS Linux 7 (Core)   3.10.0-514.el7.x86_64   docker://1.12.6
[root@master ~]#

2. Create a simple nginx deployment. Containers must be created and pods must
have IPs assigned. The pod IPs must be reachable from any node. If you want
them to be reachable from the master as well, install flanneld on the master
too.
[root@master ~]# kubectl run nginx --image=nginx --replicas=2
deployment "nginx" created
[root@master ~]#
[root@master ~]# kubectl get pods -o wide
NAME                   READY     STATUS    RESTARTS   AGE       IP            NODE
nginx-8586cf59-gknr7   1/1       Running   0          46m       172.30.2.2    node2
nginx-8586cf59-pzjh4   1/1       Running   0          46m       172.30.19.2   node1
[root@master ~]#
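
Using the pod IPs from the output above, a quick reachability test from either
node looks like this (my addition; substitute the IPs your pods actually got):
[root@node1 ~]# curl -sI http://172.30.2.2 | head -1   # expect HTTP/1.1 200 OK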

3. Expose the deployment via a NodePort service. You should then be able to
access the static web page on any node.
[root@master ~]# kubectl expose deploy/nginx --port=80 --type=NodePort
service "nginx" exposed
[root@master ~]#
[root@master ~]# kubectl get svc
NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)        AGE
kubernetes   ClusterIP   10.0.0.1     <none>        443/TCP        2h
nginx        NodePort    10.0.174.8   <none>        80:32122/TCP   5s
[root@master ~]#
[root@master ~]# curl http://node2:32122
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>
[root@master ~]#
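
The node port (32122 above) is randomly assigned from the node port range; to
grab it programmatically, for example in a script, something like this works
(my addition):
[root@master ~]# kubectl get svc nginx -o jsonpath='{.spec.ports[0].nodePort}'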

With all these manual steps involved, I have also prepared an automated way to
provision all of this using Ansible. You may visit my playbook hosted on
GitLab.
