Tuesday, March 30, 2021

Single-node K8s Cluster from Scratch (CentOS)

Docker is the new way of deploying your applications. Since more and more people are
using it, orchestration tools have been published to manage it. Some of them are
Docker Swarm, Cattle, and Kubernetes.

In this post, we will set up a CentOS 7.3 VM running in VirtualBox that will act as
both the master and the node, to understand the basics of how each piece of Kubernetes
works. In production scenarios, your nodes must be separate machines from your
master. We will do a single-node setup for simplicity.

We will do this from scratch without using the official RPM installers from
Kubernetes, using the latest tarball versions at the time of this writing:
v1.9.2 for Kubernetes and v3.2.6 for etcd.

This assumes that you have a basic understanding of how Docker containers work.

Before proceeding, here is a summary of versions used in this tutorial.

Host OS: Ubuntu 17.04 (Zesty)
 |_ Virtualization: VirtualBox 5.2.4 r119785 (Qt5.7.1)
      |_ Virtual Machine OS: CentOS Linux release 7.3.1611 (Core)
           |_ Kubernetes: 1.9.2
           |_ ETCD: 3.2.6
           |_ Docker: 1.12.6

1. First, download the Kubernetes and etcd tarballs. Kubernetes is the
orchestration tool and etcd is the key-value store database that holds all the
state of the cluster.

[root@vm01 ~]# wget https://dl.k8s.io/v1.9.2/kubernetes-server-linux-amd64.tar.gz
<output truncated>
[root@vm01 ~]# wget https://github.com/coreos/etcd/releases/download/v3.2.6/etcd-v3.2.6-linux-amd64.tar.gz
<output truncated>
[root@vm01 ~]#

2. Disable swap on your machine. Kubernetes doesn't want swap for performance
reasons - swap is slow, and containers running purely in memory are faster and
more predictable.

[root@vm01 ~]# swapoff -a
[root@vm01 ~]# cp /etc/fstab /etc/fstab.orig
[root@vm01 ~]# sed -i 's/.*swap.*//g' /etc/fstab
[root@vm01 ~]# cat /etc/fstab

#
# /etc/fstab
# Created by anaconda on Fri Aug  4 14:30:07 2017
#
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
/dev/mapper/cl-root     /                       xfs     defaults        0 0
UUID=bfc05119-977b-4f3f-a260-5e548d5cdd88 /boot                   xfs     defaults        0 0

[root@vm01 ~]#
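If you want to double-check that swap is really off before moving on, both of the
commands below (an optional sanity check, not part of the original steps) should
report no active swap:

free -h       # the Swap line should show 0 total / 0 used
swapon -s     # should print nothing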

3. Since our VM will also act as the node, we must install Docker on it. In a
multi-node setup, installing Docker on the master is not required.

[root@vm01 ~]# yum install -y docker
<output truncated>
[root@vm01 ~]# systemctl enable --now docker
Created symlink from /etc/systemd/system/multi-user.target.wants/docker.service to /usr/lib/systemd/system/docker.service.
[root@vm01 ~]#
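Optionally, you can confirm that the Docker daemon is actually up before continuing:

docker info      # should print daemon details without errors
docker version   # both client and server versions should be listed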

4. Unpack the Kubernetes tarball. It includes all the binaries needed to run
and manage the whole cluster.

[root@vm01 ~]# tar xvf kubernetes-server-linux-amd64.tar.gz
<output truncated>
[root@vm01 ~]# cp kubernetes/server/bin/* /usr/local/bin/

5. Unpack the etcd tarball. etcd is where we will store all information about our
cluster, including configuration and network settings.

[root@vm01 ~]# tar xvf etcd-v3.2.6-linux-amd64.tar.gz
<output truncated>
[root@vm01 ~]# cp etcd-v3.2.6-linux-amd64/etcd* /usr/local/bin/
[root@vm01 ~]#
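A quick way to confirm the binaries landed on the PATH and are the expected versions
(optional):

kubectl version --client   # should report v1.9.2
etcd --version             # should report 3.2.6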

6. Start "Etcd". As per my understanding on etcd's help page, the listen client
urls is the url where etcd will listen for client traffic while the advertise
url is the one that will be exposed to clients. Actually I'm still confused
with this hehe. You can verify the status of etcd using `etcdctl` command.

[root@vm01 ~]# etcd --listen-client-urls http://0.0.0.0:2379 --advertise-client-urls http://localhost:2379 &> /tmp/etcd.log &
[root@vm01 ~]# etcdctl cluster-health
member 8e9e05c52164694d is healthy: got healthy result from http://192.168.1.111:2379
cluster is healthy
[root@vm01 ~]#
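Since etcd is just a key-value store, you can also poke it directly with etcdctl
(the v2 API is the default for this release) to convince yourself it is working.
The /test key below is only an example:

etcdctl set /test "hello"   # write an arbitrary test key
etcdctl get /test           # should print: hello
etcdctl rm /test            # clean up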

7. In order to talk to etcd, we need to launch the "Kubernetes Apiserver". This
is the only component that reads from and writes to the database. Wait a few
seconds for the apiserver to start up, or tail the logs. Once started, you can
use curl to verify that you can talk to the API. The default insecure API port
is 8080.

[root@vm01 ~]# kube-apiserver --etcd-servers=http://localhost:2379 --service-cluster-ip-range=10.0.0.0/16 --bind-address=0.0.0.0 --insecure-bind-address=0.0.0.0 &> /tmp/apiserver.log &
[root@vm01 ~]# curl http://localhost:8080/api/
{
  "kind": "APIVersions",
  "versions": [
    "v1"
  ],
  "serverAddressByClientCIDRs": [
    {
      "clientCIDR": "0.0.0.0/0",
      "serverAddress": "192.168.1.111:6443"
    }
  ]
}[root@vm01 ~]#
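Beyond the bare /api/ endpoint, you can browse actual resources over the insecure
port. For example (optional), listing namespaces and nodes at this point should
look like this:

curl http://localhost:8080/api/v1/namespaces   # should include default, kube-system, kube-public
curl http://localhost:8080/api/v1/nodes        # empty until a kubelet registers in the next step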

8. Launch "Kubelet" and create manifest directory. This process is the one that
interacts with docker daemon and api server. It also watches for pods to
create by looking inside the manifest location. Prior to that, we also need to
specify a kubelet configuration in yaml format to point kubelet to the url of
our api server.

[root@vm01 ~]# mkdir /tmp/manifests
[root@vm01 ~]# mkdir -p /var/lib/kubelet
[root@vm01 ~]# cat << EOF > /var/lib/kubelet/kubeconfig
apiVersion: v1
kind: Config
clusters:
- name: local
  cluster:
    server: http://localhost:8080
users:
- name: kubelet
contexts:
- context:
    cluster: local
    user: kubelet
  name: kubelet-context
current-context: kubelet-context
EOF
[root@vm01 ~]# kubelet --kubeconfig /var/lib/kubelet/kubeconfig --require-kubeconfig --pod-manifest-path /tmp/manifests --cgroup-driver=systemd --kubelet-cgroups=/systemd/system.slice --runtime-cgroups=/etc/systemd/system.slice &> /tmp/kubelet.log &
[root@vm01 ~]#
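After a few seconds the kubelet should register itself with the apiserver. Since
kubectl falls back to http://localhost:8080 when no kubeconfig is set, checking the
node list is a simple way to confirm this (optional):

kubectl get nodes   # vm01 should appear and go Ready after a short while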

9. Now that we have a kubelet running, we can create a manifest and let kubelet
pick it up and create a pod.

[root@vm01 ~]# cat << EOF > /tmp/manifests/nginx-pod.yml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  containers:
  - name: nginx
    image: nginx
    ports:
    - containerPort: 80
EOF
[root@vm01 ~]#
[root@vm01 ~]#
[root@vm01 ~]# kubectl get pods
NAME         READY     STATUS              RESTARTS   AGE
nginx-vm01   0/1       ContainerCreating   0          3m
[root@vm01 ~]#

There is a minor issue that I encountered here - the pod stayed in that status,
and I found in kubelet.log that there was an issue pulling the pause image from
Google's registry (gcr.io).

[root@vm01 ~]# tail -f /tmp/kubelet.log
I0120 20:42:20.668948    3813 kubelet.go:1767] Starting kubelet main sync loop.
I0120 20:42:20.668994    3813 kubelet.go:1778] skipping pod synchronization - [container runtime is down PLEG is not healthy: pleg was last seen active 2562047h47m16.854775807s ago; threshold is 3m0s]
I0120 20:42:20.669202    3813 server.go:129] Starting to listen on 0.0.0.0:10250
I0120 20:42:20.670390    3813 server.go:299] Adding debug handlers to kubelet server.
F0120 20:42:20.671438    3813 server.go:141] listen tcp 0.0.0.0:10250: bind: address already in use
E0120 20:42:33.715870    3643 kube_docker_client.go:341] Cancel pulling image "gcr.io/google_containers/pause-amd64:3.0" because of no progress for 1m0s, latest progress: "Trying to pull repository gcr.io/google_containers/pause-amd64 ... "
E0120 20:42:33.717152    3643 remote_runtime.go:92] RunPodSandbox from runtime service failed: rpc error: code = Unknown desc = failed pulling image "gcr.io/google_containers/pause-amd64:3.0": context canceled
E0120 20:42:33.717288    3643 kuberuntime_sandbox.go:54] CreatePodSandbox for pod "nginx-vm01_default(9960993506b2e8bf46ae1eb7b1da0edf)" failed: rpc error: code = Unknown desc = failed pulling image "gcr.io/google_containers/pause-amd64:3.0": context canceled
E0120 20:42:33.717333    3643 kuberuntime_manager.go:647] createPodSandbox for pod "nginx-vm01_default(9960993506b2e8bf46ae1eb7b1da0edf)" failed: rpc error: code = Unknown desc = failed pulling image "gcr.io/google_containers/pause-amd64:3.0": context canceled
E0120 20:42:33.718919    3643 pod_workers.go:186] Error syncing pod 9960993506b2e8bf46ae1eb7b1da0edf ("nginx-vm01_default(9960993506b2e8bf46ae1eb7b1da0edf)"), skipping: failed to "CreatePodSandbox" for "nginx-vm01_default(9960993506b2e8bf46ae1eb7b1da0edf)" with CreatePodSandboxError: "CreatePodSandbox for pod \"nginx-vm01_default(9960993506b2e8bf46ae1eb7b1da0edf)\" failed: rpc error: code = Unknown desc = failed pulling image \"gcr.io/google_containers/pause-amd64:3.0\": context canceled"
[root@vm01 ~]#

So I tried pulling it myself and it worked fine.

[root@vm01 ~]# docker pull gcr.io/google_containers/pause-amd64:3.0
Trying to pull repository gcr.io/google_containers/pause-amd64 ...
3.0: Pulling from gcr.io/google_containers/pause-amd64
a3ed95caeb02: Pull complete
f11233434377: Pull complete
Digest: sha256:163ac025575b775d1c0f9bf0bdd0f086883171eb475b5068e7defa4ca9e76516
[root@vm01 ~]#

Then after a few minutes, the error in the logs no longer appeared and the pod was
successfully created. So as a workaround, we need to pre-pull that image. I
haven't seen any similar issue reported on the internet.

[root@vm01 ~]# docker ps
CONTAINER ID        IMAGE                                                                                     COMMAND                  CREATED             STATUS              PORTS               NAMES
f0439466926c        docker.io/nginx@sha256:285b49d42c703fdf257d1e2422765c4ba9d3e37768d6ea83d7fe2043dad6e63d   "nginx -g 'daemon off"   41 minutes ago      Up 41 minutes                           k8s_nginx_nginx-vm01_default_9960993506b2e8bf46ae1eb7b1da0edf_0
e053d46f5b68        gcr.io/google_containers/pause-amd64:3.0                                                  "/pause"                 46 minutes ago      Up 46 minutes                           k8s_POD_nginx-vm01_default_9960993506b2e8bf46ae1eb7b1da0edf_0
[root@vm01 ~]#
[root@vm01 ~]# kubectl get pods -o wide
NAME         READY     STATUS    RESTARTS   AGE       IP           NODE
nginx-vm01   1/1       Running   0          58m       172.17.0.2   vm01
[root@vm01 ~]#

You can notice that there are 2 running containers created from the manifest we
dropped. The first one is nginx itself and the other is the "pause" container.
The "pause" container holds the pod's network namespace, and therefore its IP
address. It is the infrastructure container that is created first when a pod
starts.
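One optional way to see this relationship is to inspect the nginx container's
network mode; it should point at the pause container's ID rather than a regular
bridge network (container IDs taken from the `docker ps` output above, yours will
differ):

docker inspect -f '{{.HostConfig.NetworkMode}}' f0439466926c   # expect: container:e053d46f5b68...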

To verify that the created pod is working, you should be able to fetch its content
from inside the node/master using the pod IP.

[root@vm01 ~]# curl http://172.17.0.2
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>
[root@vm01 ~]#

10. Start "Kubernetes Scheduler". It is responsible for assigning pods to nodes.

[root@vm01 ~]# kube-scheduler --master=http://localhost:8080 &> /tmp/kube-scheduler.log &
[root@vm01 ~]#
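As an optional check, `kubectl get componentstatuses` gives a rough health summary
of the control plane pieces at this point (the controller manager will still show
as unhealthy until we start it in the next step):

kubectl get componentstatuses   # etcd-0 and the scheduler should report Healthy here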

10. Start "Kubernetes Controller Manager". It is responsible for managing
"Replica Sets" and "Replication Controllers". This is also required so we can
create deployments. Deployments are the rules that defined how to start pods and
how many replicas needs to be started. If deployment is created and you deleted
some replicas, those will be recreated based from deployment's rules.

[root@vm01 ~]# kube-controller-manager --master=http://localhost:8080 &> /tmp/kube-controller-manager.log &
[root@vm01 ~]# cat << EOF > nginx-deployment.yml
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 3
  template:
    metadata:
      labels:
        run: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
EOF
[root@vm01 ~]#
[root@vm01 ~]# kubectl create -f nginx-deployment.yml
deployment "nginx" created
[root@vm01 ~]#
[root@vm01 ~]# kubectl get deployments
NAME      DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
nginx     3         3         3            3           2m
[root@vm01 ~]#

At this point, we have 3 nginx pods that were created by our deployment and 1
nginx pod that was created via the manifest we dropped.

[root@vm01 ~]# kubectl get pods -o wide
NAME                     READY     STATUS    RESTARTS   AGE       IP           NODE
nginx-7587c6fdb6-7c2xk   1/1       Running   0          5m        172.17.0.5   vm01
nginx-7587c6fdb6-d467f   1/1       Running   0          5m        172.17.0.3   vm01
nginx-7587c6fdb6-ls2f4   1/1       Running   0          5m        172.17.0.4   vm01
nginx-vm01               1/1       Running   0          1h        172.17.0.2   vm01
[root@vm01 ~]#

The correct way of producing pods is via deployments, because that provides the
self-healing mechanism of Kubernetes. We only created a pod manually for
demonstration purposes. If we delete the "nginx-vm01" pod, it will not be
recovered, whereas the "nginx-XXXX" pods will be recreated if deleted, as the
quick test below shows.
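To see the self-healing in action, delete one of the deployment-managed pods and
watch a replacement appear (the pod name is taken from the output above; yours
will differ):

kubectl delete pod nginx-7587c6fdb6-7c2xk   # pick any deployment-managed pod
kubectl get pods                            # a new nginx-XXXX pod replaces the deleted one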

11. Start "Kubernetes Proxy". This enables us to create a "Kubernetes Service"
which will allow our pods to be accesible outside the cluster. Once the proxy
is started, let's create a simple service from a yaml file.

[root@vm01 ~]# kube-proxy --master=http://localhost:8080 &> /tmp/kube-proxy.log &
[root@vm01 ~]# cat << EOF > nginx-svc.yml
apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    run: nginx
spec:
  type: NodePort
  ports:
  - name: http
    port: 80
    nodePort: 30073
  selector:
    run: nginx
EOF
[root@vm01 ~]# kubectl create -f nginx-svc.yml
service "nginx" created
[root@vm01 ~]#

We can now see the nginx service mapping port 80 (on the container) to port
30073 (on the node).

[root@vm01 ~]# kubectl get services
NAME         TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)        AGE
kubernetes   ClusterIP   10.0.0.1       <none>        443/TCP        1h
nginx        NodePort    10.0.207.211   <none>        80:30073/TCP   2s
[root@vm01 ~]#

We should now be able to access the pod from outside the cluster.
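For example, from the host machine (outside the VM), the same nginx welcome page
should be reachable through the node's IP and the NodePort. 192.168.1.111 is the
VM address that appeared in the earlier output; substitute your own, assuming the
VM is reachable from your host:

curl http://192.168.1.111:30073   # should return the "Welcome to nginx!" page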



Notice that we didn't create systemd unit files for the services, for simplicity.
In future posts, I will include systemd unit files so the services can run on
boot.
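For reference, a unit file for etcd could look roughly like the sketch below
(untested in this post; the flags simply mirror the etcd command we ran by hand,
and the file path is only a suggestion). The other components would follow the
same pattern.

# /etc/systemd/system/etcd.service (sketch only)
[Unit]
Description=etcd key-value store
After=network.target

[Service]
ExecStart=/usr/local/bin/etcd \
  --listen-client-urls http://0.0.0.0:2379 \
  --advertise-client-urls http://localhost:2379
Restart=on-failure

[Install]
WantedBy=multi-user.target

After a `systemctl daemon-reload`, it could then be enabled with
`systemctl enable --now etcd`.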

So that wraps up our very simple setup. In future posts, we will see how
networking in Kubernetes comes into play when multiple nodes talk to each other.
That requires an SDN (Software Defined Network) solution such as Flannel or
Weave, plugged in through CNI.
