Intro to Kubernetes: What is a Pod?

Have you heard the buzz about Kubernetes in the software engineering scene? If you are anything like me, you’ve certainly heard of it, and know that engineers use it to deploy scalable applications. But much beyond that, you haven’t been able to find out. And it’s no wonder why: most of the documentation and courses covering Kubernetes are so laden with technical jargon that they become inaccessible. Plus, it is hard to get a cluster set up to try things on while learning. I was in the same boat until I finally decided it was time to take a deep dive and learn Kubernetes. So let’s dive into the question: in Kubernetes, what is a Pod?


Set Up a Local Environment

In order to get started, we need a cluster to try things out on. There are many ways to create one, but I chose a distribution called K3s. K3s is a certified Kubernetes distribution specialized for running on resource-restricted hardware. It can be used to create a single-node cluster, and is optimized for ARM, meaning it can run on a Raspberry Pi. This was perfect for me, since I run a bunch of home lab sites on a Raspberry Pi with a terabyte of disk space attached. So, I decided to move all my home lab sites from Docker Compose to K3s in order to learn.

Setting up K3s is really simple. Start by running the installer script:

curl -sfL https://get.k3s.io | sh -

This will do a few things:

  • Configure K3s as a service, ensuring it always starts up again after a host reboot or failure.
  • Install and configure utility scripts such as kubectl.
  • Generate a kubeconfig file at /etc/rancher/k3s/k3s.yaml, and ensure the kubectl installed in the previous step automatically uses it.

Once you have run this script, that is all you should have to do. If you wish to add more nodes to your cluster, or want to configure an alternate datastore to the K3s default of SQLite, see the official K3s install instructions.
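For reference, here is a sketch of what joining a second node looks like, based on the standard K3s agent install pattern. The `<server_ip>` and `<token>` values are placeholders; the join token lives on the server node at /var/lib/rancher/k3s/server/node-token.

```shell
# Run on the NEW node you want to join (not the server).
# K3S_URL tells the installer to run in agent mode and where the server is;
# K3S_TOKEN authenticates the new node to the cluster.
curl -sfL https://get.k3s.io | K3S_URL=https://<server_ip>:6443 K3S_TOKEN=<token> sh -
```

After the script finishes, the new node should appear in the output of kubectl get nodes on the server.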

To test that our setup works, go ahead and run kubectl get nodes. This will return data about all the different devices (or nodes) that make up your cluster. You should see output like this, with one entry for each node:

NAME          STATUS   ROLES                  AGE   VERSION
<node_name>   Ready    control-plane,master   1m    v1.33.6+k3s1

Now that we have our cluster set up, we can start experimenting with running some services on Kubernetes.

What is a Pod?

In Kubernetes, a pod is the smallest unit of computing that you can deploy. Each pod holds one or more tightly coupled containers, and offers shared storage and network resources to those containers. In Docker, you use Compose files or commands to create individual containers. In Kubernetes, you use commands or YAML manifests to create pods, and each pod can contain one or more containers.
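To make the "commands" half of that concrete, here is a quick hedged sketch of creating a pod imperatively, with no manifest at all. It uses the same nginx image we use throughout this guide:

```shell
# Create a single-container pod directly from the command line.
kubectl run nginx --image=nginx:1.14.2 --port=80

# Inspect and clean it up like any other resource.
kubectl get pods
kubectl delete pod nginx
```

Imperative creation is handy for quick experiments, but manifests are what you will use in practice, so let's look at those next.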

Let’s take a look at a simple pod manifest to get a better idea of how pods are actually configured.

apiVersion: v1
kind: Pod
metadata:
  name: nginx
  namespace: namespace-1
spec:
  containers:
  - name: nginx
    image: nginx:1.14.2
    volumeMounts:
      - name: volume-web
        mountPath: /usr/share/nginx/html/index.html # with subPath set below, only this file is replaced, not the whole directory
        subPath: index.html
    ports:
      - containerPort: 80
  volumes:  #add the configmap as a volume to this pod
    - name: volume-web
      configMap:
        name: cm-index-html

This manifest creates a pod named nginx in the namespace-1 namespace. That pod contains a single container, an nginx:1.14.2 container listening on port 80 of the pod. Then, it mounts a custom index.html file to the base web directory using the ConfigMap cm-index-html. This makes a web server that serves a custom webpage when the base directory is hit with a request. To use this manifest, we first need to do two things: create the namespace-1 namespace, and create the cm-index-html ConfigMap.

To create the namespace, simply run this command:

kubectl create namespace namespace-1

Once the namespace is created, create the ConfigMap by running:

kubectl create configmap cm-index-html \
  --namespace namespace-1 \
  --from-literal=index.html='<html><body><h1>Hello from ConfigMap!</h1></body></html>'

Note that we specified that the ConfigMap should be created in the namespace-1 namespace. That is critically important. Finally, apply the web server manifest by navigating to the folder containing your manifest and running:

kubectl apply -f manifest.yaml

Replace manifest.yaml with the name of the file your pod manifest is stored in. If all goes well, your pod should be created and run successfully. Verify that with kubectl get pods -n namespace-1. To test that the ConfigMap was mounted correctly, either send a curl request to the pod on port 80, or port-forward the pod to its host:

kubectl port-forward pod/nginx 8080:80 -n namespace-1

Visit http://localhost:8080 (kubectl port-forward binds to localhost by default), and if all went well, you will be greeted by the custom HTML content from your ConfigMap!

Within a pod, there are other fields you can set, such as resource requests, health probes, and more. See the official Kubernetes documentation on Pods to learn more.
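To give a flavor of those extra fields, here is a hedged sketch of the same nginx pod with resource requests/limits and health probes added. The pod name nginx-probed and the specific numbers are illustrative assumptions, not values from the original manifest:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-probed        # illustrative name
  namespace: namespace-1
spec:
  containers:
  - name: nginx
    image: nginx:1.14.2
    ports:
      - containerPort: 80
    resources:
      requests:             # scheduler reserves at least this much
        cpu: "100m"
        memory: "128Mi"
      limits:               # container is throttled/killed beyond this
        cpu: "250m"
        memory: "256Mi"
    livenessProbe:          # restart the container if this check starts failing
      httpGet:
        path: /
        port: 80
      initialDelaySeconds: 5
      periodSeconds: 10
    readinessProbe:         # only route traffic to the pod once this passes
      httpGet:
        path: /
        port: 80
      periodSeconds: 5
```

Probes like these are what let Kubernetes notice and repair unhealthy containers automatically, which ties directly into the next section.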

Use Deployments, Not Pods

So now we have answered the “What is a Pod” question. But some problems appear when we try to use bare pods to host production applications. For one thing, pods die from time to time, and on their own they have no way to heal themselves or restart when they go bad. If our application pods go down, they will not come back up by themselves, which could lead to widespread service outages. Additionally, pods are hard to scale and load-balance on their own, since each individual pod requires manual configuration. Releasing updates to apps running on bare pods likewise requires a lot of manual work, and precludes the possibility of rolling updates.

That is where Deployments come in. A Deployment is a Kubernetes resource that manages the provisioning of pods on your behalf. It lets the engineer declaratively specify and configure the pods needed to run their application. Using a Deployment, you declare a desired state, and the Deployment automatically creates the pods needed to reach that state. Want to release an update, or create more replicas of a particular pod? No problem! Just update your Deployment manifest, apply the changes, and Kubernetes will get straight to work matching your desired state with zero downtime!

Let’s take a look at a quick little demo to see what a deployment might look like. Here is a deployment manifest based on the pod we created before:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  namespace: namespace-1
  labels:
    app: nginx1
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx1
  template:
    metadata:
      labels:
        app: nginx1
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        volumeMounts:
          - name: volume-web
            mountPath: /usr/share/nginx/html/index.html # with subPath set below, only this file is replaced, not the whole directory
            subPath: index.html
        ports:
          - containerPort: 80
        resources:
          requests:
            cpu: "100m"
            memory: "128Mi"
          limits:
            cpu: "400m"
            memory: "448Mi"
      volumes:
        - name: volume-web
          configMap:
            name: cm-index-html

In our Deployment, we essentially embed a pod template, and add fields around it that help us manage deployment-wide settings, such as the number of replicas. Apply it using kubectl apply -f manifest.yaml. After that, check your namespace-1 namespace. You should see two nginx pods coming online. Once the pods enter the ready state, you can test them just like we did above.

Let’s test the self-healing feature of deployments. Delete one of the pods using kubectl delete pod <pod_id_here> -n namespace-1. Wait a moment, then get all pods from that namespace again. You should see that the deployment automatically created a new pod to replace the one you deleted.

We can also test the rolling updates feature. If you want to do that, change the image tag in the deployment to a different version of nginx. Apply the changes, and then watch the pods in the namespace. You will see the old pods gradually being taken offline, and replaced automatically with the new pods at the latest version.
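The rolling-update test above can also be driven entirely from the command line. This is a sketch; nginx:1.16.1 is just an example newer tag, not a version the original text prescribes:

```shell
# Change the image on the running Deployment, triggering a rolling update.
kubectl set image deployment/nginx nginx=nginx:1.16.1 -n namespace-1

# Watch the rollout progress until all replicas are on the new version.
kubectl rollout status deployment/nginx -n namespace-1

# If the new version misbehaves, roll back to the previous revision.
kubectl rollout undo deployment/nginx -n namespace-1
```

Note that kubectl set image only changes the live object; if you manage the Deployment from a manifest file, update the file too so the two don't drift apart.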

Services and Persistent Volumes

In addition to Deployments, running production applications on Kubernetes requires a few more logical units. Services handle network interfacing, while Persistent Volumes (PVs) and Persistent Volume Claims (PVCs) handle persistent file storage. Let’s take a look at each one, and how to use them to host our applications!

Services

A Service is a resource that provides a single network interface to funnel traffic to and from multiple different pods running on one or more different cluster nodes. One particular benefit of services is that they decouple the network interface from the application layer. This helps Kubernetes implement self-healing, load balancing and rolling updates. They are commonly used for tasks like mapping HTTP traffic to a deployment of multiple web server pods for load balancing. Let’s take a look at how to use a service to expose our nginx deployment from above to the outside world.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  namespace: namespace-1
  labels:
    app: nginx1
spec:
  replicas: 5
  selector:
    matchLabels:
      app: nginx1
  template:
    metadata:
      labels:
        app: nginx1
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
          - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: nginx
  namespace: namespace-1
spec:
  type: NodePort
  selector:
    app: nginx1
  ports:
    - port: 80        #port the service listens on
      targetPort: 80    #port the service forwards traffic to on the pod container
      name: http

In this manifest, there are two resources defined: a deployment, which manages five nginx pods (to demonstrate the load-balancing feature of services), and a service. Looking at the service, the first thing you will notice in the spec is the type field. Every service has a type. The most common type of service is called a ClusterIP.

A ClusterIP service is a service with an IP, hostname, and ports that is only accessible inside the cluster. Traffic can be mapped from the outside to a ClusterIP service using an externally configured load balancer, or, in development, using a port-forwarding command.

The type of service we are going to look at, though, is called a NodePort service. A NodePort service is similar to a ClusterIP, but with an important additional feature: a high-numbered port (typically in the range 30000-32767) is selected and exposed on each node in the cluster. Traffic sent to this port on any of the nodes is then mapped directly to the service, which handles it just like a ClusterIP would. This gives us a quick, easy way of exposing our containerized applications to the outside world without configuring load balancers or external services.

Next, you will see a selector. This field determines which pods the service directs traffic to. In our example above, the service will load balance traffic between all pods with the app: nginx1 label set. This label is set on our pods in the deployment.

Finally, there is the ports section. The port field specifies which port the service listens on; traffic should be directed to the service’s IP/hostname on this port. The targetPort field tells the service which port on the selected pods to send the traffic it receives to. Since we are proxying to nginx, we should send traffic to port 80. Finally, this port mapping is given a name via the name field.
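As an aside, you don't have to let Kubernetes pick the node port for you. A sketch of the same Service with the port pinned explicitly (30080 is an arbitrary choice within the allowed range, not a value from the original manifest):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx
  namespace: namespace-1
spec:
  type: NodePort
  selector:
    app: nginx1
  ports:
    - port: 80          # port the service listens on
      targetPort: 80    # port on the pod container
      nodePort: 30080   # must fall in the node port range (default 30000-32767)
      name: http
```

Pinning the port is convenient for home labs where you want a stable address; omitting nodePort lets Kubernetes assign one automatically, as in the manifest above.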

Apply this manifest with kubectl apply -f. Once you are sure all the pods are running and the service has been created, we can send a request to test the deployment and service. But where do we send it? We need the port that Kubernetes assigned to the NodePort service. To find it, run kubectl get svc -n namespace-1, which returns information about the service we just created. If all goes well, you will see output like this:

NAME    TYPE       CLUSTER-IP     EXTERNAL-IP   PORT(S)        AGE
nginx   NodePort   10.43.27.160   <none>        80:30458/TCP   50m

Look under the PORT(S) header. You will see two ports separated by a :. The first is the port defined on the service. The second is the port assigned on each node to map traffic to the service. This is the port we should send our test request to. Run the following to test:

curl <node_ip>:<node_port>

If everything is running properly, you should receive the default nginx page as a response. If you got that response, then nice job, you have successfully configured a Kubernetes service!
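Reading the port off the PORT(S) column by eye works, but it can also be scripted. This sketch parses the sample kubectl output shown above; in practice you would capture the real line from kubectl get svc:

```shell
# Sample line from `kubectl get svc -n namespace-1` (copied from above).
SVC_LINE='nginx   NodePort   10.43.27.160   <none>        80:30458/TCP   50m'

# Field 5 is PORT(S), e.g. "80:30458/TCP"; split it on ":" and "/"
# so that p[1]=service port, p[2]=node port.
NODE_PORT=$(echo "$SVC_LINE" | awk '{split($5, p, "[:/]"); print p[2]}')
echo "$NODE_PORT"   # 30458
```

On a live cluster, kubectl can also report the port directly with kubectl get svc nginx -n namespace-1 -o jsonpath='{.spec.ports[0].nodePort}'.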

PVs and PVCs

Persistent Volumes (PVs) and Persistent Volume Claims (PVCs) are the building blocks of persistent file storage in Kubernetes. Normally, pods are meant to be stateless. That means that when they are created, they get a fresh filesystem based on the container image, and when they are destroyed, all their files vanish without a trace. This principle is useful across the board for application development and hosting. But production apps need persistent, stateful data that survives the death and healing of deployments. Otherwise, every time one of our pods died, we could lose our entire database of user accounts, or all files uploaded to our app could disappear.

PVs and PVCs elegantly solve this problem by providing a durable interface to a variety of storage mediums that can easily plug in to new pods. That way, pods maintain their stateless property, while still having a way to ensure data survives across pod replacements. These work very similarly to volumes in Docker. Let’s take a look at how we can use a persistent volume to overwrite the whole web directory in our nginx deployment.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  namespace: namespace-1
  labels:
    app: nginx1
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx1
  template:
    metadata:
      labels:
        app: nginx1
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        volumeMounts:
          - name: volume-web
            mountPath: /usr/share/nginx/html
        ports:
          - containerPort: 80
        resources:
          requests:
            cpu: "100m"
            memory: "128Mi"
          limits:
            cpu: "400m"
            memory: "448Mi"
      volumes:
        - name: volume-web
          persistentVolumeClaim:
            claimName: pvc-index-html
---
# The Persistent Volume Manifest
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-index-html
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-path
  hostPath:
    path: /home/<username>/pv
    type: DirectoryOrCreate
---
# The Persistent Volume Claim Manifest
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-index-html
  namespace: namespace-1
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: local-path
  resources:
    requests:
      storage: 1Gi
  volumeName: pv-index-html

This deployment is a modified version of the ConfigMap manifest, which uses a PVC to overwrite the entire web directory of the nginx pod, instead of just the index.html file. First, you will see that the Deployment manifest is almost exactly the same as it was with the ConfigMap, the only exception being the entry in the volumes section. Instead of defining a ConfigMap there, we define a persistentVolumeClaim, passing in the name of the PVC we create later in the manifest.

Further down, we have the PV definition. The PV defines a fixed chunk of disk space that can be claimed by PVCs to store their data. As you can see, we specify storage capacity, access modes, and other settings to fully define how pods can interact with the PV. Most critically, the spec->hostPath->path key holds a path on the host node, pointing to where the files in the PV are actually stored.

Finally, there is the PVC definition. This entity serves as the bridge between a pod/deployment and a PV, allowing the pod/deployment to store data on the PV. You will see that much of the configuration supplied to the PV definition is repeated when creating a PVC. The key field in the PVC, though, is spec->volumeName, which names the PV this claim binds to; the PVC’s own name is then what the deployment manifest references (via claimName) to determine where persistent files should be stored.

This manifest creates an nginx pod with a persistent volume mapped to the html directory so we can see how it works. To test, begin by creating a folder at /home/<username>/pv, since that is the path we designated for persistent files in the PV manifest. Then, in that folder, place two files: one called index.html, with content like this:

<html>
    <body>
        <h1>Hello from Persistent Volume Claim!</h1>
    </body>
</html>

And one called page.html, with content like this:

<html>
    <body>
        <h1>Hello from Page!</h1>
    </body>
</html>
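The setup above can be done in one short shell sketch. It defaults the target folder to $HOME/pv, which assumes you are running as the <username> referenced in the PV manifest; adjust PV_DIR if your path differs:

```shell
# Create the hostPath directory the PV points at, and both test pages.
PV_DIR="${PV_DIR:-$HOME/pv}"
mkdir -p "$PV_DIR"

cat > "$PV_DIR/index.html" <<'EOF'
<html><body><h1>Hello from Persistent Volume Claim!</h1></body></html>
EOF

cat > "$PV_DIR/page.html" <<'EOF'
<html><body><h1>Hello from Page!</h1></body></html>
EOF

ls "$PV_DIR"
```

Because the PV uses type: DirectoryOrCreate, Kubernetes would create an empty directory for you if it were missing, but it would not create the HTML files, so this step still matters.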

Now, let’s apply this deployment, and see if everything works.

Once you have applied this manifest, and the deployment is up and running, get the name of the pod created by the deployment, and run this command:

kubectl port-forward -n namespace-1 pod/<pod_name_here> 8080:80

This will map port 80 of the pod to port 8080 of the host, perfect for testing. Then, visit http://localhost:8080 in a browser or with a curl request (kubectl port-forward binds to localhost by default). If all goes well, you should get the contents of the index.html file in the response:

curl localhost:8080
<html><body><h1>Hello from Persistent Volume Claim!</h1></body></html>

Looks like it works! Now, let’s make sure the whole public web directory is mapped correctly by making a request for page.html. Visit http://localhost:8080/page.html. The response should look like this:

curl localhost:8080/page.html
<html><body><h1>Hello from Page!</h1></body></html>

If you got matching responses, congratulations! You have correctly configured a deployment featuring file persistence using PVs and PVCs!
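As a final sanity check, you can confirm the data really is persistent by killing the pod and querying its replacement. This is a hedged sketch; the pod names are placeholders you would read from kubectl get pods:

```shell
# Delete the running pod; the Deployment will create a replacement.
kubectl delete pod <pod_name_here> -n namespace-1

# Wait for the fresh pod to reach the Ready state, then note its name.
kubectl get pods -n namespace-1

# Port-forward the NEW pod and request the page again.
kubectl port-forward -n namespace-1 pod/<new_pod_name_here> 8080:80 &
curl localhost:8080
```

Because the files live on the PV rather than in the pod's own filesystem, the replacement pod serves the exact same index.html as the one you deleted.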

Conclusion

Kubernetes is a critical skill in the tool belt of any software developer. It allows you to build scalability, self-healing, and containerization directly into your hosting setup without making significant changes to the application layer. But there is a lot of jargon, and many new concepts need to be understood before you can make effective use of its strengths. This guide gave you an overview of those concepts, along with working examples to get started with. If you were able to follow and understand the examples given here, you have a strong basis to continue learning and using Kubernetes to host your containerized applications. I hope this guide has made Kubernetes a little less intimidating for you. Best of luck!
