Docker and Kubernetes Container Virtualization - An Introduction to Kubernetes

Visualizing Kubernetes


Visualizing how all of Kubernetes' moving parts work together to deliver applications is half the battle. There's quite a stack of infrastructure layers and it's easy to get lost, so I'll begin from the outside and work my way in. First, though, keep in mind that Kubernetes production deployments run on multiple host computers. Things will make more sense in that context.

Our first stop, then, will be the cluster. A cluster will generally consist of multiple servers - physical or virtual - linked and centrally controlled so that their resources can be efficiently coordinated in the service of your application. Or, in other words, a cluster is a bunch of computers. That's it.

Each of those computers is known as a Node. They're what we called Hosts in Docker. To allow the cluster to function, each Kubernetes node has management services installed. Those include a kubelet, which is an agent configured to negotiate between containers on the node and the cluster management layer. There will also be a container runtime that manages container behavior. Historically, that runtime has actually been Docker, but there's now a shift underway towards runtimes like containerd, which plug in to Kubernetes through its native Container Runtime Interface (CRI), in place of the Docker runtime. There will also be a kube-proxy that'll handle networking - which, by the way, works through network ports.

At least one of these nodes will be a Control Plane - otherwise referred to as a Master or Head Node. Control Planes are the brains directing everything that happens within a cluster. Master nodes will include an apiserver for processing API requests, a cluster store whose job it is to persist the cluster state and configuration, a controller manager, and a scheduler.

There are two other important things to know about the control planes: One, for enterprise deployments, you'll normally have at least three master nodes to protect you against failure. Why three and not two? Because decisions about cluster state depend on agreement among a majority of voting nodes. If there's an even number of masters, you could theoretically end up with a tie vote, and that wouldn't end well.
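To make that voting arithmetic concrete, here's a minimal sketch in Python. It assumes the simple majority rule I just described and isn't taken from Kubernetes itself; the function names quorum and failures_tolerated are made up for illustration.

```python
# Majority-vote arithmetic for a group of voting control plane nodes.
# Illustrative only: these helper names are hypothetical, not a Kubernetes API.

def quorum(members: int) -> int:
    """Smallest number of votes that forms a strict majority of `members`."""
    return members // 2 + 1


def failures_tolerated(members: int) -> int:
    """How many members can fail while a majority can still be reached."""
    return members - quorum(members)


for n in range(1, 6):
    print(f"{n} voting node(s): quorum={quorum(n)}, can lose {failures_tolerated(n)}")
```

Under that rule, two voting nodes can't survive any more failures than one can, while three can survive the loss of one. A fourth node adds nothing over three, which is why odd numbers are the usual choice.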

The second thing to know is that master nodes - and, with some serverless options, worker nodes too - will be invisible when you run your Kubernetes workloads on managed services from cloud platforms like Amazon Web Services or Microsoft Azure. That's because services like Amazon's Elastic Kubernetes Service (EKS) are managed, which means you're only expected to define your application environment and upload your code. The underlying infrastructure is out of your control.

An important Kubernetes infrastructure element is the Pod. Each Kubernetes node will contain at least one pod, and each pod will host at least one container. The key thing to understand about pods is that they're actually mini self-contained compute environments whose job it is to serve the needs of the containers they host. That means that, besides its containers, a pod will contain all the tools those containers might need to communicate and share resources with each other, and to connect where necessary with the outside world. Authentication secrets and shared storage volumes can also be part of a pod.

The control plane will only talk with a pod, exchanging instructions and status information, and never directly with a container.
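If it helps to see the nesting laid out, here's a rough sketch of the outside-in hierarchy as plain Python data classes. This is just a mental model, not how Kubernetes stores anything; every class, node, pod, and image name below is a hypothetical placeholder.

```python
from dataclasses import dataclass, field


@dataclass
class Container:
    name: str
    image: str


@dataclass
class Pod:
    name: str
    containers: list = field(default_factory=list)  # at least one Container


@dataclass
class Node:
    name: str
    pods: list = field(default_factory=list)


@dataclass
class Cluster:
    nodes: list = field(default_factory=list)

    def containers(self):
        """Yield (node, pod, container) names, walking from the outside in."""
        for node in self.nodes:
            for pod in node.pods:
                for container in pod.containers:
                    yield node.name, pod.name, container.name


# Hypothetical two-node cluster; all names and images are placeholders.
cluster = Cluster(nodes=[
    Node("node-1", pods=[Pod("web", [Container("nginx", "nginx:1.25")])]),
    Node("node-2", pods=[Pod("api", [
        Container("app", "example/app:1.0"),
        Container("log-agent", "example/logger:1.0"),
    ])]),
])

for path in cluster.containers():
    print(" > ".join(path))
```

Notice that you only ever reach a container by going through its pod, which mirrors the way the control plane deals with pods rather than with containers directly.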

You don't normally need to worry about the precise shape all those elements will take. All you have to do is provide one or more YAML files with the declarative definition that describes what you want to happen, and Kubernetes will build out all the bits and pieces for you.
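Here's a small sketch of what one of those declarative definitions might contain. I'm building it as a Python dictionary and printing it as JSON, which Kubernetes accepts alongside YAML. The pod name, container names, images, and mount paths are all assumptions chosen for illustration: one pod, two containers, and a shared storage volume between them.

```python
import json

# A hypothetical Pod definition: two containers sharing one emptyDir volume.
# All names, images, and paths are placeholders for illustration.
pod_manifest = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "demo-pod", "labels": {"app": "demo"}},
    "spec": {
        # Storage that lives as long as the pod and is visible to both containers.
        "volumes": [{"name": "shared-data", "emptyDir": {}}],
        "containers": [
            {
                "name": "web",
                "image": "nginx:1.25",
                "ports": [{"containerPort": 80}],
                "volumeMounts": [
                    {"name": "shared-data", "mountPath": "/usr/share/nginx/html"}
                ],
            },
            {
                "name": "content-writer",
                "image": "busybox:1.36",
                "command": ["sh", "-c", "echo hello > /data/index.html && sleep 3600"],
                "volumeMounts": [{"name": "shared-data", "mountPath": "/data"}],
            },
        ],
    },
}

manifest_text = json.dumps(pod_manifest, indent=2)
print(manifest_text)
```

You could save that output to a file and hand it to kubectl apply -f. The point is that the file describes the end state you want, not the steps to get there.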

That's a decent map of the neighborhood, working from out to in. There are a few more pieces of the puzzle we'll need to complete the picture. Network connections, TLS termination, and routing are handled by the API object called an Ingress. An Ingress controller can be configured to better fit your specific needs.
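As a rough idea of what an Ingress definition looks like, here's a sketch built the same way: a Python dictionary printed as JSON. The hostname, the TLS secret name, and the backend service name are assumptions I've made up for the example, and you'd still need an Ingress controller running in the cluster for it to do anything.

```python
import json

# A hypothetical Ingress: terminate TLS for one host and route all paths
# to a single backend service. Host, secret, and service names are placeholders.
ingress_manifest = {
    "apiVersion": "networking.k8s.io/v1",
    "kind": "Ingress",
    "metadata": {"name": "demo-ingress"},
    "spec": {
        # TLS termination: the certificate is read from a Secret named here.
        "tls": [{"hosts": ["demo.example.com"], "secretName": "demo-tls"}],
        # Routing: requests for this host and path prefix go to the service.
        "rules": [
            {
                "host": "demo.example.com",
                "http": {
                    "paths": [
                        {
                            "path": "/",
                            "pathType": "Prefix",
                            "backend": {
                                "service": {
                                    "name": "demo-service",
                                    "port": {"number": 80},
                                }
                            },
                        }
                    ]
                },
            }
        ],
    },
}

ingress_text = json.dumps(ingress_manifest, indent=2)
print(ingress_text)
```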

kubectl is the most common command line tool available for Kubernetes management. You can use kubectl on Linux, Windows and macOS machines and you'll find its basic operation isn't all that different from the docker command we've been using. Of course, kubectl's sub-commands will get more complicated the deeper you go, since they need to accommodate all the Kubernetes-specific tools they serve.
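To give you a feel for the shape of those commands, here's a small Python sketch that assembles a couple of kubectl get invocations. The helper function kubectl_get is hypothetical, and the script only prints the commands rather than running them, so it doesn't assume you have a cluster available.

```python
# Build (but don't run) a few common kubectl commands.
# kubectl_get is a made-up helper for illustration, not part of any library.

def kubectl_get(resource, namespace=None, output=None):
    """Return the argument list for a `kubectl get` command."""
    argv = ["kubectl", "get", resource]
    if namespace:
        argv += ["--namespace", namespace]
    if output:
        argv += ["--output", output]
    return argv


commands = [
    kubectl_get("nodes"),
    kubectl_get("pods", namespace="default", output="wide"),
]

for argv in commands:
    print(" ".join(argv))
```

If you do have kubectl installed and pointed at a cluster, you could pass any of those lists to subprocess.run, or just type the printed commands into your terminal.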

The kubeadm tool does a different job from kubectl: rather than managing workloads, it bootstraps the cluster itself. kubeadm is recommended, for instance, as a fast path for getting a cluster up and running, including for locally testing Kubernetes workloads. kubeadm doesn't set up the Kubernetes dashboard, monitoring, or certain add-ons for you.

Then there's Helm. Helm is the Kubernetes version of the APT or YUM package managers, or PIP for Python. Helm lets you identify a repository that hosts packages - or "charts" - and then install them locally. Helm handles dependencies, updates, and rollbacks for you.

Finally, I did mention the Kubernetes dashboard a bit earlier. So here it is. Getting it running involves nothing more than running the enable dashboard command. But if, like me, you've got Kubernetes running on a remote and headless machine, opening the interface in your browser can be tricky. I went with an SSH tunnel and a generated token. Fortunately, there's plenty of documentation available online for whatever approach you choose.

The dashboard is rich in status reporting. You can drill down, for instance, into metrics describing a particular node. That'll show you all kinds of troubleshooting goodness about things like resource usage and pod allocation. Speaking of pods, the pod tab gives you similar access to individual pods.

Here you get visibility into fairly deep configuration settings. Feel free to poke around on your own infrastructure.