Kubernetes: A Beginner's Guide to Container Orchestration

#kubernetes #guide #container #docker #microservices #minikube

Kubernetes, often abbreviated as K8s, is an open-source platform designed to automate the deployment, scaling, and operation of application containers. It has become the de facto standard for container orchestration, providing a robust and scalable framework to manage your applications. This guide walks you through the basics of Kubernetes, explaining key terminology and introducing the core building blocks you need to get started.

But before that, let's go back to a time not so long ago, when we relied on monolithic web applications: massive codebases that grew with every new feature until they became slow and unwieldy. Today, more developers, architects, and DevOps engineers favor microservices over monoliths. This typically involves splitting the monolith into at least two applications: a front-end app and a back-end app (the API). After deciding on microservices, the next question is:

Where should these microservices run to ensure stability and ease of management and deployment?

The short answer: use Docker, which runs applications in containers.

What is a container?

Containers effectively virtualize the host operating system (or kernel) and isolate an application’s dependencies from other containers running on the same machine. Before containers, if you had multiple applications deployed on the same virtual machine (VM), any change to a shared dependency could break one of them in unexpected ways, so the tendency was to have one application per virtual machine.

The solution of one application per VM solved the isolation problem for conflicting dependencies, but it wasted a lot of resources (CPU and memory). This is because a VM runs not only your application but also a full operating system that needs resources too, so less would be available for your application to use.

Containers solve this problem with two pieces: a container engine and a container image, which is a package of an application and its dependencies. The container engine runs each application in its own container, isolating it from other applications running on the host machine. This removes the need to run a separate operating system for each application, allowing for higher resource utilization and lower costs.

What is Docker?

Docker is an open-source platform that enables developers to automate the deployment, scaling, and management of applications within lightweight, portable containers. Containers package an application with all its dependencies, libraries, and configuration files needed to run, ensuring consistency across multiple development, testing, and production environments. Docker simplifies the creation, testing, and deployment of applications by providing a consistent environment, regardless of the underlying infrastructure.

Key features of Docker include:

Portability: Containers can run on any system that supports Docker, ensuring that the application behaves the same way across different environments.
Isolation: Each container runs in its own isolated environment, which enhances security and prevents conflicts between applications.
Efficiency: Containers share the host system's OS kernel and resources, making them more efficient and lightweight compared to traditional virtual machines.
Scalability: Docker makes it easy to scale applications up or down by deploying multiple container instances as needed.
Version Control: Docker images can be versioned, enabling easy rollbacks and tracking of changes over time.
Ecosystem: Docker has a rich ecosystem of tools and services, including Docker Hub, a repository for sharing and managing container images, and Docker Compose, which simplifies the definition and management of multi-container applications.

Containers allow a developer to package up an application with all of the parts it needs, such as libraries and other dependencies, and ship it all out as one package.

But how and where should I launch containers?

There are many options for running containers: Amazon Elastic Container Service (on AWS Fargate or on EC2 instances with auto-scaling), cloud instances with predefined Docker images in Azure or Google Cloud (using templates, instance groups, and auto-scaling), your own server with Docker, or Kubernetes. Kubernetes, open-sourced by Google in 2014, is designed specifically for running containers at scale.

Then, what is Kubernetes?

Kubernetes is an open-source system for running and managing containers, automating deployments, scaling, creating and configuring ingresses, and deploying stateless or stateful applications. You can set up a Kubernetes cluster by launching instances and installing Kubernetes on them. Then, obtain the cluster's API endpoint, configure kubectl (a tool for managing Kubernetes clusters), and your Kubernetes setup is ready.

What Kubernetes offers:

Automated Container Deployment: Simplifies deploying applications in containers.
Scaling: Automatically adjusts the number of running containers based on demand.
Management: Centralizes control of containerized applications.
Load Balancing: Distributes network traffic evenly across containers.
Self-Healing: Restarts failed containers and reschedules pods away from unresponsive nodes.
Secret Management: Securely stores and manages sensitive information.
Rollouts and Rollbacks: Manages application updates and rollbacks safely.
Support for Stateless Applications: Efficiently handles applications without persistent storage.
Support for Stateful Applications: Manages applications with persistent storage needs.

Kubernetes terminology

Pods

A Kubernetes pod is a group of one or more containers and is the smallest deployable unit managed by Kubernetes.

Containers in a pod share resources such as storage volumes, and all containers within a pod share the same network IP address. As a result, they can communicate with each other over localhost, and they appear as a single network endpoint to other pods and services in the cluster. This simplifies networking and ensures that containers in a pod can easily interact and share resources.

Single-container pods are common for simple applications, but for complex tasks requiring multiple processes, multi-container pods simplify deployment by sharing data volumes and resources. For instance, an image-processing service creating GIFs might have a pod with several containers: a primary container handling requests and sidecar containers managing background tasks and data cleanup, working together to optimize performance.
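
To make the multi-container idea more concrete, here is a minimal Python sketch that builds a pod manifest as a plain dictionary and prints it as JSON (kubectl accepts JSON manifests as well as YAML). The pod name, container names, image names, and mount path are hypothetical placeholders loosely based on the GIF example above; only the manifest field names come from the Kubernetes Pod API.

```python
import json


def gif_pod_manifest(name="gif-maker"):
    """Build a two-container pod manifest as a plain dict.

    All names, images, and paths are hypothetical placeholders.
    """
    # Both containers mount the same volume, so they see the same files.
    shared_mount = {"name": "scratch", "mountPath": "/work"}
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "labels": {"app": name}},
        "spec": {
            "containers": [
                {
                    # Primary container that handles requests.
                    "name": "web",
                    "image": "example.com/gif-web:1.0",
                    "ports": [{"containerPort": 8080}],
                    "volumeMounts": [shared_mount],
                },
                {
                    # Sidecar container that cleans up temporary files.
                    "name": "cleanup",
                    "image": "example.com/gif-cleanup:1.0",
                    "volumeMounts": [shared_mount],
                },
            ],
            # An emptyDir volume lives as long as the pod does.
            "volumes": [{"name": "scratch", "emptyDir": {}}],
        },
    }


manifest = gif_pod_manifest()
print(json.dumps(manifest, indent=2))
```

Saved to a file, output like this could be passed to kubectl apply -f, assuming the images actually existed. Both containers would then share the pod's IP address and the /work directory.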

Deployments

Kubernetes deployments define the scale at which you want to run your application by letting you set the details of how you would like pods replicated on your Kubernetes nodes. Deployments describe the number of desired identical pod replicas to run and the preferred update strategy used when updating the deployment. Kubernetes will track pod health, and will remove or add pods as needed to bring your application deployment to the desired state.
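
As a rough illustration, the Python sketch below builds a Deployment manifest with a replica count and a rolling update strategy. The application name, image, and the maxUnavailable/maxSurge values are arbitrary assumptions chosen for the example; the field names follow the apps/v1 Deployment API.

```python
import json


def deployment_manifest(name, image, replicas=3):
    """Build a Deployment manifest as a plain dict (hypothetical name and image)."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            # Desired number of identical pod replicas.
            "replicas": replicas,
            # The selector must match the labels on the pod template.
            "selector": {"matchLabels": labels},
            # Replace pods gradually instead of all at once.
            "strategy": {
                "type": "RollingUpdate",
                "rollingUpdate": {"maxUnavailable": 1, "maxSurge": 1},
            },
            # Template used to create each replica.
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


deployment = deployment_manifest("api", "example.com/api:1.0", replicas=3)
print(json.dumps(deployment, indent=2))
```

Changing replicas or the image and re-applying the manifest is how you would tell Kubernetes about a new desired state; Kubernetes then works out which pods to add, remove, or replace.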

Services

The lifetime of an individual pod cannot be relied upon: everything from its IP address to its very existence is prone to change. In the same vein, Kubernetes doesn’t treat its pods as unique, long-running instances; if a pod encounters an issue and dies, it’s Kubernetes’ job to replace it so that the application doesn’t experience any downtime.

A service is an abstraction over the pods and, essentially, the only interface that consumers of the application interact with. As pods are replaced, their internal names and IPs might change. A service exposes a single stable name or IP address mapped to pods whose underlying names and addresses are unreliable. A service ensures that, to its consumers, everything appears to be unchanged.
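
The idea can be sketched with a toy model in Python. This is not how Kubernetes implements services; it only shows a stable name that selects pods by label while the pods behind it change. The class names, IP addresses, and the round-robin choice are all assumptions made for the illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Pod:
    name: str
    ip: str
    labels: dict
    ready: bool = True


@dataclass
class Service:
    """Toy model of a service: a stable name plus a label selector."""
    name: str
    selector: dict
    _next: int = field(default=0, repr=False)

    def endpoints(self, pods):
        """Return the IPs of ready pods whose labels match the selector."""
        return [
            p.ip for p in pods
            if p.ready and all(p.labels.get(k) == v for k, v in self.selector.items())
        ]

    def route(self, pods):
        """Pick a backend IP round-robin; raise if no pod matches."""
        ips = self.endpoints(pods)
        if not ips:
            raise RuntimeError(f"no ready pods behind service {self.name!r}")
        ip = ips[self._next % len(ips)]
        self._next += 1
        return ip


svc = Service(name="api", selector={"app": "api"})
pods = [Pod("api-1", "10.0.0.4", {"app": "api"}), Pod("api-2", "10.0.0.5", {"app": "api"})]
first = svc.route(pods)

# api-1 dies and is replaced by api-3 with a new IP; callers still talk to "api".
pods = [Pod("api-2", "10.0.0.5", {"app": "api"}), Pod("api-3", "10.0.0.9", {"app": "api"})]
second = svc.route(pods)
print(first, second)
```

The caller never learns that api-1 disappeared. It keeps addressing the service, and the set of matching pods is recomputed on every request.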

Nodes

A Kubernetes node manages and runs pods: it’s the machine (whether virtualized or physical) that performs the given work. Just as pods collect individual containers that operate together, a node collects entire pods that function together. When you’re operating at scale, you want to be able to hand work over to a node that has spare capacity to take it.
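
Handing work to a node with room for it can be sketched as a two-step "filter, then score" decision. The Python below is a simplified illustration under assumed node names and capacities; the real scheduler applies many more filtering and scoring rules.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    cpu_free: float  # free CPU cores
    mem_free: int    # free memory in MiB


def pick_node(nodes, cpu_request, mem_request):
    """Filter nodes that can fit the pod, then choose the one with the most free CPU.

    Returns None when no node has enough room.
    """
    feasible = [
        n for n in nodes
        if n.cpu_free >= cpu_request and n.mem_free >= mem_request
    ]
    if not feasible:
        return None
    return max(feasible, key=lambda n: (n.cpu_free, n.mem_free))


def bind(node, cpu_request, mem_request):
    """Reserve the pod's requested resources on the chosen node."""
    node.cpu_free -= cpu_request
    node.mem_free -= mem_request


nodes = [Node("node-a", 0.5, 2048), Node("node-b", 2.0, 1024), Node("node-c", 1.5, 4096)]
chosen = pick_node(nodes, cpu_request=1.0, mem_request=2048)
print(chosen.name)
```

Here node-a lacks CPU and node-b lacks memory, so only node-c passes the filter step.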

Under the hood of Kubernetes

Master Node - The control plane for the whole Kubernetes cluster. The components of the master can be run on any node in the cluster. The key components are:

API server: The API server exposes a REST interface to the Kubernetes cluster. All operations against pods, services, and so forth, are executed programmatically by communicating with the endpoints provided by it.
etcd: A distributed key-value store that Kubernetes uses to share information about the overall state of a cluster. Additionally, nodes can refer to the global configuration data stored there to set themselves up whenever they are regenerated.
Scheduler: The scheduler is responsible for assigning work to the various nodes. It keeps watch over the resource capacity and ensures that a worker node’s performance is within an appropriate threshold.
Controller manager: The controller-manager is responsible for making sure that the shared state of the cluster is operating as expected. More accurately, the controller manager oversees various controllers which respond to events (e.g., if a node goes down).
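
The way a controller keeps actual state in line with desired state can be sketched as a small reconcile loop. The Python below is a toy model, not Kubernetes code: the class name, the web-N pod names, and the health flags are all assumptions made for the example.

```python
import itertools
from dataclasses import dataclass, field


@dataclass
class ReplicaController:
    """Toy controller that keeps `desired` healthy pods running."""
    desired: int
    pods: dict = field(default_factory=dict)  # pod name -> healthy?
    _ids: itertools.count = field(default_factory=lambda: itertools.count(1), repr=False)

    def reconcile(self):
        """One pass of the control loop: compare actual state with desired state."""
        actions = []
        # Remove pods that failed their health check.
        for name in [n for n, healthy in self.pods.items() if not healthy]:
            del self.pods[name]
            actions.append(f"delete {name}")
        # Start pods until the desired count is reached.
        while len(self.pods) < self.desired:
            name = f"web-{next(self._ids)}"
            self.pods[name] = True
            actions.append(f"create {name}")
        # Stop the most recently created pods if the desired count was lowered.
        while len(self.pods) > self.desired:
            name, _ = self.pods.popitem()
            actions.append(f"delete {name}")
        return actions


ctl = ReplicaController(desired=3)
first = ctl.reconcile()    # starts three pods

ctl.pods["web-2"] = False  # simulate a crash
second = ctl.reconcile()   # replaces the failed pod

ctl.desired = 2            # scale down
third = ctl.reconcile()    # removes one surplus pod
print(first, second, third)
```

Each call looks only at the current state and the desired state, so running it again after it has converged does nothing. That property is what lets a controller recover from events it did not anticipate.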

Worker nodes: The machines where the pods run, formerly also called minion nodes. Worker nodes contain all the necessary services to manage networking between the containers, communicate with the master node, and assign resources to the scheduled containers.

Docker: Runs on each worker node, downloads images, and starts containers.
Kubelet: The primary node agent. It monitors the state of the pods on its node to ensure that all their containers are running, and it sends a heartbeat message to the control plane every few seconds. If the control plane does not receive that message, the node is marked as unhealthy. It also communicates with the API server, getting information about services and writing details about newly created ones.
Kube-proxy: A network proxy and load balancer for services, running on each worker node. It is responsible for traffic routing: it takes traffic arriving at the node for a service and forwards it to the correct containers.
Kubectl: A CLI tool for the users to communicate with the Kubernetes API server.
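
The heartbeat mechanism described for the kubelet can be modelled in a few lines of Python. This is only a sketch: the NodeMonitor class is hypothetical, and the 40-second grace period is an assumed value picked for the example. Timestamps are passed in explicitly so the behaviour is easy to follow.

```python
class NodeMonitor:
    """Toy model of heartbeat tracking on the control plane side."""

    def __init__(self, grace_seconds=40.0):
        # How long a node may stay silent before it is considered unhealthy
        # (assumed value for this sketch).
        self.grace_seconds = grace_seconds
        self.last_seen = {}  # node name -> time of last heartbeat, in seconds

    def heartbeat(self, node, now):
        """Record that `node` reported in at time `now`."""
        self.last_seen[node] = now

    def unhealthy(self, now):
        """Return the nodes whose last heartbeat is older than the grace period."""
        return sorted(
            node for node, seen in self.last_seen.items()
            if now - seen > self.grace_seconds
        )


monitor = NodeMonitor(grace_seconds=40.0)
monitor.heartbeat("node-a", now=0)
monitor.heartbeat("node-b", now=0)
monitor.heartbeat("node-a", now=30)  # node-b has gone quiet

stale = monitor.unhealthy(now=50)
print(stale)
```

Once a node shows up in the unhealthy list, the control plane can stop scheduling new pods onto it and start replacements elsewhere.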

For a more extensive breakdown of how Kubernetes works, you can read this post from DigitalOcean or this page from Kubernetes itself.

FAQs

What is Kubernetes used for?

Kubernetes manages containerized applications deployed in the cloud. It restarts failed containers, shuts down unused ones, and automatically allocates resources like memory, storage, and CPU as needed.

How does Kubernetes work with Docker?

Kubernetes supports several container engines, including Docker. Docker containers efficiently distribute packaged applications, and Kubernetes coordinates and schedules these applications.

How do I use Kubernetes?

To try Kubernetes, install Minikube for a local testing environment. When ready for real deployments, use kubectl to manage your application with Kubernetes.

And yes, we have another post for getting started with Minikube here:

https://www.automatedtestingwithtuyen.com/post/minikube-one-of-the-best-local-kubernetes-clusters-for-learning-and-developing

Give it a look and continue your journey toward mastering container orchestration!

AUTOMATED
TESTING

With TUYEN
