Kubernetes random notes

Kubernetes is designed to abstract infrastructure, so your apps don’t care if they’re running on:

  • Bare-metal physical machines
  • VMs
  • Cloud-managed Kubernetes services (like GKE, EKS, AKS)

Kubernetes thinks in Pods and Nodes, not physical hosts vs virtual machines.

Core Concepts

πŸ–₯️ Cluster

  • A collection of machines (nodes) managed by Kubernetes.
  • Includes the control plane and worker nodes.

🧱 Node

  • A physical or virtual machine in the cluster.
  • Runs:
    • kubelet
    • container runtime (e.g. containerd)
    • kube-proxy for networking

πŸ“¦ Pod

  • The smallest deployable unit in Kubernetes.
  • Wraps one or more containers sharing:
    • Network namespace (same IP)
    • Storage volumes (if defined)

πŸ” ReplicaSet

  • Maintains a stable set of replica Pods.
  • Ensures a specified number of Pods are running at all times.

πŸš€ Deployment

  • Manages ReplicaSets and enables rolling updates, rollback, and scaling.
  • Declarative management of Pods.
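
The bullets above can be sketched as a minimal manifest; the names, labels, and image are placeholders:

```yaml
# Minimal Deployment sketch (name, labels, and image are placeholders)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3                # the Deployment creates a ReplicaSet that keeps 3 Pods running
  selector:
    matchLabels:
      app: web               # must match the Pod template labels below
  template:                  # Pod template stamped out by the ReplicaSet
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.25  # changing this field triggers a rolling update
```

Changing the Pod template (e.g. the image tag) makes the Deployment create a new ReplicaSet and gradually shift Pods over to it.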

πŸ”Ή Grouping, Configuration & Networking

πŸ—‚οΈ Namespace

  • Logical partitions in a cluster.
  • Useful for multi-tenancy and organizing resources (e.g. dev, staging, prod).

πŸ”§ ConfigMap

  • Stores non-sensitive config data (like app settings, file names, flags).

πŸ” Secret

  • Stores sensitive data (like passwords, tokens, keys).
  • Base64-encoded by default; note this is encoding, not encryption, so anyone with read access can decode the values.
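
A sketch of both objects side by side (all names and values here are made up):

```yaml
# ConfigMap for plain settings, Secret for credentials (names/values are placeholders)
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  LOG_LEVEL: "info"          # non-sensitive settings go here
---
apiVersion: v1
kind: Secret
metadata:
  name: app-secret
type: Opaque
stringData:                  # stringData accepts plain text; K8s stores it base64-encoded
  DB_PASSWORD: "changeme"
```

Both can be injected into Pods as environment variables or mounted as files.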

πŸ“‘ Service

  • A stable network abstraction to expose Pods.
  • Types:
    • ClusterIP (default, internal only)
    • NodePort (exposes on a node’s IP/port)
    • LoadBalancer (cloud providers)
    • Headless (DNS without a ClusterIP)
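
A minimal ClusterIP Service sketch (name, labels, and ports are illustrative):

```yaml
# ClusterIP Service sketch: routes cluster-internal traffic to Pods labeled app=web
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  type: ClusterIP            # default; swap for NodePort/LoadBalancer to expose externally
  selector:
    app: web                 # matches Pod labels, not Pod names
  ports:
    - port: 80               # port the Service listens on
      targetPort: 8080       # port the container listens on inside the Pod
```

The selector is what gives the Service its stability: Pods come and go, but anything matching the labels is picked up automatically.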

🌐 Ingress

  • Exposes HTTP/HTTPS traffic externally.
  • Allows routing based on domain or path.
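
A minimal Ingress sketch, assuming a Service named api already exists (the hostname and path are placeholders):

```yaml
# Ingress sketch: host/path-based routing to a backend Service
# (requires an Ingress controller, e.g. ingress-nginx, running in the cluster)
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web
spec:
  rules:
    - host: app.example.com        # placeholder hostname
      http:
        paths:
          - path: /api
            pathType: Prefix
            backend:
              service:
                name: api          # assumed Service name
                port:
                  number: 80
```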

πŸ”Ή Storage & Scheduling

πŸ“ Volume

  • Temporary or persistent storage for a Pod.
  • Example types: emptyDir, hostPath, persistentVolumeClaim, nfs, cloud storage.

πŸ“¦ PersistentVolume (PV) / PersistentVolumeClaim (PVC)

  • PV: A piece of storage in the cluster.
  • PVC: A request for storage by a user.
  • Decouples how storage is provisioned and how it’s consumed.
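
A PVC sketch showing the "request" side of that decoupling; how the cluster satisfies it (static PV or dynamic provisioning via a StorageClass) is not the consumer's concern:

```yaml
# PVC sketch: a request for storage (size and access mode are illustrative)
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data
spec:
  accessModes:
    - ReadWriteOnce              # mountable read-write by a single node
  resources:
    requests:
      storage: 10Gi
  # storageClassName: standard   # often set to select a specific provisioner
```

A Pod then references the claim by name in its volumes section, never the underlying PV directly.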

πŸ“… Scheduler

  • Decides which node a Pod should run on based on resources, affinity, taints, etc.

πŸ”ΉAdvanced Concepts

🧬 Custom Resource (CR)

  • A user-defined API object.
  • Lets you create types like ModelTrainingJob, DatabaseCluster, etc.

🧾 CustomResourceDefinition (CRD)

  • The schema used to define a new Custom Resource type.
  • Once defined, K8s treats your resource like built-in ones.
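
A sketch of a CRD for the hypothetical ModelTrainingJob type mentioned above (the group name and schema fields are invented for illustration):

```yaml
# CRD sketch: registers a new ModelTrainingJob resource type with the API server
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: modeltrainingjobs.ml.example.com   # must be <plural>.<group>
spec:
  group: ml.example.com                    # invented API group for illustration
  scope: Namespaced
  names:
    plural: modeltrainingjobs
    singular: modeltrainingjob
    kind: ModelTrainingJob
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:                   # validation schema for the CR's fields
          type: object
          properties:
            spec:
              type: object
              properties:
                epochs:
                  type: integer
```

Once applied, `kubectl get modeltrainingjobs` works just like it does for built-in types.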

🧠 Controller

  • Watches resources (like Pods, Deployments, or your CRs) and reconciles desired vs actual state.
  • Keeps the system in the correct state.
  • Runs a control loop: "observe β†’ compare β†’ act".

πŸ§‘β€πŸ”§ Operator

  • Combines:
    • CRDs (define your custom API)
    • Controllers (logic to manage CRs)
  • Encapsulates domain-specific logic for managing complex apps (e.g. PostgreSQL, Kafka).
  • Operators act like K8s-native automation scripts.

πŸ”Ή Ecosystem & Add-Ons

  • 🔄 Job / CronJob: for one-time or scheduled tasks
  • 🛡️ Role / ClusterRole: permissions for resources
  • 👤 ServiceAccount: identity for processes in Pods
  • 🌐 NetworkPolicy: rules about how Pods communicate
  • ☣️ Taints & Tolerations: control Pod placement
  • 🎯 Affinity / Anti-Affinity: influence Pod scheduling (e.g. spread across zones)
  • 🧵 InitContainers: run before the main container starts
  • 📡 DaemonSet: run a Pod on every node
  • 🌀 StatefulSet: like a Deployment but with stable identities (e.g. DBs)
  • 🧪 Sidecar Container: helper containers in a Pod (e.g. logging, proxies)
  • 📊 Metrics Server: collects CPU/mem usage for autoscaling
  • 📈 HorizontalPodAutoscaler: auto-scales Pods based on metrics
  • 🏗️ Helm: a package manager for Kubernetes apps
  • ⚙️ Kustomize: native Kubernetes YAML customization tool
  • 📦 Admission Controllers: intercept API requests (validating/mutating)
  • 🧪 etcd: the backing key-value store for Kubernetes cluster state

🧠 How These All Relate (Conceptual Diagram)

[Kubernetes Cluster]
β”‚
β”œβ”€β”€ Namespaces
β”‚   └── Groups resources logically
β”‚
β”œβ”€β”€ Core Resources
β”‚   β”œβ”€β”€ Pod ← Smallest compute unit
β”‚   β”œβ”€β”€ Deployment ← Manages ReplicaSets & updates
β”‚   β”œβ”€β”€ ReplicaSet ← Maintains # of Pods
β”‚   β”œβ”€β”€ Service ← Exposes Pods
β”‚   β”œβ”€β”€ Ingress ← Routes traffic to Services
β”‚   └── ConfigMap / Secret ← Inject config/env into Pods
β”‚
β”œβ”€β”€ Storage Resources
β”‚   └── Volume / PersistentVolumeClaim
β”‚
β”œβ”€β”€ Custom Resources (CRs)
β”‚   └── Defined by: CustomResourceDefinition (CRD)
β”‚       └── Watched by: Custom Controller (e.g. Operator)
β”‚
└── Control Plane
    β”œβ”€β”€ API Server ← Everything talks to it
    β”œβ”€β”€ Scheduler ← Assigns Pods to nodes
    └── Controller Manager ← Reconciliation loops

🐳 What is a Container Runtime?

  • A container runtime is the low-level software responsible for running containers on a host machine.
  • It is what Kubernetes uses to turn container images into running processes inside Pods: it handles image pulling, process isolation, resource limits, and lifecycle control.
  • In a Kubernetes context, it's what each Node's kubelet uses to create, start, stop, and delete containers.

What a container runtime like containerd does:

  • 🧱 Image handling: pulls container images (e.g. from Docker Hub or a private registry)
  • 🔨 Container lifecycle: creates, starts, stops, and deletes containers
  • 🧍 Namespace setup: sets up isolated process, network, and mount namespaces (Linux kernel features)
  • 🧠 Resource limits: applies cgroups (CPU/mem limits) to containers
  • 🔄 Networking hook: connects containers to virtual networks (via a CNI plugin)
  • 📦 Storage mounting: mounts volumes and file systems into the container
  • 📡 Communication: exposes an API for Kubernetes (the kubelet) to control containers

πŸ”§ Common Container Runtimes

  • containerd: fast, lightweight. Official CNCF project. Widely used (even by Docker)
  • Docker: originally the full stack (runtime + tools). No longer used directly by Kubernetes (dockershim was removed in v1.24); Docker itself runs containers via containerd
  • CRI-O: lightweight runtime built specifically for Kubernetes (implements the CRI)
  • runc: low-level OCI runtime used by containerd and Docker to actually launch containers
  • gVisor, Kata Containers: more secure/isolated runtimes (sandboxed)

🧩 How It Fits Into Kubernetes

[ Kubernetes Node ]
β”œβ”€β”€ kubelet
β”‚    └── talks to container runtime via CRI (Container Runtime Interface)
β”œβ”€β”€ containerd / CRI-O
β”‚    └── talks to runc to actually launch the container
└── container
     └── Runs your app inside isolated namespaces with limits and network rules

Kubernetes uses the CRI (Container Runtime Interface) to abstract away which runtime it uses β€” so you can plug in different runtimes (containerd, CRI-O, etc.).


πŸ’» What does a single physical node contain?

A physical machine may run:

  • One or more virtual machines (if using virtualization like KVM, VMware, etc.)
  • Or it might run Kubernetes directly (bare-metal setup)

Each Node in Kubernetes is typically either:

  • a VM (most common in cloud setups), or
  • a physical host (bare-metal Kubernetes)

On each Node, you will find:

  • OS + container runtime (like containerd)
  • kubelet (talks to control plane)
  • kube-proxy (networking rules)
  • Your scheduled Pods, each containing 1 or more containers

πŸ“¦ Do we need VMs if we have containers?

Not necessarily!

  • Cloud Kubernetes (e.g. EKS, GKE): yes (your nodes are VMs in EC2 or GCE)
  • Bare-metal Kubernetes: no (K8s runs directly on physical machines)
  • Local dev (e.g. kind/minikube): usually runs inside a local VM or Docker container

So: VMs and containers solve different problems:

  • VMs: Isolate full OS environments
  • Containers: Isolate application-level processes, much lighter

Kubernetes runs on VMs or physical nodesβ€”it doesn't care. What it cares about is abstracting compute capacity.


🧊 Can a container belong to different Pods?

πŸ›‘ No.

  • Each container belongs to exactly one Pod.
  • A Pod is the unit of deployment and resource allocation.
  • Containers in different Pods are totally independent (even if they run the same image).

🚫 Can one Pod span multiple nodes (machines)?

πŸ›‘ No.

  • A Pod is always scheduled onto exactly one Node.
  • All containers inside that Pod live on the same machine.
  • This is by design for tight coupling (e.g., sidecars, shared volumes, local communication).

So if you need to scale across nodes:

  • Use multiple Pods, not one big multi-node Pod.
  • Let a Service abstract across those Pods.

🧱 What's the real boundary then?

  • Physical machine: physical hardware
  • Virtual machine: a virtual resource carved out of physical hardware by a hypervisor
  • Node (in K8s): a real machine from Kubernetes' point of view, whether bare metal or a VM
  • Pod: purely logical; a unit of compute running on exactly one Node
  • Container: purely logical; lives inside a Pod
  • Namespace / Service / Deployment / VolumeClaim: purely logical

🧠 TL;DR Mental Model

[Physical Hardware]
  └── [Optional: VMs]
        └── [Kubernetes Node]
              └── [Pods]
                    └── [Containers]

βœ… Each Pod = runs on one Node only
βœ… Each Node = can run many Pods
βœ… Containers = belong to only one Pod
βœ… Physical vs virtual = blurred on purpose to keep K8s cloud-agnostic


🏒 What is Multi-Tenancy in Kubernetes?

Multi-tenancy = multiple teams, users, or applications sharing the same Kubernetes cluster, but:

  • Isolated from each other
  • With controlled access

Think of it like:

One office building with many companies β†’ they share the building (cluster) but each has their own space (namespace), keys (permissions), and policies.


🧩 Why use multi-tenancy?

  • Avoid running separate clusters for each team/app
  • Lower cost, centralized infrastructure
  • But still provide logical and secure isolation

πŸ—‚οΈ What are Namespaces?

Namespaces are virtual clusters within a Kubernetes cluster.

Each namespace:

  • Contains its own Pods, Deployments, Services, Secrets, ConfigMaps, etc.
  • Has its own quotas and policies
  • Can be locked down to specific users/teams

Think of it as:

β€œThis is the dev team’s sandbox.”
β€œThis is production. Only ops can touch it.”


πŸ” What is RBAC (Role-Based Access Control)?

RBAC = The way Kubernetes manages user permissions.

It defines who can do what in which namespace or cluster.

πŸ”‘ Core Components:

  • Role: defines permissions (e.g., "can read pods", "can create secrets") within a namespace
  • ClusterRole: same as Role, but cluster-scoped (or shared across namespaces)
  • RoleBinding: assigns a Role to a user/service account in a namespace
  • ClusterRoleBinding: same as RoleBinding, but for a ClusterRole (across namespaces)

πŸ“¦ Example:

Let’s say you have a dev team and prod team:

Namespaces:
- dev
- prod

Users:
- alice
- bob

Roles:
- dev-reader (can view stuff in dev)
- prod-admin (can manage everything in prod)

RoleBindings:
- dev-reader is bound to alice in dev
- prod-admin is bound to bob in prod

Result:

  • βœ… Alice can view Pods in dev
  • ❌ Alice cannot do anything in prod
  • βœ… Bob can manage everything in prod
  • ❌ Bob cannot access dev (unless separately bound)
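
The dev-reader half of the example can be sketched as a Role plus RoleBinding (the user name and verbs are illustrative):

```yaml
# Role + RoleBinding sketch matching "dev-reader is bound to alice in dev"
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: dev-reader
  namespace: dev
rules:
  - apiGroups: [""]              # "" = the core API group (pods, services, ...)
    resources: ["pods"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: dev-reader-alice
  namespace: dev
subjects:
  - kind: User
    name: alice                  # illustrative user
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: dev-reader
  apiGroup: rbac.authorization.k8s.io
```

Because the binding lives in the dev namespace, Alice's access stops at its boundary; nothing here grants her anything in prod.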

πŸ” Multi-Tenancy = Namespaces + RBAC

  • Namespaces: isolate workloads and configs
  • RBAC: controls who can access and operate on those workloads

Together they create logical and security boundaries for safe multi-tenancy.