GPU Kubernetes Homelab: Infrastructure-as-Code for AI Workloads
Published by O'Reilly Media, Inc.
Build a complete multinode cluster with GPU passthrough using OpenTofu and Talos Linux
What you’ll learn and how you can apply it
- Deploy Kubernetes infrastructure using infrastructure as code for reproducible multinode provisioning on a hypervisor platform
- Implement load balancing for services, dynamic DNS for home networks, automated TLS certificate management, and traffic routing
- Establish enterprise container workflows with a private registry featuring vulnerability scanning and RBAC, then deploy microservices with service discovery and ingress patterns
- Integrate GPU hardware into your cluster with scheduling constraints, deploy LLM inference services, and build applications that consume AI capabilities
- Implement comprehensive observability and modern development practices
Course description
Building a Kubernetes homelab is one of the most effective ways to master cloud native infrastructure. Unlike managed cloud services that abstract away complexity, a homelab requires you to understand every layer—from router configuration and 10-gigabit networking through Proxmox virtualization, Talos Kubernetes nodes, and GPU-accelerated AI workloads. This hands-on course guides you through the complete journey using infrastructure-as-code with OpenTofu, from initial hypervisor setup to running production-grade services like Harbor container registry, Ollama LLM inference, and comprehensive observability with Prometheus and Grafana.
In four hours, expert Jonathan Johnson will take you through the complete technology stack behind the homelab he uses to teach cloud native development at Trinity College. You’ll see how modern tools like Claude Code and Obsidian support infrastructure development, how OpenTofu makes complex deployments reproducible, and how technologies like Tailscale VPN, DuckDNS, MetalLB, and NGINX Ingress work together to provide secure external access. You’ll understand the architecture behind running Kubernetes on both Talos Linux VMs and specialized GPU worker nodes, deploying services like RabbitMQ and Harbor, and integrating everything from NFS persistent storage to Spring Boot microservices and AI model serving with Ollama.
This course is perfect for engineers who want to move beyond cloud-managed Kubernetes and gain deep infrastructure knowledge. Whether you’re preparing for Kubernetes certifications, building a platform for personal AI projects, or designing enterprise environments, you’ll leave with proven architectural patterns spanning the entire stack—from physical networking and virtualization through Kubernetes addons and AI workload orchestration.
This live event is for you because...
- You’re a DevOps engineer who wants to understand infrastructure beyond managed cloud services.
- You’re preparing for Kubernetes certifications and need hands-on practice environments.
- You’re interested in running AI/ML workloads locally and understanding GPU orchestration.
- You’re building platforms for personal projects or learning experiments.
- You’re an educator creating learning environments for students or workshop participants.
- You work in infrastructure and want to validate architectural decisions in a safe environment.
- You want control over your learning environment without cloud service limitations.
- You’re exploring the economics and practical considerations of self-hosted infrastructure.
Prerequisites
- Familiarity with core Kubernetes concepts like Pods, Deployments, Services
- Basic Linux command-line navigation and text editing skills
- An understanding of networking basics such as IP addressing, DNS, and ports
- A general awareness of containerization principles
Recommended preparation:
- No local installation required—all demonstrations will be screen-shared by the instructor
- Optional: Review the course GitHub repository to preview infrastructure-as-code patterns (link to come)
- Optional: If you’re planning your own homelab, consider your use cases, budget, and available space
- Explore infrastructure-as-code tools if you’re unfamiliar with them
Recommended follow-up:
- Read Kubernetes: Up and Running, second edition (book)
- Read Terraform: Up and Running, third edition (book)
- Read NVIDIA GPU Programming (forthcoming)
- Read Building Microservices (book)
- Take Kubernetes Fundamentals in 2 Weeks (live online course with Jonathan Johnson)
- Take Kubernetes Intermediate in 2 Weeks (live online course with Jonathan Johnson)
Schedule
The time frames are only estimates and may vary according to how the class is progressing.
Introduction and course overview (10 minutes)
- Presentation: Welcome and learning objectives; hardware to AI workloads; principles over exact specifications; why build a homelab
- Group discussion: Your experience with Kubernetes, homelabs, and GPUs
Hardware and network foundation (20 minutes)
- Presentation: Planning infrastructure; rig hardware selection (CPU, RAM, storage, GPU considerations); router configuration for homelab environments; 10G switch architecture and bandwidth planning; power and cooling for continuous operation; budget planning
- Group discussion: What’s your intended homelab budget range?
Proxmox VE as the foundation (20 minutes)
- Presentation: Proxmox advantages (web UI, VM, and LXC container support); when to use VMs versus LXC containers; why Kubernetes nodes typically run in VMs; resource allocation strategies across workloads; storage pools and ZFS considerations
- Demonstration: Tour of Proxmox web interface; VM and LXC container overview; network bridge configuration; storage pool organization; snapshot and backup capabilities
- Q&A
- Break
Development tools and infrastructure as code (15 minutes)
- Presentation: Modern infrastructure development workflow; Claude Code for AI-assisted infrastructure coding; Obsidian for documentation and architecture notes; OpenTofu/Terraform for infrastructure as code; why IaC matters (reproducibility, version control, collaboration); Git workflow for infrastructure changes
- Group discussion: Your experience with IaC tools
- Demonstration: Infrastructure-as-code project structure; OpenTofu module organization; Proxmox provider configuration; environment and variable management
Talos Linux for Kubernetes (10 minutes)
- Presentation: Talos Linux as a Kubernetes-first operating system; no SSH access (security by design); machine configuration instead of shell scripts; API-driven management; comparison with traditional Linux distributions; when you might need mixed-OS clusters (GPU compatibility)
Multinode cluster deployment (20 minutes)
- Demonstration: Deploying Talos Kubernetes cluster with OpenTofu; reviewing OpenTofu configuration for VMs; planning infrastructure changes; Proxmox creating Talos VMs; Talos bootstrap and kubeadm initialization; extracting kubeconfig credentials; verifying cluster health
- Presentation: Understanding the deployment sequence; Proxmox VM provisioning via OpenTofu; Talos machine configuration; Flannel CNI network setup; kubeadm control plane initialization; how Talos differs from a traditional cluster setup
Cluster validation and health checks (20 minutes)
- Demonstration: Verifying cluster operational status; node readiness checks; system component verification; network functionality testing; understanding pod and service networks
- Q&A
- Break
Load balancing with MetalLB (15 minutes)
- Presentation: Networking challenges in non-cloud Kubernetes; why cloud load balancers don’t work on bare metal; MetalLB architecture (Layer 2 versus BGP modes); IP address pool management; integration with Kubernetes services
- Demonstration: Deploying MetalLB via OpenTofu; configuring IP address pool (e.g., 192.168.50.200-210); verifying MetalLB controller and speaker pods; watching the LoadBalancer Service get the external IP; testing connectivity to assigned IPs
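As a rough illustration of the address pool mentioned above, the following Python sketch expands the shorthand range 192.168.50.200-210 and builds the two MetalLB resources that a Layer 2 setup typically needs. The pool name homelab-pool and the helper function names are hypothetical, and the course itself deploys MetalLB through OpenTofu rather than Python; this only shows the shape of the data involved.

```python
import ipaddress
import json


def expand_range(short_range: str) -> list[str]:
    """Expand a shorthand such as '192.168.50.200-210' into IPv4 addresses."""
    start_text, end_octet = short_range.split("-")
    start = ipaddress.IPv4Address(start_text)
    # Reuse the first three octets of the start address for the end address.
    prefix = start_text.rsplit(".", 1)[0]
    end = ipaddress.IPv4Address(f"{prefix}.{end_octet}")
    if end < start:
        raise ValueError("range end precedes range start")
    return [str(ipaddress.IPv4Address(n)) for n in range(int(start), int(end) + 1)]


def metallb_manifests(pool_name: str, short_range: str) -> list[dict]:
    """Build IPAddressPool and L2Advertisement objects as plain dicts."""
    addresses = expand_range(short_range)
    # MetalLB expects both ends of a range written out in full.
    full_range = f"{addresses[0]}-{addresses[-1]}"
    pool = {
        "apiVersion": "metallb.io/v1beta1",
        "kind": "IPAddressPool",
        "metadata": {"name": pool_name, "namespace": "metallb-system"},
        "spec": {"addresses": [full_range]},
    }
    advertisement = {
        "apiVersion": "metallb.io/v1beta1",
        "kind": "L2Advertisement",
        "metadata": {"name": f"{pool_name}-l2", "namespace": "metallb-system"},
        "spec": {"ipAddressPools": [pool_name]},
    }
    return [pool, advertisement]


if __name__ == "__main__":
    # "homelab-pool" is a hypothetical name chosen for this sketch.
    print(len(expand_range("192.168.50.200-210")), "addresses in the pool")
    print(json.dumps(metallb_manifests("homelab-pool", "192.168.50.200-210"), indent=2))
```

The example range covers 11 addresses, so at most 11 LoadBalancer Services could hold an external IP at once under this assumed pool.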
External access architecture (15 minutes)
- Presentation: Secure internet access patterns; DuckDNS for dynamic DNS in home networks; router port forwarding configuration (ports 80, 443, 51820); Tailscale VPN for secure administrative access; NGINX Ingress controller for HTTP/HTTPS routing; cert-manager with Let’s Encrypt for automatic TLS; VPN-only versus public service exposure decisions
- Demonstration: Deploying the complete ingress stack; deploying NGINX Ingress controller; configuring cert-manager with Let’s Encrypt; deploying the test application with an ingress route; watching automatic certificate issuance; accessing the application via HTTPS from the internet
Harbor container registry (10 minutes)
- Presentation: Harbor features (RBAC, vulnerability scanning, Helm charts); comparison with Docker Hub and other registries; VPN-only access pattern for security; integration with Kubernetes image pulls; storage backend considerations
- Demonstration: Harbor in action; accessing Harbor web interface (via VPN); navigating projects and repositories; Trivy vulnerability scanning results; push/pull workflow demonstration
GPU passthrough and Ubuntu worker node (20 minutes)
- Presentation: GPU integration challenges; Proxmox GPU passthrough configuration; Talos Linux GPU compatibility considerations; when mixed-OS clusters are needed (GPU workers); NVIDIA driver requirements and installation; kubeadm for joining non-Talos nodes; GPU resource advertising via device plug-ins; node affinity and taints for GPU workloads
- Demonstration: GPU-enabled worker node; showing GPU visibility in the system; verifying GPU resources in the Kubernetes cluster; reviewing node labels and taints; GPU scheduling constraints
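To make the scheduling constraints above more concrete, here is a minimal Python sketch of a Pod spec that combines a GPU resource limit, a node selector, and a toleration. The resource name nvidia.com/gpu is the one advertised by the NVIDIA device plug-in; the label key example.com/gpu, the taint key gpu, and the function name are assumptions made for this sketch and may differ from what the course uses.

```python
import json


def gpu_pod(name: str, image: str, gpu_count: int = 1) -> dict:
    """Build a Pod manifest that can only land on a tainted, labeled GPU node."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # Hypothetical label: steers the Pod toward the GPU worker.
            "nodeSelector": {"example.com/gpu": "true"},
            # Hypothetical taint: the GPU node repels Pods that lack this toleration.
            "tolerations": [
                {
                    "key": "gpu",
                    "operator": "Equal",
                    "value": "true",
                    "effect": "NoSchedule",
                }
            ],
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # GPUs are requested as an extended resource under limits.
                    "resources": {"limits": {"nvidia.com/gpu": gpu_count}},
                }
            ],
        },
    }


if __name__ == "__main__":
    print(json.dumps(gpu_pod("ollama", "ollama/ollama"), indent=2))
```

The node selector and the toleration do different jobs: the selector pulls GPU workloads onto the GPU node, while the taint and matching toleration keep everything else off it.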
- Q&A
- Break
Persistent storage with NFS (10 minutes)
- Presentation: Storage architecture for Kubernetes; why stateful applications need persistent volumes; LXC containers for infrastructure services (NFS server); NFS CSI driver for Kubernetes integration; StorageClass for dynamic provisioning; PersistentVolumeClaim patterns
- Demonstration: NFS storage infrastructure; showing the NFS server running in an LXC container; deploying NFS CSI driver via OpenTofu; creating a test PersistentVolumeClaim; verifying volume mount in the Pod; storage on the NFS server
Ollama for LLM inference (20 minutes)
- Presentation: Running LLMs on Kubernetes with GPUs; why Ollama for local model serving; GPU affinity with node selectors and taints; resource requests and limits for GPU pods; model storage with persistent volumes; service exposure for API access
- Demonstration: Deploying Ollama to the GPU node; reviewing Ollama Kubernetes manifests; deploying with GPU node affinity; verifying that the AI Pod is scheduled on the GPU worker; accessing Ollama API (port-forward or ingress); testing LLM inference request; observing GPU utilization during inference; model loading and response generation
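For a sense of what testing an inference request can look like, the sketch below sends a prompt to Ollama's /api/generate endpoint using only the Python standard library. It assumes the API is reachable on Ollama's default port 11434 (for example through a port-forward); the base URL and the model name llama3.2 are placeholders, not values taken from the course.

```python
import json
import urllib.request


def build_generate_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Prepare a non-streaming POST to Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(base_url: str, model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the request and return the generated text from the JSON reply."""
    request = build_generate_request(base_url, model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["response"]


if __name__ == "__main__":
    # Placeholder URL and model; requires a running Ollama instance with the model pulled.
    print(generate("http://localhost:11434", "llama3.2", "Why build a homelab?"))
```

The generous timeout is deliberate: the first request after a model is loaded into GPU memory can take noticeably longer than later ones.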
Microservice with AI integration (15 minutes)
- Presentation: Microservices calling LLM APIs; Spring Boot to Kubernetes deployment patterns; container image building and Harbor registry; service-to-service communication (Spring Boot → Ollama); Kubernetes DNS for service discovery; ingress routing for external access
- Demonstration: Deploying and testing an application; Spring Boot application code structure; deploying Spring Boot microservice to the cluster; configuring a service to call the Ollama API; accessing application via NGINX Ingress; submitting prompt through web interface; request flow (browser → Ingress → Spring Boot → Ollama → GPU); observing GPU activity during inference
Observability stack (15 minutes)
- Presentation: Complete monitoring architecture; Kubernetes Dashboard with Metrics API; Prometheus for metrics collection and storage; Grafana for visualization and alerting; NVIDIA DCGM Exporter for GPU metrics; custom dashboards for homelab monitoring
- Demonstration: Exploring observability tools; accessing Kubernetes Dashboard; navigating cluster resources and metrics; opening Grafana dashboards; viewing cluster resource usage (CPU, memory, network); GPU-specific dashboards (utilization, temperature, VRAM); Prometheus query examples; alerting configuration patterns
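The GPU dashboards listed above (utilization, temperature, VRAM) map onto metrics that the NVIDIA DCGM Exporter publishes. As a hedged sketch, the Python below pairs each with a PromQL expression and builds a URL for Prometheus's instant query endpoint, /api/v1/query. The aggregations and the base URL are illustrative choices, not the course's actual queries.

```python
import urllib.parse

# PromQL expressions over DCGM Exporter metrics; the aggregations are illustrative.
GPU_QUERIES = {
    "utilization_percent": "avg by (gpu) (DCGM_FI_DEV_GPU_UTIL)",
    "temperature_celsius": "max by (gpu) (DCGM_FI_DEV_GPU_TEMP)",
    "vram_used_mib": "sum by (gpu) (DCGM_FI_DEV_FB_USED)",
}


def instant_query_url(base_url: str, promql: str) -> str:
    """Build a URL for Prometheus's instant query API with the query URL-encoded."""
    encoded = urllib.parse.urlencode({"query": promql})
    return f"{base_url.rstrip('/')}/api/v1/query?{encoded}"


if __name__ == "__main__":
    # Placeholder address; 9090 is the default Prometheus port.
    for label, promql in GPU_QUERIES.items():
        print(label, instant_query_url("http://localhost:9090", promql))
```

The same expressions can be pasted into a Grafana panel or the Prometheus expression browser without the URL wrapper.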
Wrap-up and Q&A (5 minutes)
- Presentation: Journey recap: hardware to AI workloads; key architectural patterns summary; investment considerations and ROI; certification preparation path
Your Instructor
Jonathan Johnson
With over two decades in commercial software engineering, Jonathan Johnson is driven by a passion for designing impactful software. His career began with laboratory instrument software and data management. He then ventured into personal banking, embracing object-oriented design before moving on to internet-based enterprise applications with the rise of Java. Jonathan returned to laboratory software at 454 Life Sciences and Roche Diagnostics, where he applied Java-based state machines and enterprise services to manage the vast data from DNA sequencing instruments. Later, as a hands-on architect, he implemented microservices, containers, and Kubernetes for Thermo Fisher Scientific’s laboratory management platform.
Jonathan enjoys collaborating with peers, sharing insights, and discussing approaches to modernize application architectures while upholding principles of high modularity and low coupling.