Managing Cloud Native Data on Kubernetes

by Jeff Carpenter, Patrick McFadin

December 2022

Intermediate to advanced

329 pages

9h 35m

English

O'Reilly Media, Inc.

Why We Wrote This Book; Who Is This Book For?; How to Read This Book; Conventions Used in This Book; Using Code Examples; O’Reilly Online Learning; How to Contact Us; Acknowledgments
Infrastructure Types; What Is Cloud Native Data?; More Infrastructure, More Problems; Kubernetes Leading the Way; Managing Compute on Kubernetes; Managing Network on Kubernetes; Managing Storage on Kubernetes; Cloud Native Data Components; Looking Forward; Getting Ready for the Revolution; Adopt an SRE Mindset; Embrace Distributed Computing; Principles of Cloud Native Data Infrastructure; Summary
Docker, Containers, and State; Managing State in Docker; Bind Mounts; Volumes; Tmpfs Mounts; Volume Drivers; Kubernetes Resources for Data Storage; Pods and Volumes; PersistentVolumes; PersistentVolumeClaims; StorageClasses; Kubernetes Storage Architecture; Flexvolume; Container Storage Interface; Container Attached Storage; Container Object Storage Interface; Summary
The Hard Way; Prerequisites for Running Data Infrastructure on Kubernetes; Running MySQL on Kubernetes; ReplicaSets; Deployments; Services; Accessing MySQL; Running Apache Cassandra on Kubernetes; StatefulSets; Accessing Cassandra; Summary
Deploying Applications with Helm Charts; Using Helm to Deploy MySQL; How Helm Works; Labels; ServiceAccounts; Secrets; ConfigMaps; Updating Helm Charts; Uninstalling Helm Charts; Using Helm to Deploy Apache Cassandra; Affinity and Anti-Affinity; Helm, CI/CD, and Operations; Summary
Extending the Kubernetes Control Plane; Extending Kubernetes Clients; Extending Kubernetes Control Plane Components; Extending Kubernetes Worker Node Components; The Operator Pattern; Controllers; Custom Resources; Operators; Managing MySQL in Kubernetes Using the Vitess Operator; Vitess Overview; PlanetScale Vitess Operator; A Growing Ecosystem of Operators; Choosing Operators; Building Operators; Summary
K8ssandra: Production-Ready Cassandra on Kubernetes; K8ssandra Architecture; Installing the K8ssandra Operator; Creating a K8ssandraCluster; Managing Cassandra in Kubernetes with Cass Operator; Enabling Developer Productivity with Stargate APIs; Unified Monitoring Infrastructure with Prometheus and Grafana; Performing Repairs with Cassandra Reaper; Backing Up and Restoring Data with Cassandra Medusa; Creating a Backup; Restoring from Backup; Deploying Multicluster Applications in Kubernetes; Summary
Why a Kubernetes Native Approach Is Needed; Hybrid Data Access at Scale with TiDB; TiDB Architecture; Deploying TiDB in Kubernetes; Serverless Cassandra with DataStax Astra DB; What to Look for in a Kubernetes Native Database; Basic Requirements; The Future of Kubernetes Native; Summary
Introduction to Streaming; Types of Delivery; Delivery Guarantees; Feature Scope; The Role of Streaming in Kubernetes; Streaming on Kubernetes with Apache Pulsar; Preparing Your Environment; Securing Communications by Default with cert-manager; Using Helm to Deploy Apache Pulsar; Stream Analytics with Apache Flink; Deploying Apache Flink on Kubernetes; Summary
Introduction to Analytics; Deploying Analytic Workloads in Kubernetes; Introduction to Apache Spark; Deploying Apache Spark in Kubernetes; Build Your Custom Container; Submit and Run Your Application; Kubernetes Operator for Apache Spark; Alternative Schedulers for Kubernetes; Apache YuniKorn; Volcano; Analytic Engines for Kubernetes; Dask; Ray; Summary
The Cloud Native AI/ML Stack; AI/ML Definitions; Defining an AI/ML Stack; Real-Time Model Serving with KServe; Full Lifecycle Feature Management with Feast; Vector Similarity Search with Milvus; Efficient Data Movement with Apache Arrow; Versioned Object Storage with lakeFS; Summary
The Vision: Application-Aware Platforms; Charting Your Path to Success; People; Technology; Process; The Future of Cloud Native Data; Summary

Content preview from Managing Cloud Native Data on Kubernetes

Chapter 10. Machine Learning and Other Emerging Use Cases

In previous chapters, we covered traditional data infrastructure including databases, streaming platforms, and analytic engines with a focus on Kubernetes. Now it’s time to start looking beyond, exploring the projects and communities that are beginning to make cloud native their destination, especially concerning AI and ML.

Any time multiple arrows start pointing in the same direction, it’s worth paying attention. In data infrastructure, the arrows all point to a macro trend of convergence on Kubernetes, supported by several interrelated trends:

Common stacks are emerging for managing compute-intensive AI/ML workloads, including those that leverage specific hardware such as GPUs.
Common data formats are helping to promote the efficient movement of data across compute, network, and storage resources.
Object storage is becoming a common persistence layer for data infrastructure.

In this chapter, we will look at several emerging technologies that embody these trends, the use cases they enable, and how they help you manage the precious resources of compute, network, and storage. We have chosen a few projects that touch on different aspects of ML and working with data; this is by no means an exhaustive survey of every technology in use today. We hear directly from the engineers working on each project and provide some details on how they fit into a cloud native data stack. You are highly ...

Publisher Resources

ISBN: 9781098111380
