Chapter 7. Reliable Data Delivery

Reliability is a property of a system—not of a single component—so when we are talking about the reliability guarantees of Apache Kafka, we will need to keep the entire system and its use cases in mind. When it comes to reliability, the systems that integrate with Kafka are as important as Kafka itself. And because reliability is a system concern, it cannot be the responsibility of just one person. Everyone—Kafka administrators, Linux administrators, network and storage administrators, and the application developers—must work together to build a reliable system.

Apache Kafka is very flexible about reliable data delivery. We understand that Kafka has many use cases, from tracking clicks on a website to processing credit card payments. Some of these use cases require the utmost reliability, while others prioritize speed and simplicity over reliability. Kafka was written to be configurable enough, and its client API flexible enough, to allow all kinds of reliability trade-offs.
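As a small illustration of those trade-offs, the producer configuration alone lets you choose between stronger delivery guarantees and lower latency. The sketch below uses standard Kafka producer parameters; the specific values are illustrative, not recommendations:

```properties
# Producer settings leaning toward strong delivery guarantees
# (a sketch; exact values depend on your latency and throughput needs)
acks=all                      # wait for all in-sync replicas to acknowledge each write
enable.idempotence=true      # prevent duplicates introduced by producer retries
delivery.timeout.ms=120000   # keep retrying a send for up to two minutes

# By contrast, a latency-oriented click-tracking producer might accept
# acks=1, acknowledging as soon as the partition leader has the record.
```

The rest of this chapter examines what these and related broker, topic, and consumer settings actually guarantee, and what they cost.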

Because of its flexibility, it is also easy to accidentally shoot ourselves in the foot when using Kafka—believing that our system is reliable when in fact it is not. In this chapter, we will start by talking about different kinds of reliability and what they mean in the context of Apache Kafka. Then we will talk about Kafka’s replication mechanism and how it contributes to the reliability of the system. We will then discuss Kafka’s brokers and topics and how they should be configured for ...
