Apache Pulsar in Action

Book description

Apache Pulsar in Action teaches you to build scalable streaming messaging systems using Pulsar. You’ll start with a rapid introduction to enterprise messaging and discover the unique benefits of Pulsar. Following crystal-clear explanations and engaging examples, you’ll use the Pulsar Functions framework to develop a microservices-based application. Real-world case studies illustrate how to implement the most important messaging design patterns.

Table of contents

  1. inside front cover
  2. Apache Pulsar in Action
  3. Copyright
  4. dedication
  5. contents
  6. front matter
    1. foreword
    2. preface
    3. acknowledgments
    4. about this book
    5. Who should read this book
    6. How this book is organized: A roadmap
    7. About the code
    8. Other online resources
    9. liveBook discussion forum
    10. about the author
    11. about the cover illustration
  7. Part 1 Getting started with Apache Pulsar
  8. 1 Introduction to Apache Pulsar
    1. 1.1 Enterprise messaging systems
      1. 1.1.1 Key capabilities
    2. 1.2 Message consumption patterns
      1. 1.2.1 Publish-subscribe messaging
      2. 1.2.2 Message queuing
    3. 1.3 The evolution of messaging systems
      1. 1.3.1 Generic messaging systems
      2. 1.3.2 Message-oriented middleware
      3. 1.3.3 Enterprise service bus
      4. 1.3.4 Distributed messaging systems
    4. 1.4 Comparison to Apache Kafka
      1. 1.4.1 Multilayered architecture
      2. 1.4.2 Message consumption
      3. 1.4.3 Data durability
      4. 1.4.4 Message acknowledgment
      5. 1.4.5 Message retention
    5. 1.5 Why do I need Pulsar?
      1. 1.5.1 Guaranteed message delivery
      2. 1.5.2 Infinite scalability
      3. 1.5.3 Resilient to failure
      4. 1.5.4 Support for millions of topics
      5. 1.5.5 Geo-replication and active failover
    6. 1.6 Real-world use cases
      1. 1.6.1 Unified messaging systems
      2. 1.6.2 Microservices platforms
      3. 1.6.3 Connected cars
      4. 1.6.4 Fraud detection
    7. Additional resources
    8. Summary
  9. 2 Pulsar concepts and architecture
    1. 2.1 Pulsar’s physical architecture
      1. 2.1.1 Pulsar’s layered architecture
      2. 2.1.2 Stateless serving layer
      3. 2.1.3 Stream storage layer
      4. 2.1.4 Metadata storage
    2. 2.2 Pulsar’s logical architecture
      1. 2.2.1 Tenants, namespaces, and topics
      2. 2.2.2 Addressing topics in Pulsar
      3. 2.2.3 Producers, consumers, and subscriptions
      4. 2.2.4 Subscription types
    3. 2.3 Message retention and expiration
      1. 2.3.1 Data retention
      2. 2.3.2 Backlog quotas
      3. 2.3.3 Message expiration
      4. 2.3.4 Message backlog vs. message expiration
    4. 2.4 Tiered storage
    5. Summary
  10. 3 Interacting with Pulsar
    1. 3.1 Getting started with Pulsar
    2. 3.2 Administering Pulsar
      1. 3.2.1 Creating a tenant, namespace, and topic
      2. 3.2.2 Java Admin API
    3. 3.3 Pulsar clients
      1. 3.3.1 The Pulsar Java client
      2. 3.3.2 The Pulsar Python client
      3. 3.3.3 The Pulsar Go client
    4. 3.4 Advanced administration
      1. 3.4.1 Persistent topic metrics
      2. 3.4.2 Message inspection
    5. Summary
  11. Part 2 Apache Pulsar development essentials
  12. 4 Pulsar functions
    1. 4.1 Stream processing
      1. 4.1.1 Traditional batching
      2. 4.1.2 Micro-batching
      3. 4.1.3 Stream native processing
    2. 4.2 What is Pulsar Functions?
      1. 4.2.1 Programming model
    3. 4.3 Developing Pulsar functions
      1. 4.3.1 Language native functions
      2. 4.3.2 The Pulsar SDK
      3. 4.3.3 Stateful functions
    4. 4.4 Testing Pulsar functions
      1. 4.4.1 Unit testing
      2. 4.4.2 Integration testing
    5. 4.5 Deploying Pulsar functions
      1. 4.5.1 Generating a deployment artifact
      2. 4.5.2 Function configuration
      3. 4.5.3 Function deployment
      4. 4.5.4 The function deployment life cycle
      5. 4.5.5 Deployment modes
      6. 4.5.6 Pulsar function data flow
    6. Summary
  13. 5 Pulsar IO connectors
    1. 5.1 What are Pulsar IO connectors?
      1. 5.1.1 Sink connectors
      2. 5.1.2 Source connectors
      3. 5.1.3 PushSource connectors
    2. 5.2 Developing Pulsar IO connectors
      1. 5.2.1 Developing a sink connector
      2. 5.2.2 Developing a PushSource connector
    3. 5.3 Testing Pulsar IO connectors
      1. 5.3.1 Unit testing
      2. 5.3.2 Integration testing
      3. 5.3.3 Packaging Pulsar IO connectors
    4. 5.4 Deploying Pulsar IO connectors
      1. 5.4.1 Creating and deleting connectors
      2. 5.4.2 Debugging deployed connectors
    5. 5.5 Pulsar’s built-in connectors
      1. 5.5.1 Launching the MongoDB cluster
      2. 5.5.2 Link the Pulsar and MongoDB containers
      3. 5.5.3 Configure and create the MongoDB sink
    6. 5.6 Administering Pulsar IO connectors
      1. 5.6.1 Listing connectors
      2. 5.6.2 Monitoring connectors
    7. Summary
  14. 6 Pulsar security
    1. 6.1 Transport encryption
    2. 6.2 Authentication
      1. 6.2.1 TLS authentication
      2. 6.2.2 JSON Web Token authentication
    3. 6.3 Authorization
      1. 6.3.1 Roles
      2. 6.3.2 An example scenario
    4. 6.4 Message encryption
    5. Summary
  15. 7 Schema registry
    1. 7.1 Microservice communication
      1. 7.1.1 Microservice APIs
      2. 7.1.2 The need for a schema registry
    2. 7.2 The Pulsar schema registry
      1. 7.2.1 Architecture
      2. 7.2.2 Schema versioning
      3. 7.2.3 Schema compatibility
      4. 7.2.4 Schema compatibility check strategies
    3. 7.3 Using the schema registry
      1. 7.3.1 Modelling the food order event in Avro
      2. 7.3.2 Producing food order events
      3. 7.3.3 Consuming the food order events
      4. 7.3.4 Complete example
    4. 7.4 Evolving the schema
    5. Summary
  16. Part 3 Hands-on application development with Apache Pulsar
  17. 8 Pulsar Functions patterns
    1. 8.1 Data pipelines
      1. 8.1.1 Procedural programming
      2. 8.1.2 DataFlow programming
    2. 8.2 Message routing patterns
      1. 8.2.1 Splitter pattern
      2. 8.2.2 Dynamic router pattern
      3. 8.2.3 Content-based router pattern
    3. 8.3 Message transformation patterns
      1. 8.3.1 Message translator pattern
      2. 8.3.2 Content enricher pattern
      3. 8.3.3 Content filter pattern
    4. Summary
  18. 9 Resiliency patterns
    1. 9.1 Pulsar Functions resiliency
      1. 9.1.1 Adverse events
      2. 9.1.2 Fault detection
    2. 9.2 Resiliency design patterns
      1. 9.2.1 Retry pattern
      2. 9.2.2 Circuit breaker pattern
      3. 9.2.3 Rate limiter pattern
      4. 9.2.4 Time limiter pattern
      5. 9.2.5 Cache pattern
      6. 9.2.6 Fallback pattern
      7. 9.2.7 Credential refresh pattern
    3. 9.3 Multiple layers of resiliency
    4. Summary
  19. 10 Data access
    1. 10.1 Data sources
    2. 10.2 Data access use cases
      1. 10.2.1 Device validation
      2. 10.2.2 Driver location data
    3. Summary
  20. 11 Machine learning in Pulsar
    1. 11.1 Deploying ML models
      1. 11.1.1 Batch processing
      2. 11.1.2 Near real-time
    2. 11.2 Near real-time model deployment
    3. 11.3 Feature vectors
      1. 11.3.1 Feature stores
      2. 11.3.2 Feature calculation
    4. 11.4 Delivery time estimation
      1. 11.4.1 ML model export
      2. 11.4.2 Feature vector mapping
      3. 11.4.3 Model deployment
    5. 11.5 Neural nets
      1. 11.5.1 Neural net training
      2. 11.5.2 Neural net deployment in Java
    6. Summary
  21. 12 Edge analytics
    1. 12.1 IIoT architecture
      1. 12.1.1 The perception and reaction layer
      2. 12.1.2 The transportation layer
      3. 12.1.3 The data processing layer
    2. 12.2 A Pulsar-based processing layer
    3. 12.3 Edge analytics
      1. 12.3.1 Telemetric data
      2. 12.3.2 Univariate and multivariate
    4. 12.4 Univariate analysis
      1. 12.4.1 Noise reduction
      2. 12.4.2 Statistical analysis
      3. 12.4.3 Approximation
    5. 12.5 Multivariate analysis
      1. 12.5.1 Creating a bidirectional messaging mesh
      2. 12.5.2 Multivariate dataset construction
    6. 12.6 Beyond the book
    7. Summary
  22. Appendix A. Running Pulsar on Kubernetes
    1. A.1 Create a Kubernetes cluster
      1. A.1.1 Install prerequisites
      2. A.1.2 Minikube
    2. A.2 The Pulsar Helm chart
      1. A.2.1 What is Helm?
      2. A.2.2 The Pulsar Helm chart
    3. A.3 Using the Pulsar Helm chart
      1. A.3.1 Administering Pulsar on Kubernetes
      2. A.3.2 Configuring clients
  23. Appendix B. Geo-replication
    1. B.1 Synchronous geo-replication
    2. B.2 Asynchronous geo-replication
      1. B.2.1 Configuring asynchronous geo-replication
    3. B.3 Asynchronous geo-replication patterns
      1. B.3.1 Multi-active geo-replication
      2. B.3.2 Active-standby geo-replication
      3. B.3.3 Aggregation geo-replication
  24. index
  25. inside back cover

Product information

  • Title: Apache Pulsar in Action
  • Author(s): David Kjerrumgaard
  • Release date: December 2021
  • Publisher(s): Manning Publications
  • ISBN: 9781617296888