AI and Analytics in Production

Book Description

If you’ve begun to deploy large-scale data systems into production, or have at least explored the process, this practical ebook shows business team leaders, business analysts, and technical developers how to make big data analytics, machine learning, and AI initiatives production-ready. Authors Ted Dunning and Ellen Friedman provide a non-technical guide to best practices for a process that can be quite challenging.

Rather than provide a complex review of tools, this ebook explores fundamental ideas on how to make analytics in production easier and more effective, based on the authors’ observations across a wide range of industries. Whether your organization is just getting started or already has data-driven applications in production, you’ll find content that will help you succeed.

  • Gain an understanding of the goals, challenges, and potential pitfalls of deploying analytics and AI to production
  • Learn the best way to design, plan, and execute large data systems in production
  • Focus on the special case of machine learning and AI in production
  • Examine MapR, a data platform with the technical capabilities to support emerging trends for large-scale data
  • Explore a range of design patterns that work well for production customers across various sectors
  • Get best practices for avoiding various gotchas as you move to production

Table of Contents

  1. Preface
    1. How to Use This Book
  2. 1. Is It Production-Ready?
    1. What Does Production Really Mean?
      1. Data and Production
      2. Do You Have the Right Data and Right Question?
      3. Does Your System Fit Your Business?
      4. Scale Is More Than Just Data Volume
      5. Reliability Is a Must
      6. Predictability and Repeatability
      7. Security On-Premises, in Cloud, and Multicloud
      8. Risk Versus Potential: Pressures Change in Production
      9. Should You Separate Development from Production?
    2. Why Multitenancy Matters
    3. Simplicity Is Golden
    4. Flexibility: Are You Ready to Adapt?
    5. Formula for Success
  3. 2. Successful Habits for Production
    1. Build a Global Data Fabric
      1. Edge Computing
      2. Data Fabric Versus Data Lake
    2. Understand Why the Data Platform Matters
      1. Capabilities and Traits Required by the Data Platform
    3. Orchestrate Containers with Kubernetes
    4. Extend Applications to Clouds and Edges
    5. Use Streaming Architecture and Streaming Microservices
    6. Cultivate a Production-Ready Culture
      1. DataOps
      2. Making Room for Innovation
    7. Remember: IT Does Not Have a Magic Wand
    8. Putting It All Together: Common Questions
      1. Can You Manage End-to-End Workloads?
      2. How Do You Migrate from Test to Production?
      3. Can You Find Bottlenecks?
  4. 3. Artificial Intelligence and Machine Learning in Production
    1. What Matters Most for AI and Machine Learning in Production?
      1. Getting Real Value from AI and Machine Learning
      2. Data at Different Stages
      3. The Life Cycle of Machine Learning Models
      4. Specialized Hardware: GPUs
      5. Social and Teams
    2. Methods to Manage AI and Machine Learning Logistics
      1. Rendezvous Architecture
      2. Other Systems for Managing Machine Learning
  5. 4. Example Data Platform: MapR
    1. A First Look at MapR: Access, Global Namespace, and Multitenancy
    2. Geo-Distribution and a Global Data Fabric
    3. Implications for Streaming
    4. How This Works: Core MapR Technology
      1. Comparison with Hadoop
    5. Beyond Files: Tables, Streams, Audits, and Object Tiering
      1. MapR DB Tables
      2. Message Streams
      3. Auditing
      4. Object Tiering
  6. 5. Design Patterns
    1. Internet of Things Data Web
      1. Locking Down the Data Link
      2. Dashboards For All or For Each
    2. Data Warehouse Optimization
    3. Extending to a Data Hub
    4. Stream-Based Global Log Processing
    5. Edge Computing
    6. Customer 360
    7. Recommendation Engine
    8. Marketing Optimization
    9. Object Store
    10. Stream of Events as a System of Record
    11. Table Transformation and Percolation
  7. 6. Tips and Tricks
    1. Tip #1: Pick One Thing to Do First
    2. Tip #2: Shift Your Thinking
      1. Learn to Delay Decisions
      2. Save More Data
      3. Rethink How Your Deployment Systems Work
    3. Tip #3: Start Conservatively but Plan to Expand
    4. Tip #4: Dealing with Data in Production
    5. Tip #5: Monitor for Changes in the World and Your Data
    6. Tip #6: Be Realistic About Hardware and Network Quality
    7. Tip #7: Explore New Data Formats
    8. Tip #8: Read Our Other Books (Really!)
  8. A. Appendix
    1. Additional Resources
    2. Selected O’Reilly Publications by Ted Dunning and Ellen Friedman
    3. O’Reilly Publication by Ellen Friedman and Kostas Tzoumas