If you’ve begun to deploy large-scale data systems into production, or have at least explored the process, this practical ebook shows you how to make your big data analytics, machine learning, and AI initiatives production-ready, whether you are a business team leader, a business analyst, or a technical developer. Authors Ted Dunning and Ellen Friedman provide a non-technical guide to best practices for a process that can be quite challenging.
Rather than provide a complex review of tools, this ebook explores fundamental ideas on how to make your analytics production easier and more effective, based on the authors’ observations across a wide range of industries. Whether your organization is just getting started or already has data-driven applications in production, you’ll find content that will help you succeed.
- Gain an understanding of the goals, challenges, and potential pitfalls of deploying analytics and AI to production
- Learn the best way to design, plan, and execute large data systems in production
- Focus on the special case of machine learning and AI in production
- Examine MapR, a data platform with the technical capabilities to support emerging trends for large-scale data
- Explore a range of design patterns that work well for production customers across various sectors
- Get best practices for avoiding various gotchas as you move to production
Table of contents
1. Is It Production-Ready?
What Does Production Really Mean?
- Data and Production
- Do You Have the Right Data and Right Question?
- Does Your System Fit Your Business?
- Scale Is More Than Just Data Volume
- Reliability Is a Must
- Predictability and Repeatability
- Security On-Premises, in Cloud, and Multicloud
- Risk Versus Potential: Pressures Change in Production
- Should You Separate Development from Production?
- Why Multitenancy Matters
- Simplicity Is Golden
- Flexibility: Are You Ready to Adapt?
- Formula for Success
2. Successful Habits for Production
- Build a Global Data Fabric
- Understand Why the Data Platform Matters
- Orchestrate Containers with Kubernetes
- Extend Applications to Clouds and Edges
- Use Streaming Architecture and Streaming Microservices
- Cultivate a Production-Ready Culture
- Remember: IT Does Not Have a Magic Wand
- Putting It All Together: Common Questions
3. Artificial Intelligence and Machine Learning in Production
- What Matters Most for AI and Machine Learning in Production?
- Methods to Manage AI and Machine Learning Logistics
4. Example Data Platform: MapR
- A First Look at MapR: Access, Global Namespace, and Multitenancy
- Geo-Distribution and a Global Data Fabric
- Implications for Streaming
- How This Works: Core MapR Technology
- Beyond Files: Tables, Streams, Audits, and Object Tiering
5. Design Patterns
- Internet of Things Data Web
- Data Warehouse Optimization
- Extending to a Data Hub
- Stream-Based Global Log Processing
- Edge Computing
- Customer 360
- Recommendation Engine
- Marketing Optimization
- Object Store
- Stream of Events as a System of Record
- Table Transformation and Percolation
6. Tips and Tricks
- Tip #1: Pick One Thing to Do First
- Tip #2: Shift Your Thinking
- Tip #3: Start Conservatively but Plan to Expand
- Tip #4: Dealing with Data in Production
- Tip #5: Monitor for Changes in the World and Your Data
- Tip #6: Be Realistic About Hardware and Network Quality
- Tip #7: Explore New Data Formats
- Tip #8: Read Our Other Books (Really!)
A. Appendix
- Title: AI and Analytics in Production
- Release date: October 2018
- Publisher(s): O'Reilly Media, Inc.
- ISBN: 9781492044109