Preface

The future of data science and artificial intelligence has never looked brighter. AI now beats humans at games ranging from twitchy, reflexive Pong to deep, contemplative Go. Deep learning models recognize objects nearly as well as humans. Some even say self-driving cars perform better than their distracted human counterparts. The past decade’s massive gains in data volume, storage capacity, and computing power have enabled rapid advances in data science.

And of course technology has spread into every facet of your business (from finance and sales to production and logistics). However, is each part of your business turbocharged by data science and AI? Likely not. As wonderful as these technologies might be, if you are not designing a self-driving car or predicting customer behavior, you are probably not using them.

Many organizations have access to business data from an enterprise resource planning (ERP) system such as SAP, and yours is likely no different. Data coming from a business system such as SAP is largely clean, because validations and checks are often in place before it is allowed to be saved to the database. That matters, because one of the most essential and least rewarding tasks of a data scientist is cleaning the data. This means ERP data in SAP is ripe for the picking, and data science is here to do the harvesting!

Let’s take a hypothetical scenario. The SAP Team at Big Bonanza Warehouse is in a constant state of process improvement. They know how to configure their SAP system to do the tasks their users want, and they play that system like a fiddle, dutifully taking requests and delivering solutions. However, there is a bit of a problem with reporting and analytics; they have a data warehouse and a business intelligence system, but developing reports is a multimonth process. The team often resorts to using standard ALV (ABAP List Viewer) reports, which are quite limited in power because they require a developer to code; in addition, it is very hard to harness the wealth of public data that could be used in conjunction with SAP. Just like at countless other enterprises, SAP data at Big Bonanza Warehouse is an island, siloed within its own system. Teams that don’t work with SAP have no idea what’s in there, and the teams that do work with it spend so much time maintaining the systems that they don’t get the chance to look outside them.

SAP data shouldn’t be an island, though. The team knows their data, how to find it, and what they want to do with it. However, when it comes to analyzing that data, everyone’s hands are tied by that multimonth report development process.

Sound familiar? It’s the story at nearly every SAP shop we’ve ever worked with. And that’s a lot of shops in our combined 30+ years of experience.

We want to give that SAP team (and yours!) some modern insight: tools and techniques they can use without defining data cubes or data warehouse objects, or learning complex frontend reports. In this book, we’ll present simple scenarios such as dumping data straight out of SAP into a flat file and then into a reporting tool. This is useful for ad hoc reporting and investigations. We’ll also consider more complex scenarios, including using extractor tools and neural network models in the cloud to analyze data in ways not possible within SAP or contemporary data warehouses.
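To give a flavor of the simplest scenario, here is a minimal Python sketch of reading a flat-file export and summarizing it for a quick ad hoc look. It is only an illustration under stated assumptions: the pipe delimiter, the column names (CUSTOMER, MATERIAL, NET_VALUE), and the sample rows are all made up for this example and do not reflect any particular SAP table or export format.

```python
import csv
import io
from collections import defaultdict

# Made-up sample standing in for a flat-file export. The column names
# (CUSTOMER, MATERIAL, NET_VALUE) are hypothetical, not a real SAP layout.
SAMPLE_EXPORT = """CUSTOMER|MATERIAL|NET_VALUE
C100|M-01|250.00
C200|M-02|99.50
C100|M-03|50.25
"""


def total_by_customer(flat_file):
    """Sum NET_VALUE per CUSTOMER from a pipe-delimited text stream."""
    totals = defaultdict(float)
    reader = csv.DictReader(flat_file, delimiter="|")
    for row in reader:
        totals[row["CUSTOMER"]] += float(row["NET_VALUE"])
    return dict(totals)


# In practice you would pass an open file handle instead of StringIO.
totals = total_by_customer(io.StringIO(SAMPLE_EXPORT))
for customer, value in sorted(totals.items()):
    print(f"{customer}: {value:.2f}")
```

A real export would more likely be loaded into a reporting tool or a data frame library; the standard library is used here only to keep the sketch self-contained.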

How to Read This Book

You’ll need to approach this book from a conceptual point of view. We present alternative techniques for analyzing business data. We ask (nay, we beg) the reader to think about business data, in particular SAP data, in new and interesting ways. This book is designed to awaken ideas around how to bridge the gap between your particular business data and the advances in data science. You need not be an expert in the complex algorithms that calculate gradient descent in a neural network, nor do you need to be an expert in your business data. But you do need to have a desire to straddle these two camps and have fun in the process.1

From the data scientist’s perspective, the data science principles in this book are an introduction. If you can spot a sigmoid, tanh, or ReLU activation function at fifty paces, you can skip those parts. But we’re betting that if your guru level is that high in data science, you’re a novice at the SAP stuff. Focus on the SAP stories, which show you how to pull data out of the system and work with the real business data it holds.

From the SAP professional’s perspective, you’ll break out of traditional reporting and analytics models. You’ll learn to think of business applications and reporting in machine and deep learning terms. This may sound mystical, but by the end of the book you will have the tools necessary to take this step. Along the way you’ll automatically detect anomalies in sales data, predict the future from past data, process text as natural language, segment customers into smart groups, visualize all these things brilliantly, and teach bots to use business data.

In our world of AI and data science, asking the same old questions of your data is stale, naive, and (quite frankly) boring. We want you to ask questions of your data that you didn’t even know you could ask. Maybe the price of tea in China really does have an outsize effect on your sales.

From the developer’s perspective, you’ll be inspired to learn wonderful programming languages like Python and R. We don’t teach you these languages, but we challenge you to dip your toe into these warm and effervescent waters. If you are already an experienced R or Python developer, you’re in good shape for the code sections. For the novice, we will point you to resources to get you started. Don’t feel left out if you are inclined to use another language such as Java. The “meta” goal of this book is to get you to think about business data differently, and if that means you want to use Java, by all means do so.

Operationalizing data science is a whole book in itself. We’ll frequently touch on how to operationalize ideas we present, but it is beyond the scope of this book to dive deep on creating robust pipelines.

Tip

Data scientists may be able to skip over Chapter 2. SAP professionals, you might be able to skip Chapter 3. The stories we tell later in the book merge these two disciplines, so we want readers who come from one or the other side to get a fair understanding of how we’ll be poking around to work our magic.

Conventions Used in This Book

The following typographical conventions are used in this book:

Italic

Indicates new terms, URLs, email addresses, filenames, and file extensions.

Constant width

Used for program listings, as well as within paragraphs to refer to program elements such as variable or function names, databases, data types, environment variables, statements, and keywords.

Constant width bold

Shows commands or other text that should be typed literally by the user.

Constant width italic

Shows text that should be replaced with user-supplied values or by values determined by context.

Tip

This element signifies a tip or suggestion.

Note

This element signifies a general note.

Warning

This element indicates a warning or caution.

Using Code Examples

This book is here to help you get your job done. In general, if example code is offered with this book, you may use it in your programs and documentation. You do not need to contact us for permission unless you’re reproducing a significant portion of the code. For example, writing a program that uses several chunks of code from this book does not require permission. Selling or distributing a CD-ROM of examples from O’Reilly books does require permission. Answering a question by citing this book and quoting example code does not require permission. Incorporating a significant amount of example code from this book into your product’s documentation does require permission.

We appreciate, but do not require, attribution. An attribution usually includes the title, author, publisher, and ISBN. For example: “Practical Data Science with SAP by Greg Foss and Paul Modderman (O’Reilly). Copyright 2019 Greg Foss and Paul Modderman, 978-1-492-04644-8.” 

If you feel your use of code examples falls outside fair use or the permission given above, feel free to contact us at .

O’Reilly Online Learning

Note

For almost 40 years, O’Reilly Media has provided technology and business training, knowledge, and insight to help companies succeed.

Our unique network of experts and innovators share their knowledge and expertise through books, articles, conferences, and our online learning platform. O’Reilly’s online learning platform gives you on-demand access to live training courses, in-depth learning paths, interactive coding environments, and a vast collection of text and video from O’Reilly and 200+ other publishers. For more information, please visit http://oreilly.com.

How to Contact Us

Please address comments and questions concerning this book to the publisher:

  • O’Reilly Media, Inc.
  • 1005 Gravenstein Highway North
  • Sebastopol, CA 95472
  • 800-998-9938 (in the United States or Canada)
  • 707-829-0515 (international or local)
  • 707-829-0104 (fax)

We have a web page for this book, where we list errata, examples, and any additional information. You can access this page at https://oreil.ly/practical-data-sci-sap.

To comment or ask technical questions about this book, send email to .

For more information about our books, courses, conferences, and news, see our website at http://www.oreilly.com.

Find us on Facebook: http://facebook.com/oreilly

Follow us on Twitter: http://twitter.com/oreillymedia

Watch us on YouTube: http://www.youtube.com/oreillymedia

Acknowledgments

We would like to thank our technical reviewers Hau Ngo, Jesse Stiff, Franco Rizzo, Brad Barker, and Christoph Wertz for their valuable feedback. Each chapter was made better by their suggestions.

To Nicole, our fearless editor: you helped us stay calm and grounded in the process. Without your editorial guidance, we would have been lost in data scientific meanderings and code ramblings. Thank you for making each thing you touched more readable.

Greg would like to thank his wife Alycia for her patience, support, and insight and his brother Cory for help with the graphics. Of course, a leviathan thanks to Paul Modderman for his vision, ingenuity, and courage to embark on this journey.

Paul would like to thank his partner Christa Modderman for her wisdom and strength, his grandmother Lois Stratmann for the example set by her remarkable life in creative work, and his parents Mark and Linda for...everything. He wishes to acknowledge Tony Vanderpoel, Dean Stoffel, and Gavin Quinn for respectively encouraging, entrusting, and inspiring him to better himself professionally. Largest thanks goes to Greg Foss: a remarkable author who never backed down from a commitment to quality. Eleanor Modderman is and always will be his favorite.

Special thanks to Wade Krzmarzick for help with CRM scenarios.

1 If you’re not the kind of person who has fun with data, how did you find this book?
