Because this book is about Spark, it makes sense to start the first chapter by looking at Spark’s history and its components. This introductory chapter is divided into three sections. In the first, I go over the evolution of data and how it has grown to its present scale, touching on three key aspects of data. In the second, I delve into the internals of Spark and go over its components, including its architecture and modus operandi. The third and final section focuses on how to use Spark in a cloud environment. ...