Before understanding Spark, it is imperative to understand the reason behind the deluge of data we are witnessing today. In the early days, data was generated or accumulated by workers: only company employees entered data into systems, and the data points were very limited, capturing only a few fields. Then came the internet, which made information easily accessible to everyone using it. Users now had the power to enter and generate their own data. This was a massive shift as the number of internet users grew exponentially, and the data created ...
1. Evolution of Data