Big Data Tools and Pipelines

Ideas and resources related to data tools.

Dive into scikit-learn

With scikit-learn, you can train and apply machine learning models in just a few lines of code. Andreas Mueller summarizes the classification, regression, and clustering algorithms in this powerful machine learning library.

Let’s get real: Acting on data in real time

Companies are differentiating themselves by acting on data in real time. But what does “real time” really mean? Jack Norris discusses the challenges of coordinating data flows, analysis, and integration at scale to shape business as it happens.