Paco Nathan

Apache Spark, Cluster Computing, Data Science, Big Data

  • @pacoid
  • + Paco Nathan

Mountain View, California

Areas of Expertise:

  • Spark
  • Mesos
  • Python
  • Scala
  • Machine Learning
  • Data Science
  • Cascading
  • Cascalog
  • Scalding
  • Cloud Computing
  • R
  • Big Data
  • Open Data
  • Text Analytics
  • NLP
  • PMML
  • Distributed Systems
  • speaking
  • training
  • writing
Paco Nathan is known as a "player/coach" data scientist who has led innovative data teams building large-scale apps for 10+ years. A recognized expert in distributed systems, machine learning, and enterprise data workflows, Paco is an O'Reilly author, an OSS evangelist for Apache Spark with Databricks, and an advisor for Amplify Partners and Galvanize. Paco received his BS Math Sci and MS Comp Sci degrees from Stanford University and has 25+ years of technology industry experience, ranging from Bell Labs to early-stage start-ups. Newsletter and "official" web site: http://liber118.com/pxn/

Enterprise Data Workflows with Cascading
by Paco Nathan
July 2013
Print: $39.99
Ebook: $33.99

Introduction to Apache Spark
by Paco Nathan
March 2015
Video: $99.99

Just Enough Math
by Paco Nathan
May 2014
Video: $129.99

Paco blogs at:

Ask the Readers: What expense do you most want to dump?

May 29 2015

This article is by editor Linda Vergon. Is there a bill you pay that you absolutely detest? Occasionally, I’ll get an attitude about paying one bill or another. (Ha! Paying taxes on April 15 is one bill that comes to mind immediately, for example.) I recognize that there is a…

9.3 trillion reasons fintech could change the developing world

May 29 2015

Request an invitation to Next:Money, O’Reilly’s conference focused on the fundamental transformation taking place in the finance industry. A relatively commonplace occurrence — credit card fraud — made me reconsider the long-term impact of financial technology outside the Western world. …

Four short links: 29 May 2015

May 29 2015

Using Logs to Build Solid Data Infrastructure — (Martin Kleppmann) — For lack of a better term I’m going to call this the problem of “data integration”. With that I really just mean “making sure that the data ends up …

Applied DevOps and the potential of Docker

May 28 2015

Editor’s note: this post is from Karl Matthias and Sean P. Kane, authors of “Docker Up & Running,” a guide to quickly learn how to use Docker to create packaged images for easy management, testing, and deployment of software. At …

Webcast: Computational Thinking: Just Enough Math
June 04, 2014
In the webcast, we'll review some of the historical context that led to machine learning techniques.

Webcast: Getting Started Running Apache Spark on Apache Mesos
January 24, 2014
This tutorial shows a simple way to launch a Mesos cluster in the cloud, how to configure and run Spark on Mesos, and then how to run jobs in Spark.

Webcast: Enterprise Data Workflows with Cascading
September 17, 2013
In this hands-on webcast, Paco Nathan, author of Enterprise Data Workflows with Cascading, discusses what defines a workflow, in contrast to notions of dataflow, and the impact that has on the tools required.