Paco Nathan

Cluster Computing, Data Science, Big Data, Interdisciplinary Teams

  • @pacoid
  • + Paco Nathan

Mountain View, California

Areas of Expertise:

  • Spark
  • Mesos
  • Python
  • Scala
  • Machine Learning
  • Data Science
  • Cascading
  • Cascalog
  • Scalding
  • Cloud Computing
  • R
  • Big Data
  • Open Data
  • Text Analytics
  • NLP
  • PMML
  • Distributed Systems
  • consulting
  • speaking
  • programming
  • training
  • writing
Paco Nathan is known as a "player/coach" data scientist who has led innovative data teams building large-scale apps for 10+ years. A recognized expert in distributed systems, machine learning, and enterprise data workflows, Paco is an O'Reilly author, OSS evangelist for Apache Spark with Databricks, and an advisor for Amplify Partners. Paco received his BS Math Sci and MS Comp Sci degrees from Stanford University, and has 25+ years of technology industry experience ranging from Bell Labs to early-stage start-ups. Newsletter and "official" web site: http://liber118.com/pxn/

Enterprise Data Workflows with Cascading
by Paco Nathan
July 2013
Print: $39.99
Ebook: $33.99

Just Enough Math
by Paco Nathan
May 2014
Video: $129.99

Recent Posts | All O'Reilly Posts

Paco blogs at:


Newsletter Updates for May 2014

May 27 2014

Been quite an interesting past month or so: DC, Austin, SF, Ann Arbor, Atlanta, Seattle… with hopefully much learned from those travels, plus many excellent events and introductions. Meanwhile, I learned much from this gem, Therbligs for data science: A nuts and bolts framework for accelerating data work, by Abe… read more

Ag+Data

April 16 2014

Two years ago an informal group met for drinks in downtown Palo Alto: a mix of grad students, investors, and data science experts in Silicon Valley. In the back and forth of our conversation, we took turns describing planned projects. … read more

Connected Devices Fellowship - O'Reilly Solid conf

April 07 2014

I'm advising Amplify Partners and they've launched a Connected Devices Fellowship that includes conference registration, airfare, and accommodations to attend the new O’Reilly Solid conference on May 21-22 in SF. The fellowship is designed for engineers, students, researchers, et al., who are passionate about infrastructure for IoT and connected devices: http://www.amplifypartners.com/ read more

Newsletter Updates for April 2014

April 02 2014

If you have not seen Data Science Folk Knowledge by Krishna Sankar, that is packed full o’ gems about Machine Learning. Following up on more follow-ups from Strata SC 2014, I’d like to point to an excellent article: 5 Steps to Thinking Like a Designer in Machine Learning by Kevin Dalias: We’ve all heard the saying that… read more

Newsletter Updates for March 2014

March 02 2014

Strata SC 2014 was a busy time indeed. I’m grateful to have had the opportunity to introduce speakers for several excellent presentations – in addition to presenting about Apache Mesos and meeting with many interesting people who were attending the conf. The keynotes this time were diverse, including brilliant and… read more

Newsletter Updates for December 2013

March 02 2014

Hard to believe it’s been since AMPCamp 3 in August that I’ve had an editor buffer open, collecting notes to write up… New Year’s resolutions include writing a newsletter on a monthly basis! AMPCamp 3 was a big success: over 200 people attended a two-day marathon of hands-on work with the Berkeley Stack. Spark Summit doubled… read more

Learning Apache Mesos

January 06 2014

In the summer of 2012, Accel Partners hosted an invitation-only Big Data conference at Stanford. Ping Li stood near the exit with a checkbook, ready to invest $1MM in pitches for real-time analytics on clusters. However, real-time means many different … read more

Apache Mesos: Open Source Datacenter Computing

January 06 2014

Virtual machines (VMs) have enjoyed a long history, from IBM’s CP–40 in the late 1960s on through the rise of VMware in the late 1990s. Widespread VM use nearly became synonymous with “cloud computing” by the late 2000s: public clouds, … read more

Webcast: Computational Thinking: Just Enough Math
June 04, 2014
In the webcast, we'll review some of the historical context that led to machine learning techniques.

Webcast: Getting Started Running Apache Spark on Apache Mesos
January 24, 2014
This tutorial shows a simple way to launch a Mesos cluster in the cloud, how to configure and run Spark on Mesos, and then how to run jobs in Spark.

Webcast: Enterprise Data Workflows with Cascading
September 17, 2013
In this hands-on webcast, Paco Nathan, author of Enterprise Data Workflows with Cascading, discusses what defines a workflow, in contrast to notions of dataflow, and the impact that has on the tools required.