Four short links: 23 January 2017

No Flakes in Production, In-Memory Database, MOOC Numbers, Pharma Rule of Thumb

By Nat Torkington
January 23, 2017
  1. Putting the Science Back in Data Science (Daniel Whitenack) — it’s not worth putting a flaky implementation of an analysis into production.
  2. Scuba (PDF) — paper from Facebook about their millions-of-rows/second in-memory event database. Scuba stores data completely in memory on hundreds of servers, each with 144 GB RAM. I’m boggling at that scale.
  3. 4 Years of MOOC Data — Harvard and MIT MOOC data analyzed. MOOCs educate thousands and certify hundreds. The median number of active participants in a course is 7,902; another 1,517 are considered “explorers” — those who explore half or more of the course content. […] Among those who stated up front that they intended to become certified, the median certification rate was 30%. Among those who paid for identity verification as part of the process of becoming certified, the median completion was 60%.
  4. Predicting Medical AI in 2017 — interested me for this rule of thumb: the chance of an eventual clinical product and the time until that product is available will be:
    Preclinical complete: 5% chance, 10 years
    Phase I complete: 10% chance, 8 years
    Phase II complete: 50% chance, 5 years
    Phase III complete: 80% chance, 1 year
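
The rule of thumb above is easy to play with as a lookup table. Here is a minimal sketch, assuming the four stages are treated as independent lookups; the stage keys, the `StageEstimate` type, and both helper functions are hypothetical names invented for illustration, and only the percentages and year counts come from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StageEstimate:
    chance: float  # chance of an eventual clinical product
    years: int     # rough years until that product is available


# Numbers copied from the rule of thumb; the keys are our own labels.
RULE_OF_THUMB = {
    "preclinical": StageEstimate(0.05, 10),
    "phase_1": StageEstimate(0.10, 8),
    "phase_2": StageEstimate(0.50, 5),
    "phase_3": StageEstimate(0.80, 1),
}


def expected_products(pipeline):
    """Expected number of eventual products for a {stage: candidate count} mapping."""
    return sum(RULE_OF_THUMB[stage].chance * count
               for stage, count in pipeline.items())


def estimated_arrival_year(stage, current_year):
    """Year a candidate that has completed `stage` might reach the clinic."""
    return current_year + RULE_OF_THUMB[stage].years


if __name__ == "__main__":
    # A made-up pipeline: 20 preclinical candidates and 4 past Phase II.
    pipeline = {"preclinical": 20, "phase_2": 4}
    print(expected_products(pipeline))
    print(estimated_arrival_year("phase_1", 2017))
```

Summing the per-stage chances only gives an expected count, not a guarantee, and it assumes candidates succeed or fail independently, which the source does not claim.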