O'Reilly Webcasts

Register for an upcoming free, live webcast or browse our on-demand archive of past events.

There are no upcoming webcasts at this time. Please check back soon.

Video Archive: Data

Webcasts are made available as a video shortly following each live event.

Improving industrial monitoring with deep learning

By Ben MacKenzie, Bilal Paracha | June 20, 2018
This webcast discusses a new data and analytics architecture that enables significant improvements in ongoing operational “Industrial Inspection.”

Accelerate your business with DevOps and the cloud

By Steve Francis, Nigel Kersten | May 10, 2018
The cloud (in all its forms) allows your organization to deliver more value to customers faster.

Deep Learning: From basic principles to training and deploying models in production

By Anthony Stevens | April 19, 2018
Deep learning is currently one of the hottest areas in data science. Increasingly, businesses are applying it to gain competitive advantage. According to Gartner, eighty percent of data scientists will have deep learning in their toolkits by 2018. There...

Transform your business app with low-code, APIs, and microservices

By Leon Stigter, Bruno Trimouille | April 12, 2018
This webcast discusses digital business platforms and their required capabilities, APIs, microservices, and low-code, and how different types of users can leverage a digital business platform for shared benefits.

Designing infrastructure that turns legacy data into insight

By Edward Hsu, Shailan Lala | March 08, 2018
To reshape the future of healthcare technology services, athenahealth knew they needed to unlock the data trapped in a monolithic stack built with homegrown tooling around an Oracle database. Simple data queries took minutes, which made building the ...

Using artificial intelligence to fight financial crimes

By Catharine Evans, Avishkar Misra, Simon Moss | February 27, 2018
Financial crimes remain a major ongoing cost to businesses. Whether perpetrating credit card fraud and money laundering in the banking industry, or fraud, waste, and abuse in the healthcare field, financial criminals relentlessly devise new attacks that...

Why integration is the key component of a digital business platform

By Alice LaPlante | February 22, 2018
This webcast discusses the key infrastructure of digital business platforms, including pervasive integration, event-driven microservices, edge-computing, and machine learning scenarios.

Creating data labels to adapt to contextual change

By Brian Womack | February 20, 2018
Today, traditional artificial intelligence (AI) and machine learning (ML) largely depend upon data scientists to formulate labels associated with feature vectors that take into account model attributes such as entity, action, or relationship. A key challenge...

Implementing AI systems with interpretability, transparency, and trust

By Mustafa Kabul, Ilknur Kaynar-Kabul | January 25, 2018
This webcast will focus on different machine learning and visualization techniques that can be used to make complex artificial intelligence systems interpretable, transparent and trustable.

How to use machine learning to scale data quality

By Pranav Rastogi, Mark Balkenende | November 16, 2017
Machine learning helps pinpoint errors in large datasets for cleansing before entering the analytics pipeline. This webcast shows you how to set it up.

Using natural language processing to build applications in the enterprise

By Tom Markiewicz | November 15, 2017
Join Tom Markiewicz as he introduces NLP and discusses real-world examples of how NLP has been used to build applications in the enterprise.

Closing the gap between data science and real-time applications

By Mike Boyarski | November 07, 2017
This webcast discusses new advances in data processing, including scalable SQL and real-time data pipelines, and their impact on machine learning powered applications.

Cleaning up your enterprise data architecture

By Damon Feldman | November 02, 2017
In this webcast, Damon Feldman will discuss how an operational data hub can be combined with the data lake.

Adopting an enterprise-wide shared data lake to accelerate business insights

By Carlos Matos, Ben Sharma | September 21, 2017
During this webcast, AIG's VP of Global Data Platforms, Carlos Matos, and Zaloni CEO Ben Sharma will share insights from their real-world experience and discuss best practices for architecture, technology, data management and governance to enable centralized...

Real-time marketing analytics with stream processing

By Murthy Mathiprakasam, George Willard | September 19, 2017
Join this O'Reilly webcast for a discussion of the lessons learned in applying streaming analytics to online customer interactions.

What's the Role of Machine Learning in Fast Data and Streaming Applications?

By Emre Velipasaoglu | September 12, 2017
This webcast by Emre Velipasaoglu, Principal Data Scientist at Lightbend, is for busy Architects and Managers of Streaming and Fast Data applications who are looking to get a handle on what Machine Learning (ML) is really all about.

Large-scale Machine Learning in Spark

By Andreas Pfadler | August 29, 2017
In this webcast, we'll introduce Fregata, a lightweight, large-scale machine learning library on Spark.

Quality, Service, Value: Whole Foods Talks About Move to Cloud for Analytics

By Richard Beaver, Ken Casey, Doug Henschen | August 22, 2017
Hear Whole Foods describe their experience and come away with practical knowledge.

How To Choose The Right Streaming Engine

By Dean Wampler | July 20, 2017
In this webcast, Dean Wampler walks you through the criteria you need to consider when selecting technologies.

Make your Data Over the Counter

By Jenny Grant Rankin, Shane Swiderek | July 18, 2017
Learn the over-the-counter data standards for effective data delivery.

Integrate data silos and speed-up data prep

By Ken Krupa | June 27, 2017
In this webcast MarkLogic CTO Ken Krupa will discuss how an operational data hub functions and leverages multiple models.

Transparency in AI Decision Making

By Andy Hickl | June 22, 2017
This webcast will discuss the problem of interpretability in AI and address techniques for building transparent artificial intelligence applications with explainable outcomes.

Data pipelines made simple(r): Data ingestion and management with Kylo

By Matt Hutton, Scott Reisdorf | May 25, 2017
This webcast will introduce Kylo, an open-source data lake platform based on Apache Spark and NiFi.

Infusing Modern Data Visualization & Analytics into Your App

By Jay Bala, Ian Fyfe | April 25, 2017
Join us in this webcast as we discuss whether to build or buy an integrated analytics solution, how to evaluate the available offerings for embedded analytics, how to most efficiently embed data visualization and analytics into your product, and more...

API-led Integration in a Multi-cloud World

By Rahul Kamdar | April 20, 2017
Register for the webcast, API-led Integration in a Multi-cloud World, on April 20th as we explore how to build for a multi-cloud world and leverage pre-existing code, projects, and assets.

Machines and the Magic of Fast Learning

By Mike Boyarski | April 06, 2017
Steven Camiña, MemSQL Product Manager, will walk through critical technologies needed in your technology ecosystem, including Python, Apache Kafka, Apache Spark, and a real-time database.

Practical strategies for data unification, with Dr. Michael Stonebraker

By Michael Stonebraker | March 21, 2017
Join Turing Award winner Dr. Michael Stonebraker for an O'Reilly Media webcast on why schema-first strategies for data unification are a disaster.

How to Scale Different Data Models

By Greg Meddles | March 02, 2017
In this webcast, join Greg Meddles, Sr. Principal Consultant for MarkLogic, to learn how complex applications and systems must be designed to scale horizontally — both at the software and the hardware level.

An Intro to Predictive Modeling for Customer Lifetime Value

By Jean-René Gauthier | February 28, 2017
In this webcast, we explain the ins and outs of probabilistic models that can be used to quantify the future value of a customer.

How to get started with DevOpSec

By Pete Cheslock | February 23, 2017
This webcast will discuss how to improve your security posture, whether you have a security team or not.

Spark and Java - Yes they work together!

By Jesse Anderson | January 24, 2017
In this webcast, Jesse Anderson demonstrates how to create Java lambdas and integrate them with Spark to process data.

Building the Ideal Stack for Machine Learning

By Steven Camina | January 19, 2017
Steven Camiña, MemSQL Product Manager, will walk through critical technologies needed in your technology ecosystem, including Python, Apache Kafka, Apache Spark, and a real-time database.

How to build a successful enterprise data lake

By Alex Gorelik | January 12, 2017
Alex Gorelik discusses the considerations of and best practices for building data lakes, with examples taken from the world's leading big data companies and enterprises.

What’s coming for big data in 2017?

By Peter Jeffcock, Jeff Pohlmann | December 13, 2016
This webcast brings together some of Oracle's leading experts to give useful—and provocative—insight into what's coming over the horizon.

Data Product Architectures

By Benjamin Bengfort | December 07, 2016
Benjamin Bengfort discusses the data product life-cycle and outlines the Lambda Architecture, demonstrating how to engage a model build, evaluation, and selection phase with an operation and interaction phase.

Data Preparation State of the Union

By Joe Hellerstein, Ihab Ilyas, Toph Whitmore | December 06, 2016
Join Tamr Co-Founder Ihab Ilyas and Trifacta Co-Founder Joseph Hellerstein as they deliver a State of the Union on data preparation and discuss how two distinct methods, self-service data preparation and enterprise data unification, solve fundamentally...

UX field research basics

By David Farkas, Brad Nunnally | December 01, 2016
Brad Nunnally and David Farkas highlight some of the key principles involved in any researcher's toolkit and explore lessons learned from the three main stages of conducting field studies.

What is a Multi-Model Database: Two paths of multi-model engineering

By Damon Feldman | December 01, 2016
In this webcast we will examine the two paths of multi-model database engineering: a single platform that allows many models on one core, versus complex integrations where many systems are pre-packaged.

How Facebook built a self-service data infrastructure

By Ashish Thusoo, Ravi Murthy | November 17, 2016
Please join Ashish Thusoo, Co-Founder and CEO of Qubole—a big data as a service company—and former head of Facebook's Data Infrastructure team that pioneered the self-service data infrastructure model as he shares just how he did this at ...

Time for Action: Bring Operational Reporting to the 21st Century

By Teodor Danciu, Ernesto Ongaro | November 16, 2016
In this webcast, you will learn how reporting is evolving as a result of modern APIs, datasources, delivery methods and architectures and how to use them to your advantage.

Solving key mobile development challenges with NoSQL

By Wayne Carter, Ali LeClerc | November 15, 2016
In this webcast, we'll use the airline industry as an example to take you through common mobile use cases and how major industries are solving for today's mobile challenges.

Integrating Apache Spark and NiFi for Data Lakes

By Ron Bodkin, Matt Hutton | November 10, 2016
This webcast will introduce Kylo, a soon-to-be-open-source data lake orchestration framework based on Apache Spark and NiFi.

Scaling During Hypergrowth: Lessons Learned at AppNexus

By Brian Bulkowski | October 25, 2016
Join us for this live webcast to hear about the operational techniques AppNexus used to scale up a core component of its infrastructure—the server-side cookie store—even as a substantial portion of the world's internet advertising flowed ...

How Democratizing Embedded Analytics is Changing the BI Game

By Aviad Harell | October 18, 2016
In this webcast, you will learn how a single-stack BI architecture will speed time to implementation while saving on hardware and software costs.

Techniques to establish your data lake: How to achieve data quality and security

By Ben Sharma | October 13, 2016
In this free webcast, you'll learn techniques that allow you to balance the flexibility a data lake can provide, with the requirements for privacy and security that are critical for enterprise data.

Deriving value from the data lake

By Nik Rouda, John Thuma | October 06, 2016
We surveyed 200 IT and business professionals to find best practices and sticking points for data lake usage. We'll share our results in this webcast, along with insights into why businesses still struggle to drive value from their Hadoop data lake...

Engineering big data solutions

By Jesse Anderson | October 04, 2016
In this webcast, Jesse Anderson (CEO, Smoking Hand) will cover some of the ways that management teams can set up their data engineering team for success.

Security 3.0: Looking out for the silent stakeholders

By Ric Messier | September 28, 2016
Ric Messier explores the concept of risk in relation to information assets. By the end of this webcast, you'll be able to appropriately determine protections for assets that are most at risk.

Techniques for identifying business value and aligning stakeholders

By Larry Lancaster | September 20, 2016
Join Larry Lancaster, Founder of Zebrium, for a blueprint on how to get buy-in for your big data project.

A quick lesson on SQL querying basics

By Thomas Nield | September 14, 2016
This hands-on lesson will walk through basic table navigation using SELECT statements, including filtering, mathematical expressions, text concatenation, and leveraging functions.

Best practices for using predictive analytics to extract value from Hadoop

By Zoltan Prekopcsak | September 13, 2016
Zoltan Prekopcsak outlines the best practices that make life easier, simplify the process, and implement results faster, helping you organize approaches and select the right approach for the task.

Designing Data Layers for Modern Web and Mobile Apps

By Bradley Holt | September 01, 2016
Learn how you can leverage both relational and NoSQL databases, as well as managed services, to improve your apps and provide compelling systems of engagement.

Scalable Data Science with R

By Roger Fried, Brian Kreeger | August 16, 2016
Join Teradata's Roger Fried, Senior Data Scientist, and Brian Kreeger, Senior Data Scientist and local R User Group organizer, with over 30 collective years of experience designing and implementing big data analytic solutions in healthcare, finance and ...

Detecting Anomalies in IoT with Time Series Analysis

By Cheryl Wiebe, Todd Morley | July 26, 2016
Join us for a live webcast where you will learn how to overcome common challenges in finding anomalies in time series data.

Deploying Mission Critical Applications on Hadoop, On-premises and in the Cloud

By James Campigli, Jim Wankowski | July 21, 2016
In this webcast, we'll cover solutions for operationalizing Hadoop to achieve enterprise-grade levels of availability and performance, both on-premises and in the cloud.

Federating Data with Presto to Build an Enterprise Data Portal

By Todd Nemet, Suraj Patel, Mark Shainman | July 19, 2016
Join us as we dive deep into helping you understand why so many of today's leading companies are using flexible data models as the foundation for their data strategies.

Turbo-Charge Enterprise Analytics on Hadoop

By Anand Bisen, Tony Wu | July 07, 2016
This webcast will: discuss the technical innovations and techniques used to accelerate Apache HBase's performance by orders of magnitude, demonstrate ultrafast Apache HBase running on EMC DSSD & Cloudera CDH 5.6, and highlight the core use cases ...

Best practices for streaming applications

By Mark Grover, Ted Malaska | June 21, 2016
Mark Grover and Ted Malaska offer an overview of projects that can be used for streaming applications, including Kafka, Flume, and Spark Streaming, and discuss the various architectural schemas available, such as Lambda and Kappa Architectures.

Understanding Metadata: Why it's essential to your big data solution and how to manage it well

By Ben Sharma, Vikram Sreekanti | June 21, 2016
In this O'Reilly webcast, Ben Sharma (cofounder and CEO of Zaloni) and Vikram Sreekanti (software engineer in the AMPLab at UC Berkeley) discuss the value of collecting and analyzing metadata, and its potential to impact your big data solution and your...

Predictive maintenance meets predictive analytics

By Danielle Dean | June 16, 2016
In a talk aimed at data scientists, students, researchers, and nontechnical professionals, Danielle Dean introduces the landscape and challenges of predictive maintenance applications in the manufacturing industry.

Distributed deep learning on Spark

By Alexander Ulanov | June 15, 2016
Alexander Ulanov offers an overview of a number of different tools and frameworks that have been proposed for performing deep learning on Spark and compares them.

Get Your Data Lake Right the First Time

By Jeffrey Breen | June 14, 2016
This webcast shows lessons learned from over a dozen data lake implementations.

Enterprise-grade Hadoop: Unlock the value of the data lake and scale without compromise

By David Wang | June 09, 2016
HPE's hundreds of data scientists, and 3000+ dedicated global analytics & data management professionals are ready to help you unlock the value hidden in your data.

How to Leverage Spark and NoSQL for Data Driven Applications

By Will Gardella, Michael Nitschinger | May 31, 2016
Your web, mobile, and IoT applications generate an endless stream of information that can improve the operational efficiency and insight of your business – but only if you have the right technology to quickly capture and analyze the data.

Efficient state management with Spark and in-memory databases

By Barzan Mozafari, Jags Ramnarayan | May 25, 2016
Barzan Mozafari and Jags Ramnarayan present a design that combines Spark with an open source in-memory database—SnappyData—that equally scales and collocates its partitions with those of Spark, effectively offering state for stream processing...

Big data for business outcomes: How to begin and follow-through on your data strategy

By Carey James | May 24, 2016
In this webcast, Carey James will give an overview of key milestones along the big data journey, obstacles that many organizations encounter in the process, and technology solutions to consider at each step.

Don’t forget the fourth V - veracity! What you need to know about data quality in Hadoop.

By Scott Arnett | May 19, 2016
In this webcast, you will learn how to blend traditional questions with a new way of thinking to ensure data quality.

Learning to love Bayesian statistics

By Allen B. Downey | May 18, 2016
In this webcast I unpack these myths and explain the pros and cons of Bayesian methods compared to classical statistics.

KeystoneML: Optimized large-scale machine-learning pipelines on Apache Spark

By Evan Sparks | May 17, 2016
You'll learn the KeystoneML programming model, how to work with KeystoneML to construct new pipelines, how salient aspects of the KeystoneML optimizer work, and how KeystoneML achieves high performance and scalable model training while maintaining a ...

Ensuring QoS in Multi-tenant Hadoop Environments: Eliminate contention and guarantee SLAs

By Sean Suchter | May 17, 2016
Join Pepperdata co-founder and CEO Sean Suchter in this webcast to learn how you can automatically run more jobs faster, automatically prevent performance issues, and spend 90% less time troubleshooting.

Creating Addictive Data Experiences in Modern Applications

By Ernesto Ongaro, Steve Wexler | May 03, 2016
Join us to learn about approaches for embeddable analytics so you can delight customers while alleviating internal development pressure.

Stream and state: Robust and stable solutions for streaming transactions

By John Hugg | April 14, 2016
In this webcast, John Hugg, Founding Engineer at VoltDB, will explore what's possible when systems integrate event processing with state management in a consistent, transactional way.

Building a Real-time Streaming Platform Using Kafka Streams and Kafka Connect

By Jay Kreps | April 07, 2016
This presentation will give a brief introduction to Apache Kafka and describe its use as a platform for streaming data.

How Data Science and Spend Analytics Found $100 Million+ in Savings

By Matthew Holzapfel, Eliot Knudsen | April 05, 2016
In this hands-on session, you'll learn about how Big Data has changed the definition of spend analytics.

Data Modeling, Data Querying, and NoSQL: A Deep Dive

By Laurent Doguin, Prasad Varakur | March 31, 2016
In this talk we'll look at how it all starts by getting the data model right and understanding what patterns can help avoid common traps like hot spots and problems with concurrent access.

Continuous Applications: Spark, Kafka, Beam, and Beyond

By Ron Bodkin | March 24, 2016
In this webcast, we look at options for building complex analytic big data applications including tradeoffs for simplicity, completeness, and changing semantics over time backed by rich query engines.

Fast and Simplified Streaming, Ad-Hoc and Batch Analytics with FiloDB and Spark Streaming

By Evan Chan, Helena Edelson | March 16, 2016
In this webcast, we will demonstrate some use cases of FiloDB that have enabled flexible ad-hoc analytics for a new generation of real-time Spark and Spark Streaming applications, while also simplifying the stack.

How the machine learning wave is changing the way organizations look at analytics

By Patrick Hall, Andrew Pease | March 10, 2016
In this webcast, we will identify how organizations are automating analytic processes in order to free up time for new analytics, new data, and new business problem domains, ultimately creating real competitive advantage.

Overcome the limitations of distributed computing with real-time intelligence

By Sean Suchter | March 01, 2016
Join us for this webcast as we survey some of these “best practices” and offer up some new ways to address the performance gap. We’ll also tell you the warning signs to look out for, so you can assess the health and production readiness...

Strategic approaches to real-world architectural challenges

By Lawrence Finn, Robert Hurlbut, Juval Löwy, Alex Silva | March 01, 2016
Join four seasoned software architects as they recount larger-than-life architectural challenges and share their strategies and solutions for dealing with them.

Concurrency, co-existence and complexity: Three keys to implementing SQL on Hadoop in the real world

By Satish Sathiyavageswaran, Hochan Won | February 02, 2016
SQL has long been the most widely used language for big data analysis. The SQL-on-Hadoop ecosystem is loaded with both commercial and open source alternatives, each offering tools optimized for various use cases. Fledgling analytical engines are in incubation...

Where, When and How Fast: Harnessing Geospatial and Data Replication in Next Gen Fast Data Apps

By John Piekos | January 28, 2016
It’s a data-intensive world and your applications can only perform as fast as your data infrastructure. Building next gen applications now often requires geospatial analytics support, an advanced data pipeline and data replication...

Deliver business impact: Practical approaches to interactive data discovery and predictive analytics

By Tapan Patel, Wayne Thompson | January 27, 2016
In this webcast, we will use SAS Visual Analytics and SAS Visual Statistics to teach you how to quickly identify predictive drivers.

Big Data in the Enterprise: We Need an “Easy Button” for Hadoop

By Michael A. Greene, Kumar Sreekanti | January 26, 2016
Big data adoption has moved from experimental projects to mission-critical, enterprise-wide deployments that deliver new customer insights, competitive advantage, and business innovation. According to IDC, the big data market is growing six times faster...

Hybrid Data Architectures: Unified Hadoop Data Management — From Ground-to-Cloud

By Ben Sharma | January 21, 2016
During this webcast we will discuss a few key areas that will help ensure that you end up with a unified data lake and data management story.

Search: Harnessing the most intuitive interface for data analytics

By Ari Gesher, Anand Raghavan | December 17, 2015
In this webcast, you'll learn how to get beyond the create vs. consume dichotomy, by giving all users a familiar tool to interact with their data.

How to build an anomaly detection engine with Spark, Akka and Cassandra

By Natalino Busa | December 16, 2015
This webcast presents a solution for streaming anomaly detection: Coral. The Coral system is composed of three elements: a machine learning module, an event processing scoring module, and a data store that is implemented using Spark, Akka, and Cassandra...

Eliminating the Data Bottleneck in Procurement

By Matthew Holzapfel, Eliot Knudsen | December 14, 2015
In this webcast, you'll learn about the breakthrough methods in data preparation for gaining accurate procurement insight in days instead of months.

Intelligent Integration: How HP Vertica Uses Apache Kafka and Spark to Gain Quicker Business Insights

By Eamon O'Neill, Jeff Veis | December 10, 2015
Join Eamon O'Neill as we demonstrate how HP Vertica, a leading SQL analytics database, natively integrates with Apache Kafka and Spark to ingest data streams and high-volume, in-memory data for large-scale machine learning and graph analytics.

Building a Continuously Curated Ingestion Pipeline: Recipes for Success

By Arvind Prabhakar | December 09, 2015
In this webcast you will discover: Recipes for building automated ingest pipelines that implement continual in-stream sanitization so that data lands in stores ready to consume, regardless of the complexity of collecting it.

Doing the work: Practical approach to Data Preparation for Advanced Analytics and Machine Learning.

By Michael Ames, Ryan Schmiedl | December 08, 2015
In this webcast, the guys from SAS will share their approach to acquiring, structuring and profiling data for analysis.

Secure Because Math? Challenges on Applying Machine Learning to Security

By Alex Pinto | December 08, 2015
This presentation will describe how information security is a different problem and the challenges intrinsic to this specific field that many first entrants seem to ignore.

Real Time Data Streams: What’s in it for Me?

By Oliver Ratzesberger | November 16, 2015
In this Webinar, you will hear Oliver Ratzesberger share his unique insights on best practices for ingesting and distributing data streams.

Essential Elements of a Data Driven Culture

By Carl Anderson, Jen Grant | November 12, 2015
Join us to get ideas for your data governance, analytic use cases and processes for getting people to use the data and not fight over it.

The Offline Challenge: Delivering Mobile Apps that Always Work

By Wayne Carter, Ali LeClerc | November 10, 2015
In this webcast, you'll learn how to build a mobile app that has a consistent user experience, both online and offline.

Building Real-Time Data Pipelines through In-Memory Architectures

By Eric Frenkiel | November 10, 2015
In this webcast we will explore how organizations can build real-time data pipelines to process information at high speed, allowing your enterprise to make and act on decisions faster.

Reporting for Roomlia Using N1QL, The Query Language for NoSQL

By Vince Valenti, Keshav Murthy | November 05, 2015
Join Vince Valenti, CTO of Roomlia, as he discusses how Roomlia is using the Couchbase query language N1QL to dynamically extract data directly from Couchbase with less developer intervention.

Think like a data scientist: Build your big data blueprint

By Bill Schmarzo | November 05, 2015
During the session, Bill will show you how you can efficiently extract business value through insights gained from new and existing data sources.

Event Analytics in Hadoop: Analyzing Cross-Channel Customer Behavior

By Bill Kornfeld | November 04, 2015
Join Dr. Bill Kornfeld, director of R&D for Think Big, a Teradata company, to explore how to build a cross channel event repository in Hadoop that sorts data from multiple channels by user, with both batch and real-time versions, each designed for...

From Monoliths to Microservices: How Yelp Changed Operations with Metrics and Analytics

By Sam Eaton, Karthik Rau | October 29, 2015
Sam Eaton, Director of Operations at Yelp, will discuss the operational side of Yelp's transition to microservices. Karthik Rau, CEO and co-founder at SignalFx, will discuss how SignalFx was built for the operational world created by DevOps and microservices...

Harnessing Metadata for Data Governance in Hadoop

By Salimah Addetia | October 22, 2015
In this webcast, we briefly outline the options available for data governance in Hadoop today and identify the advantages of protecting data using Accumulo.

Don’t Be Frightened by Moving to NoSQL

By Peter Milne | October 22, 2015
In this presentation, Peter Milne will take you through a comparison of NoSQL technologies.

The Race to Develop Mature SQL on Hadoop

By Hochan Won, Satish Sathiyavageswaran | October 20, 2015
In this webcast, learn how HP has taken the robust and complete Vertica query engine and opened it up to the Hadoop world.

Putting Mobile and Health Data to Work

By Ian Eslick, Roger Magoulas, Rob Rustad, Tuhin Sinha | October 14, 2015
Join Ian Eslick, Tuhin Sinha, Roger Magoulas, and Rob Rustad (moderated by Roger) as they share their experience of putting personal health data to work within a regulated hospital IT environment.

Managing the Data Lake: Creating Actionable Insights and Value

By Ben Lorica, Ben Sharma | October 13, 2015
During this webcast we will discuss a few key areas that will help ensure that you end up with a data lake that is carefully planned, governed, flexible and responsive to the ever-changing needs of your organization.

A business user’s guide to big data on Hadoop

By Ioana Hreninciuc, Andrew J Brust | September 23, 2015
This webcast will give an overview of deploying Hadoop within the organization as a strategic initiative for business advantage.

The New Database Era: The Convergence of Streaming Analytics with Transactions

By Scott Jarr, Ben Lorica | September 17, 2015
During this webcast you will learn the pros and cons of the various approaches used to create fast data applications.

Introduction to Tachyon and a deep dive into Baidu’s production use case

By Haoyuan Li, Shaoshan Liu | September 14, 2015
In this webcast, Haoyuan Li from Tachyon Nexus will present an overview of Tachyon, as well as some recent developments and use cases. After that, Shaoshan Liu from Baidu will present their experience with Tachyon.

Improve Performance of Database-Backed Applications with a Geographically-Distributed Database

By Joe Lichtenberg, Don Pinto, Dominic Satur | September 10, 2015
In this presentation, you will learn how new database technologies have made it practical to run replicated, active database servers in multiple locations around the world, delivering better performance for online traffic patterns.

Deep dive into Project Tungsten: Bring Spark closer to bare metal

By Josh Rosen | September 03, 2015
In this talk, we will give an update on Project Tungsten's progress and dive into some of the technical challenges we are solving.

Get ready for the office football pool (using Python)

By Tanya Schlusser | September 01, 2015
Here's a simple look at predicting game outcomes and choosing fantasy players using Pandas, Scikit-Learn, and a couple of years of historical data.

Reach the Cloud with Big Data and Advanced Analytics

By Jennifer Reed, Camil Samaha | September 01, 2015
Join Novetta Director of Product Management Jennifer Reed and Amazon Web Services Manager, Solutions Architecture, Camil Samaha to learn how organizations can use their Big Data to answer critical business questions, make data-centric business decisions...

Easy, reproducible reports with R

By Garrett Grolemund | August 26, 2015
The R Markdown package makes it very easy to generate reports straight from your R code. This webcast will cover applying the same report to multiple data sets.

Apache Spark Solution for Rank Product

By Mahmoud Parsian | August 25, 2015
In this webcast Mahmoud Parsian will present two distinct Spark solutions (using groupByKey() and combineByKey()) for solving the rank product.

The connected car: An example of streaming real-time analytics

By Michael Minella | August 20, 2015
In this webcast session we will explore the power of Spring XD in the context of the Internet of Things (IoT).

Tame the firehose with Elasticsearch and Spark

By Anirudh Koul, Shashank Singh | August 12, 2015
We will discuss several aspects including design of search cluster, experimentation setup for performance tuning, learnings from cloud services, fault tolerance, monitoring, customer facing APIs, lowering costs and other best practices, to get the most...

Easy, real-time access to data with Apache Drill

By Matt Aslett, Piyush Bhargava, Jacques Nadeau, Steve Wooledge | July 30, 2015
In this panel discussion, Matt Aslett from 451 Research, together with an end user and the Apache Drill architect, will explore the major role SQL-on-Hadoop technologies play in organizations and cover real implementation stories.

Introduction to D3.js: Demystifying the challenges

By Stephen Thomas | July 29, 2015
This session introduces D3 starting with its underlying philosophy. We'll also walk through example code and see some of the nifty visualizations that no other JavaScript library can support.

Extending Cassandra with Doradus OLAP for High Performance Analytics

By Randy Guck | July 29, 2015
This webcast will introduce Doradus OLAP and cover topics such as: how Doradus OLAP extends Cassandra from the outside, techniques used for achieving extreme compression, and the Doradus data model including its support for bi-directional relationships...

Integrating Customer Data at Scale

By Alan Wagner, Matt Stevens | July 28, 2015
Join Toyota Motor Europe General Manager Matt Stevens and Tamr field engineer Alan Wagner for this webinar to see how leading enterprises are radically simplifying the construction of a comprehensive, 360-degree view of their customers.

All-vs-all: Efficient correlation using Spark/Hadoop

By Mahmoud Parsian | July 23, 2015
The webcast covers Pearson and Spearman correlations implemented in Spark/Hadoop.

Apache Spark 1.4 presented by Databricks co-founder Patrick Wendell

By Patrick Wendell | July 08, 2015
In this webcast, Patrick Wendell from Databricks will be speaking about Spark's new 1.4 release.

Apache Phoenix: The evolution of a relational database layer over HBase

By James Taylor | June 25, 2015
This webcast will begin by giving a State of the Union of Apache Phoenix, a relational database layer on top of HBase for low latency applications, with a brief overview of new and existing features.

Building a Fast Data Front End for Hadoop

By John Hugg | June 24, 2015
During this webcast you will learn the pros and cons of the various approaches used to create fast data applications.

Data Governance: Building the Foundation for Analytics

By Jennifer Reed, Henry Mlodozeniec | June 18, 2015
In this webcast, we will discuss the integral role that the four pillars of data governance (data privacy, data quality, process integration, and data lifecycle management) play in your strategy.

Simplicity Scales - Big Data Application Management & Operations

By Tyler Hannan | June 16, 2015
In this presentation, Tyler Hannan discusses how you can simplify the management of the technologies required to support your Big Data applications and offers practical considerations for choosing the right tools for the job.

Big data consultant Rich Morrow introduces Apache Hadoop

By Rich Morrow | May 21, 2015
In this fast-paced, interactive, info-rich webcast led by Rich Morrow, we'll discuss the value proposition of Hadoop, common use cases, and how to run it on premises and in the cloud via AWS EMR.

Anomaly Detection and Self-Learning Monitoring Systems

By Alois Reitbauer | May 14, 2015
This webcast will discuss the latest developments in building intelligent monitoring systems.

Organizing the Internet of Things

By Boris Adryan | May 07, 2015
In this webcast I aim to introduce the three main branches (localization, function, and process) that we use in GO and demonstrate how they're immediately applicable in the IoT; after all, a cell is just a large, interconnected system.

Move Your Enterprise from a Table Centric View to an Entity Centric View

By Kris Heim, Jennifer Reed, Carl Zmola | May 07, 2015
You have successfully stored large amounts of raw data in Hadoop for advanced analytics. But now what? How do you analyze this data from a perspective that makes it meaningful and provides actionable insight?

Creating Meaningful Metrics That Get Your Users to do the Things You Want

By Buddy Brewer, Steve Souders, Mark Zeman | April 23, 2015
In this session, Buddy Brewer from SOASTA, along with Mark Zeman and Steve Souders from SpeedCurve, turn the tables on Big Data and illustrate how identifying and focusing deeply on a few meaningful metrics facilitates far better decision making.

Using Flume: Integrating Flume with Hadoop, HBase and Spark

By Hari Shreedharan | April 22, 2015
In this webcast, Hari Shreedharan, the author of Using Flume, will discuss how to use Flume to write data to HDFS, HBase and Spark.

Women in Data: Their Work and Achievements

By Michele Chambers, Cornelia Lévy-Bencheton, Renetta Tull, Alice Zheng, Laurie Skelly | April 16, 2015
Join us in this webcast in which female data practitioners discuss their work, their achievements, and the attitudes that have propelled them forward to career success.

News from Scikit-Learn 0.16 and Soon-To-Be Gems for the Next Release

By Olivier Grisel, Andreas C Müller | April 02, 2015
This webcast will review Scikit-learn, a widely used open source machine learning library for Python, and discuss some of the new features of the recent 0.16 release.

Making Sense of Spark Performance

By Kay Ousterhout | April 01, 2015
In this talk, I'll take a deep dive into Spark's performance on two benchmarks (TPC-DS and the Big Data Benchmark from UC Berkeley) and one production workload and demonstrate that many commonly-held beliefs about performance bottlenecks do not hold.

Apache Spark 1.3 and Spark’s New Dataframe API

By Patrick Wendell | March 25, 2015
In this webcast, Patrick Wendell from Databricks will be speaking about Spark's new 1.3 release.

Entity Resolution on Hadoop: The Pitfalls of Building It Yourself

By Dave Moore, Jennifer Reed | March 24, 2015
In this webcast, hear from a solutions architect and a product manager who talk to organizations every day about Hadoop and their entity resolution and analysis requirements.

Understanding SQL on Hadoop and Distributed R

By Steve Sarsfield, Sunil Venkayala | March 10, 2015
In this webcast, you'll learn about leveraging multiple nodes for predictive analytics to vastly improve performance, performing R analysis on larger data sets while overcoming scalability limitations of R, and using HP Haven on Hadoop capabilities to perform...

Taming Data Variety: Intelligent Solutions Using Machine Learning and Expert Crowdsourcing

By Andy Palmer | March 05, 2015
During a 30-minute webinar, join data-industry veteran Andy Palmer as he discusses how enterprise organizations are leveraging new approaches to delivering the cleanest, widest view of data to downstream analytic tools.

Mission Critical NoSQL

By Dipti Borkar | March 03, 2015
In this webinar, Dipti Borkar, Sr. Director of Solutions Engineering at Couchbase, will give a brief overview of Couchbase Server, a document database, and its underlying distributed architecture.

TimeSeries AggregatoR

By Anirudh Todi | February 21, 2015
In this webcast I'll introduce TSAR (the TimeSeries AggregatoR), a robust, flexible, and scalable service for real-time event aggregation designed to solve this problem and a range of similar ones.

Modular Apps: The Building Blocks of Big Data Business Value

By Chris Twogood | February 12, 2015
Attend this session to hear about new solutions and methodologies to quickly build, deploy and share big data apps for faster time to value.

2015 Data Preview: Spark, Data Visualization, YARN, and More

By Alistair Croll | February 04, 2015
Get a sneak peek with this free online conference, featuring many of Strata's most sought-after speakers and hottest topics.

Crowdsourcing at GoDaddy: How I Learned to Stop Worrying and Love the Crowd

By Adam Marcus | January 22, 2015
During the webcast, Adam Marcus will highlight how to build human-machine hybrids and benefit from active learning workflows.

Data Science in the Cloud with Microsoft Azure Machine Learning and R

By Stephen Elston | January 20, 2015
The Microsoft Azure Machine Learning cloud platform provides simplified yet powerful data management, transformation and machine learning tools.

Intro to Bitcoin: What is Bitcoin and How Does it Work?

By Lorne Lantz | January 14, 2015
Lorne will share with you the values bitcoin can bring to the world and how the innovative technology works.

Apache Spark 1.2 and Beyond!

By Patrick Wendell | January 13, 2015
In this webcast, Patrick Wendell from Databricks will be speaking about Spark's new 1.2 release.

Data-Informed Design

By Pamela Pavliscak | January 08, 2015
This webcast walks through how to identify the data that means the most to user experience and how to use it to make smart decisions about design.

Rapid Prototyping Web Applications Using Node.js and MongoDB

By Jason Krol | January 07, 2015
In this hands-on webcast you will learn how to leverage Node.js and Express.js to quickly bootstrap a web server, tie in MongoDB to persist data, and display it all using dynamic HTML templates.

Moving Beyond Bitcoin (BINO) Beta

By Tim Swanson | January 07, 2015
This webcast presentation discusses several proposed solutions, currently being devised by a multitude of teams, to the challenges.

Bitcoin and the Future of Money

By Andreas Antonopoulos | December 17, 2014
Join this webcast to learn what bitcoin is, what makes it special, how to get it and how to use it.

Next Gen Leaders Set Pace For New Wave of Solutions

By Rod Smith | December 09, 2014
This talk will focus on new approaches for business leaders looking to harness data solutions, tools and platforms for building robust business...

Getting Started with Impala - Interactive SQL for Apache Hadoop

By John Russell | December 04, 2014
You can write, tune, and port SQL queries and other statements for a Big Data environment using Impala, the open source, MPP SQL query engine for Apache Hadoop.

The Future of Bitcoin: A Data-Driven Perspective

By Kieren James-Lubin | December 03, 2014
Taking a holistic, data-driven perspective, Kieren James-Lubin will project where Bitcoin might be in a decade.

Data Hiding — A look at the Latest Techniques and Countermeasures

By Michael Raggo, Chet Hosmer | November 18, 2014
This webcast will highlight some of the latest research of the 21st century involving data hiding techniques over the network and with data-at-rest.

Spark + Cassandra: Technical Integration Details

By Sameer Farooqui | November 12, 2014
This webcast will cover an architecture deep dive around how the Apache Cassandra database integrates with the Apache Spark computation engine.

Thoughtful Machine Learning: Sentiment Analysis Using Support Vector Machines in Ruby

By Matthew Kirk | November 11, 2014
Join us for this webcast, where we'll detect sentiment in tweets using support vector machines.

Revealing the Uncommonly Common with Elasticsearch

By Mark Harwood | October 30, 2014
This webcast will discuss how Elasticsearch is taking search engine technology and branching it out from its roots in relevance-ranking search results to providing more insightful analysis of large datasets.

Get More Value out of Multiple Hadoop Data Centers

By Brett Rudenstein | October 23, 2014
In this webcast, we'll examine how to get the most out of your multi-data center Hadoop investment.

Penalized Linear Regression in Python

By Luis Pedro Coelho | October 22, 2014
In this webcast, learn how to use IPython notebooks and scikit-learn to explore a dataset with different forms of regression and how to choose between them for your specific problem.

Lies, Damn Lies, and Metrics

By André Arko | October 22, 2014
This webcast talk covers common blind spots in instrumentation and metrics setups. It also tries to help inculcate a mindset that cares about real-world results rather than the dubiously accurate numbers showing up on a web page.

Hadoop Means Business: The Changing Role of Hadoop in Business Outcomes

By Karthik Kulkarni, Farid Jiandani | October 14, 2014

Beating Billion Dollar Fraud Using Anomaly Detection

By Ian Howells, Billie Rinaldi, Arshak Navruzyan | October 08, 2014
This presentation will review the approach Argyle Data has taken to develop a real-time fraud analytics application using anomaly detection at scale, building on open source technology developed at the NSA (Accumulo) and Facebook (Prestodb) on the Hortonworks...

Connecting Your Business to Web Performance: The Philips Journey

By Hakim el Fartasi | October 02, 2014
You will hear how the performance team at Philips improved the experience of their mobile website customers.

Spark 1.1 and Beyond!

By Patrick Wendell | October 02, 2014
In this webcast, Patrick Wendell from Databricks will be speaking about Spark's new 1.1 release.

Becoming a Utopian Data-Driven Enterprise: Lessons From the Early Adopters of Predictive Intelligence

By Amanda Kahlow | October 01, 2014
In this webcast led by Amanda Kahlow, we'll discuss how companies, specifically their sales and marketing teams, are evolving their processes today to accommodate a new reality where data replaces guesswork.

Learning QlikView Data Visualization

By Karl Pover | September 23, 2014
In this webcast, let's review how easy it is to take advantage of QlikView's associative data model and learn the power of data discovery so we can avoid the grave mistake of mimicking spreadsheets and static reports in QlikView.

Building a Data Refinery

By Chuck Yarbrough | September 23, 2014
Join this conversation with Ben Lorica from O'Reilly Media and Chuck Yarbrough from Pentaho as they discuss the merits and advantages of a big data refinery, and learn for yourself if this emerging architecture is right for your organization.

Apache ZooKeeper and The Art of Building Distributed Systems

By Flavio Junqueira | September 17, 2014
In this webcast we'll cover some basic concepts of ZooKeeper, its design choices, and caveats; give examples of recipes one can implement with the ZooKeeper API; elaborate on the potential problems with naïve recipes; and discuss ways around such problems...

It’s About Time: Using Temporal Visualization Techniques to Give Data More Meaning and Context

By Hunter Whitney | September 16, 2014
This webcast will include key ideas, techniques, and practical applications to represent and explore event sequences and their temporal patterns.

Putting Data to Work

By Alistair Croll | September 16, 2014
In this free preview event, we'll look at a cross-section of the New York lineup. Join some of Strata's keynotes and most sought-after presenters for a preview of what they'll be discussing this October.

Understanding Complexity by Clustering Data with Machine Learning and R

By Brett Lantz | August 28, 2014
In this webcast, we'll explore how R's open-source clustering packages can discover patterns in large data and inform our understanding of complex sociological phenomena such as teenage identities.

Communicating Data with Tableau

By Ben Jones | August 26, 2014
In this webcast, O'Reilly author Ben Jones will use an example from his new book, Communicating Data with Tableau, in which the growth of the internet worldwide is visualized in multiple ways.

Real-world Active Learning

By Lukas Biewald | August 21, 2014
Machine learning research is often not applied to real world situations. Often the improvements are small and the increased complexity is high, so except in special situations, industry doesn't take advantage of advances in the academic literature.

Data Science at the Command Line

By Jeroen Janssens | August 20, 2014
Whether you're entirely new to the command line or already dream in shell scripts, by the end of this webcast you will have a solid understanding of how to leverage the power of the command line for your next data science project.

What’s New in Scikit-learn 0.15 and What’s Cooking in the Development Branch?

By Olivier Grisel | August 13, 2014
This webcast will introduce scikit-learn, an open source project for machine learning in Python, and review some new features from the recent 0.15 release, such as faster randomized ensembles of decision trees and optimization of memory usage when ...

Getting the Most Out of Your NoSQL DB

By Alex Bordei | August 07, 2014
In this webcast Alex Bordei will look at how Impala, Elasticsearch and Couchbase perform when scaled vertically and horizontally, over a number of different bare metal setups.

An introduction to Apache Accumulo

By Donald Miner | August 05, 2014
This webcast will cover the basics of Apache Accumulo architecture and how it works, along with examples of how it is used.

Up Your R Game

By Bill Franks, James Taylor | July 29, 2014
This webcast discusses requirements for R as it evolves into a big data and enterprise-analytic solution and presents a novel approach to making open source R massively scalable, reliable, and easy to use.

How to Get Started with Deep Learning in Computer Vision

By Pete Warden | July 24, 2014
In this webcast Pete Warden will walk through some popular open-source tools from the academic world, and show you step-by-step how to process images with them.

TPS & TB - When to Scale Up and How to Scale Out

By Brian Bulkowski | July 23, 2014
This webcast will describe a number of real-time big data driven applications, emerging data management architectures and the increasing need for applications to handle mixed read/write workloads as well as transactions and analytics on hot data.

Analyzing Data with Python

By Sarah Guido | July 09, 2014
In this webcast led by Sarah Guido, you'll get a bird's eye overview of some of the best tools for data analysis and how you can apply them to your workflow.

Super Simple Real-Time Big Data Backend: Crate Data

By Jodok Batlogg | July 08, 2014
In this webcast we will demonstrate, step by step, how a web service can be deployed with the full service stack (data and application) on a single node, and how nodes can then be added as needed just by starting them.

Big Data, Fast Data: The Need for In-Memory Database Technology

By Michael Stonebraker, Scott Jarr | June 25, 2014
In this webcast, Scott Jarr, co-founder and chief strategy officer at VoltDB, will discuss the new corporate data architecture — and the necessary technology components for facing this data management challenge.

Derivatives Analytics with Python

By Yves Hilpisch | June 24, 2014
In this webcast you will learn how Python can be used for Derivatives Analytics and Financial Engineering.

Scalable Data Science on a Laptop

By Alice Zheng | June 24, 2014
In this webcast, we'll demonstrate doing scalable data science using GraphLab Create, an end-to-end platform for prototyping and deploying data products.

Building your Own USB Devices for AVR with the V-USB Library

By Elliot Williams | June 20, 2014
This webcast will walk you through two example projects: a custom scrollwheel mouse and a USB temperature controller.

Data Analysis on Streams

By Mikio Braun | June 12, 2014
In this webcast, Mikio Braun will discuss building reliable and efficient solutions for real-time data analysis, including approaches that rely on scaling, both batch-oriented (such as MapReduce) and stream-oriented (such as Apache Storm and Apache ...

Computational Thinking: Just Enough Math

By Paco Nathan | June 04, 2014
In the webcast, we'll review some of the historical context that led to machine learning techniques.

Hands On Trove: Database as a Service in OpenStack

By Florian Haas | May 21, 2014
In this webcast, Florian Haas will cover the architecture of Trove and demonstrate the deployment of OpenStack Trove on an OpenStack private cloud in order to provide MySQL DBaaS to OpenStack users.

I ♥ Logs: Apache Kafka and Real-time Data Integration

By Jay Kreps | May 21, 2014
This webcast talk will discuss how logs and stream-processing can form a backbone for data flow, ETL, and real-time data processing.

Performing Advanced Analytics on Relational Data with Spark SQL

By Michael Armbrust | April 29, 2014
In this webcast, we'll examine Spark SQL, a new alpha component that is part of the Apache Spark 1.0 release.

Before the Math: Detecting Security Issues Using Exploratory Data Analysis

By Michael Collins | April 24, 2014
In this webcast, we'll discuss how to apply the art of exploratory data analysis to security questions.

Ask-Measure-Learn to Gain Actionable Insights from Your Big Data

By Lutz Finger | April 03, 2014
This webcast shows how to extract significant business value from big data with Ask-Measure-Learn, a system that helps you ask the right questions, measure the right data, and then learn from the results.

Better, Faster Business Analytics with In-memory Databases

By Mark LaRosa | April 02, 2014
In this webcast, we will look at the benefits of in-memory technology and the business value it brings. Attendees will also have a chance to see the speed, scale, and simplicity of MemSQL’s in-memory solution, and why it is evolving ...

Why Big Data Needs Thick Data

By Tricia Wang | March 28, 2014
Big Data can help predict the future, but can too much Big Data be dangerous for your organization? Yes, says global tech ethnographer Tricia Wang. This webcast examines the risks of over-reliance on big data and the need to bring in Thick Data...

Thinking with Data

By Max Shron | March 13, 2014
This webcast examines a framework for incorporating ideas from other fields (like design, argument studies, and consulting) into Data Science.

Hadoop Adventures at Spotify

By Adam Kawa | February 27, 2014
In this webcast talk led by Adam Kawa, we will talk about the real-world Hadoop issues that either broke our cluster or made it very unstable, especially when we were growing very fast from a 60-node to a 690-node Hadoop cluster.

Instant Visualization in Every Step of Analysis

By Karen Hsu | February 27, 2014
In this webcast, we'll discuss how IT and business users can leverage self-service visualizations to quickly spot and correct data anomalies throughout the analytic process.

Data Quality Demystified: Knowing When Your Data is Good Enough

By Ken Gleason | February 13, 2014
This webcast introduces a simple conceptual framework for thinking about data quality and strategies for evaluating quality proactively to improve results and reduce unnecessary repetition.

How to Get Statistics Right in AB Testing: The Short Answer

By Zack Exley, Sahar Massachi | February 05, 2014
In this webcast talk we'll present simple methods that we believe accurately predict future performance from AB test results, and that allow us to determine the smallest acceptable sample size. Using four years of AB testing data, we'll show that these...

Data Everywhere: Data Anthropology, Quantified Self, Machine Data, Human Centered Design, and more

By Alistair Croll | February 04, 2014
In this free online conference, we'll be showcasing some of the hot topics and thought-provoking speakers who will be joining us for the event. It's your chance to see what we're covering and to find those can't-miss tracks and sessions.

Predictive Analytics, Machine Learning, and Recommendation Systems on Hadoop

By Wayne Thompson, Georgia Mariani | January 30, 2014
Join us to learn more about how to reveal insights in your big data and redefine how your organization solves complex problems.

Managing Large Datasets with Python and HDF5

By Andrew Collette | January 28, 2014
This webcast provides a practical, Python-based introduction to the world of HDF5.

Getting Started Running Apache Spark on Apache Mesos

By Paco Nathan | January 24, 2014
This tutorial shows a simple way to launch a Mesos cluster in the cloud, how to configure and run Spark on Mesos, and then how to run jobs in Spark.

The End of the Analytics Black Box

By Steven Hillion, Joel Horwitz | January 23, 2014
Please join Steven Hillion, Alpine Chief Product Officer, and Joel Horwitz, Alpine Head of Product Marketing, as they share the story of how to take advanced analytics out of the black box and into the hands of every decision maker in your organization...

Forward Thinking for Tomorrow's Projects: Requirements for Business Analytics

By Joy Beatty | January 23, 2014
In this webcast presentation, Joy Beatty, VP of R&D at Seilevel, offers advice on tackling requirements for business analytics projects. Drawing from the book she co-authored with Karl Wiegers, Software Requirements 3rd Ed., Joy will outline how ...

A Detailed Look at Pandas' Indexes

By Trent Hauck | January 22, 2014
Join Trent Hauck, author of Instant Data Intensive Apps with Pandas How-to, for a hands-on webcast where he will discuss motivations for using indexed data structures over non-indexed data structures in pandas.

From Scattered to Scatterplot in 2 Hours: An Introduction to d3.js

By Scott Murray | January 08, 2014
Confused by D3? Interested in coding data visualizations on the web, but don't know where to start? This online tutorial will have you transforming data into visual images in no time at all, starting from scratch and building an interactive scatterplot...

Data, Crime, and Conflict

By Alistair Croll | January 07, 2014
Join a lineup of thinkers and technologists for this free online event as we look at the ways data is shaping how we police ourselves, from technological innovations to ethical dilemmas.

Data Science Experiments with Twitter and IPython Notebook

By Matthew Russell | December 13, 2013
After attending this mini-workshop, you'll be able to run your own data science experiments with Twitter's API and IPython Notebook! Besides learning the fundamentals of how to use IPython Notebook, you'll learn how to do the following kinds of things...

Whatever Happened to "Augmenting Human Intellect"?

By Scott Murray | November 20, 2013
Join us for an interactive webcast presented by Scott Murray, where we explore the fundamental role of data visualization: expressing information in a form more palatable to human perception than rows and columns of raw values.

Using Every Pixel to Visualize Big Data

By Lynwood Bishop | November 08, 2013
Visualizing patterns, relationships and anomalies in multi-sourced data is challenging when the number of records continues to grow exponentially. Many traditional methods of visualization for business intelligence and reporting aggregate the results...

Turning Bigger Data Into Better Healthcare

By Michael Shoffner | November 07, 2013
This webcast presentation paints a picture of the direction clinical medicine is heading in the age of Big Data, highlighting ongoing data cyberinfrastructure development by RENCI, based at the University of North Carolina at Chapel Hill, and key partners to ...

Data and Ethics: Etiquette and Law for an Always-On World

By Alistair Croll | November 05, 2013
In this online conference, we'll look at where ethics and the law are headed in an always-on, data-driven society.

Canary in the Coalmine: How Social Data Can Prepare Us for Big Data

By Susan Etlinger | October 22, 2013

Why Twitter Is All the Rage: A Data Miner's Perspective

By Matthew Russell | October 15, 2013
To be successful, technology must amplify a meaningful aspect of our human experience, and Twitter’s success has largely depended on its ability to do this quite well. Although you could describe Twitter as just a “...

Fitter, Happier: Improve Your Health and Productivity with R

By Marc Garrett | October 15, 2013
Intridea is famous for our distributed team. We believe that letting people work from home leads to happier employees and better client outcomes. But there's one drawback: the freshman fifteen! Working from home means working close to your refrigerator...

Real-time Stream Processing and Visualization Using Kafka, Storm, and d3.js

By Byron Ellis, Justin Langseth | October 10, 2013
In this hands-on webcast you'll learn how LivePerson and Zoomdata perform stream processing and visualization on mobile devices of structured site traffic and unstructured chat data in real-time for business decision making.

Best of Strata + Hadoop World 2012: How to See Data

By Kim Rees | October 09, 2013
Join us for an exclusive presentation by Kim Rees recorded live at Strata + Hadoop World 2012.

The Best of Strata Santa Clara 2013: SQL on Hadoop

By Carl Steinbach | October 02, 2013
In this talk we will discuss the unavoidable cost and performance limitations of the connector-based approach employed by many established vendors and explain the long-term significance of Apache Hive's data model along with its influence on next generation...

Big Data and the Ethics and Challenges of Living in a Connected Society

By Alistair Croll | September 27, 2013
In late October, Strata+Hadoop World returns to the Big Apple. This year, we have a wide range of topics, from real-world case studies to hard-core data science to the ethics and challenges of a connected society.

Writing Great R Code

By Richie Cotton | September 25, 2013
Modern data analysis requires that you have two jobs: being a statistician and being a programmer. This is especially true with R, where pointing and clicking to analyze data is mostly not an option. Fortunately, the jump from writing code like a statistician...

Enterprise Data Workflows with Cascading

By Paco Nathan | September 17, 2013
In this hands-on webcast, Paco Nathan, author of Enterprise Data Workflows with Cascading, will discuss what defines a workflow, in contrast to notions of dataflow, and the impact that has on the tools required.

Anonymizing Health Data

By Luk Arbuckle, Khaled El Emam | September 13, 2013
In this webcast we'll start with a discussion of the relatively simple de-identification of a cross-sectional disease registry, and then we'll jump into more complex situations like the de-identification of longitudinal data, free-form text, and geospatial...

Why Facebook and Google Missed the Boat on Healthcare

By Sreedhar Potarazu | September 04, 2013
In this webcast talk, Dr. Sreedhar Potarazu (Dr. P), Fox News contributor, acclaimed author, and nationally recognized expert on big data and healthcare, tells the never-told story of the next big thing in healthcare and the lessons learned from Silicon...

Community Clouds for Cancer Genomics: Lessons Learned from Bionimbus

By Robert Grossman | August 20, 2013
Join us for a webcast talk by Robert Grossman, who shares how his organization recently expanded Bionimbus so that researchers can analyze data from controlled datasets, such as The Cancer Genome Atlas (TCGA), in a secure and compliant fashion.

Best of Strata Rx 2012: HIE 2.0 - The Future of Health Information Exchange

By John Kansky | August 09, 2013
This is an exclusive session presented by John Kansky recorded live at Strata Rx 2012.

Best of Strata + Hadoop World 2012: Analyzing Millions of GitHub Commits

By Ilya Grigorik | August 08, 2013
In this session, we will discuss our experience in using BigQuery, how we modeled the GitHub event data, and the lessons learned in importing and making the data available.

Best of Strata Rx 2012: Doing Big Data All By Yourself

By Lauren Chaparro, Ari Gesher | July 30, 2013
In this presentation, we will show a working system that bridges the gap between data analysis and decision making using a carefully composed set of big-data technologies mated with an interactive, high-level interface.

Best of Strata Rx 2012: Reasons why health data is poorly integrated today and what we can do about it

By Shahid Shah | July 03, 2013
In this talk Shahid N. Shah will look at the specific things that are holding us back when it comes to poor integration in healthcare and what future EHRs can do about it.

The Best of Strata Santa Clara 2013: Data is Not a Business Model

By Jen van der Meer | July 02, 2013
This talk will help anyone who is tasked with determining how to get more business action out of data.

Best of Strata Rx 2012: Disruptors: What Healthcare Will Look Like In 2020

By John Mattison, Tim O'Reilly, DJ Patil, Benjamin West | June 28, 2013
This is an exclusive panel discussion with Tim O'Reilly, DJ Patil, John Mattison, and Benjamin West recorded live from Strata Rx 2012.

How We Build Data Mining Teams at Yelp

By Jim Blomo | June 18, 2013
Starting and growing a data science team doesn't have to be a risky proposition. By balancing long-term strategy and technology goals with immediate business demands, your data science team can quickly become productive and enjoy sustained growth.

Best of Strata + Hadoop World: Moving to Big Data

By Sheridan Hitchens | June 11, 2013
Join us for an exclusive presentation by Sheridan Hitchens recorded live from Strata + Hadoop World 2012.

Strata Online Conference: Mobility, Data, and Analytics

By Jon Bruner | June 05, 2013
In this Strata Online event, we'll look at some of the ways the rise of the always-on world is feeding the hungry engines of Big Data.

Data Visualization - The Value of Process

By Scott Murray | March 20, 2013
This webcast talk, presented by Scott Murray, author of Interactive Data Visualization for the Web, will introduce ideas from conceptual art, connecting them to the daily challenges faced by data visualizers working with code.

Introduction to Data Visualization with R and ggplot2

By Winston Chang | March 06, 2013
In this webcast presented by Winston Chang, author of R Graphics Cookbook, you'll learn the basics of how to create data graphics using R and the popular ggplot2 package.

Deep Learning - The Biggest Data Science Breakthrough of the Decade

By Jeremy Howard | March 05, 2013
In this webcast talk Jeremy Howard, Kaggle's president and chief scientist, will explain exactly what occurred, why it was front-page newsworthy for the New York Times, how it will impact business, and what you need to know to make these new algorithms...

Community Detection in Social Media Data

By Maksim Tsvetovat | March 05, 2013
In this webcast talk, Maksim Tsvetovat, author of Social Network Analysis for Startups, will introduce a number of ways to address these issues and present an open-source Python-based toolkit for detecting and visualizing communities in Twitter networks...

Building Rich, High Performance Tools for Practical Data Analysis

By Wes McKinney | February 20, 2013
This live webcast is presented by Wes McKinney, author of Python for Data Analysis, and will be a somewhat advanced, technical talk connecting computer science concepts like data structure design and algorithms with the details of building intuitive, high...

Strata Online Conference: Strata Santa Clara 2013 Preview

By Alistair Croll | February 15, 2013
In this free online conference, we'll be showcasing some of the hot topics and thought-provoking speakers who will be joining us for the event.

Designing for Data-driven Organizations

By Bitsy Bentley | February 14, 2013
Businesses have access to more data than ever before, but figuring out how to leverage that data to drive action can be a daunting task, especially for larger organizations.

Engaging Audiences with Data Visualization

By Scott Murray | February 13, 2013
Join us for a hands-on webcast presented by Scott Murray, author of Interactive Data Visualization for the Web, as he guides you through the framework of three avenues of engagement: aesthetic, narrative, and interactive.

View Updating: How to Make it Work

By C.J. Date | January 30, 2013
In this webcast presentation, the overall message is: Views in general are just as updatable as base tables are! Attend this webcast and see why this isn't as extravagant a claim as it might seem.

Strata Online Conference: Data Warfare

By Alistair Croll | January 22, 2013
From public policy to elections, from healthcare to the battlefield, our lives rely on the analysis of abundant, connected data. But if data is infrastructure, then that infrastructure's vulnerable. Enemies can confound, confuse, distort, and mislead...

What Business People Need to Know About Data Governance

By Micheline Casey | January 15, 2013
In this webcast, Micheline Casey provides an overview of data governance and data management principles that should be applied to big data projects.

How EMI is Changing the Culture of the Music Industry

By David Boyle | January 08, 2013
In this exclusive webcast, David Boyle will look at how EMI changed itself, and the music industry, by moving from gut instinct and opinions to a data-informed business.

Responsibly Sharing Data Under HIPAA

By Khaled El Emam | October 31, 2012
In this webcast presentation we will first provide an overview of how data can be re-identified, with reference to a number of recent real world examples. This will be followed by a description of how to de-identify health data in a defensible way according...

Bayesian Statistics Made Simple

By Allen B. Downey | October 26, 2012
Join Allen Downey, author of Think Stats: Probability and Statistics for Programmers, for an introduction to Bayesian statistics using Python. Bayesian statistical methods are becoming more common and more important, but there are not many resources to...

Understanding the Value of Lean Analytics: Using Data to Build a Better Startup Faster

By Benjamin Yoskovitz | October 25, 2012
The Lean movement has revolutionized how we create products and companies today. It focuses on customer development and tackling the risky parts first. At the core of this is iteration—a cycle of learning and adapting that's driven by data. Lean...

How to Develop Language Annotations for Machine Learning Algorithms

By James Pustejovsky, Amber Stubbs | October 16, 2012
Text-based data mining and information extraction systems that make use of machine learning techniques require annotated datasets for training the algorithms. In this webcast we will discuss the steps involved in creating your own training corpus for...

Healthcare 101: Cradle to Grave

By J. Tod Fetherling | October 12, 2012
J. Tod Fetherling presents this 90-minute whiteboard session, walking the user through every aspect of the healthcare system from wellness to death.

Python for Data Analysis

By Wes McKinney | October 10, 2012
In this hands-on webcast, Wes McKinney, author of Python for Data Analysis, will showcase a number of examples, and you will receive an introduction to some of the most important tools in the Python language for data preparation, data ...

Strata Rx Online Conference: Personalized Medicine

By Julie Steele | October 05, 2012
In this free online conference we will discuss how Microsoft Research has developed a new version of the Linear Mixed Model algorithm that is not only computationally inexpensive, but also is better at finding the true signals that account statistically...

Strata + Hadoop World: Join the Data Revolution

By Alistair Croll | October 03, 2012
In this free online conference, we preview some of the hot topics, provocative speakers, and game-changing innovations that are fueling the growth of a data-driven society.

An Introduction to Machine Learning for Hackers

By John Myles White, Drew Conway | September 18, 2012
We'll introduce programmers to two of the most common tools in the machine learning toolkit: linear regression and logistic regression.

Choosing Hardware for Hadoop

By Lars George | August 15, 2012
In this webcast we will look at popular reference architectures used by companies across several business verticals, discuss their pros and cons and their applicability to different use cases, and conclude with best-practice advice on hardware selection...

Data in Motion

By Alistair Croll, Kaitlin Thaney, Simon Williams, Jacomo Corbo, Neal Lathia, John Graham-Cumming | July 24, 2012
In this Strata Online Conference, we'll look at data and movement across a variety of sports and industries.

Get the (Data) Vote Out

By Edd Wilder-James | June 20, 2012
In this Strata Online Event, we'll look at the way data science is shaping elections, from visualizations to game theory, from understanding issues to targeting voters.

MongoDB and PHP

By Steve Francia | May 18, 2012
In this webcast presentation by Steve Francia, author of MongoDB and PHP, you will learn how to build elegant database applications with MongoDB and PHP.

O'Reilly Strata Online Conference

By Alistair Croll | May 16, 2012
Join us for our seventh Strata online conference, as we look at Data That Matters.

Tim O'Reilly and Dave Campbell Explore How to Accelerate Insights from Data

By Tim O'Reilly, David Campbell | May 14, 2012
Tim O'Reilly, founder and CEO of O'Reilly Media, talks with Microsoft Technical Fellow Dave Campbell about new tools for data.

Current and Upcoming Work in Pig

By Alan Gates | May 10, 2012
In this webcast, we will cover how Pig can take advantage of changes in Hadoop 0.23.

Under the iceberg: Using APIs to transform your business

By Gregory Brail, Daniel Jacobson, Dan Woods | March 22, 2012
In this webcast presentation, join Dan Jacobson, Greg Brail, and Dan Woods as they discuss how business leaders can use private and public APIs as a strategy to transform their business.

MongoDB Schema Design: How to Think Non-Relational

By Jared Rosoff | February 17, 2012
In this webcast we'll provide a number of data modeling rules of thumb, and discuss the tradeoffs of various data modeling strategies.

An Introduction to Ethics of Big Data

By Kord Davis | February 16, 2012
The material will address the intersection of ethics and Big Data: what it is, what it isn't, and specifically how to approach and generate dialog about an abstract subject with direct, real-world implications.

Take Control of iCloud

By Joe Kissell | February 03, 2012
In this webcast, veteran Mac author Joe Kissell explains what iCloud can do for you, how to deal with configuration puzzles and compatibility issues, and how best to manage the transition from MobileMe.

Developing with .NET and Couchbase Server

By John Zablocki | January 27, 2012
In this webcast John Zablocki, Developer Advocate at Couchbase, will introduce the .NET client library for Couchbase Server.

O'Reilly Strata Online Conference

By Alistair Croll | December 07, 2011
In this online event, we'll look at how Big Data stacks and analytical approaches are gradually finding their way into organizations, as well as the roadblocks that can thwart efforts to become more data-driven.

Social Network Analysis -- Finding communities and influencers

By Maksim Tsvetovat | December 06, 2011
A follow-on to Analyzing Social Networks on Twitter, this webcast will concentrate on the social component of Twitter data rather than the questions of data gathering and decomposition.

HBase Coprocessors - Deploy shared functionality directly on the cluster

By Lars George | November 04, 2011
This session explains the concepts behind coprocessors and uses examples to show how they can be used to implement data side extensions to the application code.

The Evolution from Private to Public: Is There Privacy in the Digital Age?

By Jim Adler, danah boyd, Terence Craig, Natalie Fonseca, Heather West | October 28, 2011
Join the panelists as they consider the evolution from private to public: how are our worlds colliding in the digital age?

HBase Schema Design - Things you need to know

By Lars George | October 14, 2011
This session discusses the basic underlying concepts of the storage layer in HBase and how an application should be combined with the appropriate schema to achieve the best possible performance.

There's Only One Test

By Allen B. Downey | October 04, 2011
People working with real data are often confused about hypothesis testing and paralyzed by the number of tests and their requirements. In this webcast, Allen B. Downey, author of Think Stats, presents a framework for using simple simulations to estimate...

Privacy and Big Data: Is there room for privacy in the age of big data?

By Terence Craig, Mary Ludloff | September 14, 2011
In this webcast, Terence Craig and Mary Ludloff, authors of Privacy and Big Data, ask and answer this question: What level of privacy do you really have in the age of big data?

Designing Data Visualizations

By Julie Steele, Noah Iliinsky | September 06, 2011
This webcast will discuss data visualization. Learn linear processes and best practices so that your message may be transmitted without interference.

People, Data and Dollars — A Preview of Strata NYC

By Edd Wilder-James, Kathryn Dekas, Michael Hugos, Michael Nelson, Hjalmar Gislason, Bill Schmarzo | August 31, 2011
In this special online event, you'll get an inside look at some of the world's leading thinkers and innovators in the fields of business, data, and disruption.

Building Access Applications with SQL Server Databases

| August 09, 2011
In this session we will demonstrate constructing DSNs, linking tables and views, and using stored procedures and views in pass-through queries. This will include a discussion of the benefits of using SQL Server Schemas and Synonyms.

Securing Your Files and Data in Windows

By Mike Halsey | August 04, 2011
In this webcast, Mike Halsey, MVP and author of Troubleshooting Windows 7 Inside Out, will talk you through how to keep your files and data safe from even the worst disaster.

Couchbase: Find Out What the Merger of CouchOne and Membase Means for Users

By J. Chris Anderson, Dustin Sallings | April 19, 2011
In this webinar we'll introduce you to the Membase caching and clustering architecture, and show how CouchDB is a drop-in fit as the storage and query engine.

How Sharding Works

By Kristina Chodorow | February 04, 2011
This talk is a combination of whitepaper and Magic School Bus tour of how MongoDB scales across multiple machines. For applications that outgrow the resources of a single database server, MongoDB can convert to a sharded cluster, automatically managing...

How to Decrease the Pain in Building Distributed Systems

By Bradford Stephens | January 12, 2011
Building distributed systems is painful. Many organizations are approaching the point where their data and application infrastructures are being run on many servers (in the cloud or datacenter). Our software practices don't reflect that, often with disastrous...

CouchDB for .NET Developers

By Hadi Hariri | December 21, 2010
What does that mean to a .NET Developer? How do we store and retrieve data? How do we query it? If you've been interested in document databases but do not know where to start, then this is definitely the webcast for you. We'll see what CouchDB is about...

Hadoop - Tips, Tricks, Optimizations, and Pitfalls

By Ken Goodhope | November 23, 2010
We'll use real world examples in this webcast that demonstrate how to best utilize MapReduce with Hadoop. We'll also examine the appropriate uses of special partitioners, combiners, and configuration optimizations. We'll expose some common mistakes and...

PHP and CouchDB

By Benjamin Young | November 17, 2010
This talk will cover the basics of the CouchDB HTTP API and how to use it from PHP with and without helper libraries. We'll discuss some architecture approaches and briefly look at things to avoid when moving from an RDBMS to a Document Database such...

Probabilistic Data Structures and Breaking Down Big Sequence Data

By C. Titus Brown | November 10, 2010
Many data analysis problems are not easily parallelizable, often because the relevant analyses require an all-by-all analysis step. Applying heuristics often requires approximation, which introduces errors, noise, and bias. Recently, in confronting the...

Indexing Matters: A MongoDB Optimization Primer

By Kyle Banker | October 29, 2010
We all know that MongoDB is one of the most flexible and feature-rich databases available. In this session we'll discuss how you can leverage this feature set and maintain high performance with your project's massive data sets and high loads. We'll cover...

Scaling Out CouchDB with BigCouch

By Adam Kocoloski | October 22, 2010
This talk will cover the basics of BigCouch, including deploying and managing your first CouchDB cluster, as well as some advanced features like quorum reads/writes and design patterns for distributed CouchDB. Finally, for the Erlang hackers out there...

Using CouchDB on Android

By Aaron Miller | September 22, 2010
Why CouchDB on a phone is awesome, and what you can do with it; deploying existing CouchApps to Android CouchDB; using CouchDB in native Android apps.

Scaling with MongoDB

By Kristina Chodorow | September 17, 2010
MongoDB's architecture features built-in support for horizontal scalability, and high availability through replica sets. Auto-sharding allows users to easily distribute data across many nodes. Replica sets enable automatic failover and recovery of database...

The State of Hadoop

By Tom White | September 15, 2010
Apache Hadoop is a part of a growing ecosystem of projects for large-scale data analysis which is being used to solve problems for organizations in a wide range of disciplines. This talk will touch on what's new in the second edition of Hadoop: The Definitive...

Asynchronous architectures with the CouchDB _changes feed

By Jan Lehnardt | August 25, 2010
Learn how to build robust web services using CouchDB's built-in facility for near-realtime updates. We'll explore a few patterns _changes can be used for: Building custom external indexers like CouchDB-Lucene, Powering CouchDB's replication, Real-time...

MySQL Upgrades With No Downtime

By Sean Hull | July 27, 2010
In this webcast we'll discuss a two-node MySQL multi-master replication setup. We'll take the audience step by step through the process, and then use MMM (MySQL Multi-master Manager) to manage and automate the process, exposing a virtual IP address...

Flexible Scaling with CouchDB Replication / Or how I learned to stop worrying and love Eventual Consistency

By J. Chris Anderson | July 14, 2010
CouchDB is known for having a flexible schemaless JSON storage API. But that is just the tip of the iceberg when it comes to flexibility. In this webcast we'll learn how replication can be used to share data securely, build offline-capable applications...

What's new in CouchDB 0.11 & 1.0

By Jan Lehnardt | June 22, 2010
This webcast highlights new features and refinements in the latest and upcoming releases of CouchDB. It rehashes old solutions to problems that are now way easier to solve. We look at how the new features help make your life and development work easier...

CouchApp Evently Guided Hack w/ CouchDB

By J. Chris Anderson | May 20, 2010
Learn to hack jQuery CouchApps -- p2p web applications that can be deployed anywhere there's a CouchDB. Apache CouchDB can host HTML5 apps natively, serving them over HTTP. Learn how to write JavaScript CouchApps which run on both the client and ...

Introduction to Apache CouchDB

By J. Chris Anderson | April 21, 2010
CouchDB is a distributed document database accessed via HTTP and JSON and queried using JavaScript Map Reduce. CouchDB focuses on simplicity and reliability, with a data replication model that makes it well suited for mobile and offline applications...

DRBD and MySQL - An HA Match Made In Heaven

By Sean Hull | January 19, 2010
DRBD has grown in popularity as an excellent low-cost high availability solution for MySQL. It provides synchronous replication of your data without MySQL having to worry too much about the details. Combine it with Linux Heartbeat, and you have automatic...

Two Big Data Analysis Tricks for Everyone

By Michael Milton | October 28, 2009
Data analysis skills are critical to staying competitive in the 21st century economy. In this webcast the author of Head First Data Analysis, Michael Milton, provides some useful tips for common data problems that everyone faces.

Hands-on: Step-by-step MySQL Clustering Setup

By Sean Hull | August 04, 2009
MySQL's Clustering solution provides some pretty sophisticated functionality. In this webcast we'll take you through getting it up and running on your laptop or single node server, building a sandbox where you can play with the dials and levers and get...

MySQL Replication: Audit, Test, & Verify

By Sean Hull | January 22, 2009
In this live online event, Sean Hull (Oracle and Open Source) will talk about why MySQL slaves get out of sync with the master, both in terms of things that happen in the application and in MySQL's implementation of statement-based replication. He'll...