A new role focused on creating data products and making data science work in production.
The O'Reilly Podcast: Ken Krupa on the challenge of data integration, and a solution.
Nothing says machine learning can't outperform humans, but it's important to realize perfect machine learning doesn't, and won't, exist.
Bas Geerdink details the technology stack for real-time account forecasting at ING, and outlines how Spark is used for outbound communications.
Access to critical data in real time enables workers to generate insights from large amounts of information.
Metadata is central to a modern data architecture.
A possible solution to the complexities that plague big data projects.
June Andrews talks about simple, cost-effective algorithmic computing at scale.
Kurt Brown discusses services in use, such as Genie, Metacat, Charlotte, and Microbots.
There’s money to be made in exhaust data (not just data exhaust).
Scientific use cases show promise, but challenges remain for complex data analytics.
Andra Keay discusses the five laws of robotics design.
Michael Jordan on developing a new platform to support real-time decision-making.
O'Reilly Podcast: Ian Fyfe of Zoomdata on the importance of “speed-of-thought analysis” in modern data environments.
Tips and tools for data janitors.
The present and future of data integration in the cloud.
Transform the way you approach analytics.
Mix-and-match approaches for visualizing data and interpreting machine learning models and results.
How we created an illustrated guide to help you find your way through the data landscape.
Flash flood prediction using machine learning has proven capable in the U.S. and Europe; we're now bringing it to East Africa.
An interview with Greg Meddles, technical lead for healthcare.gov.
The better prepared you are to utilize all the data in your data lake, the more likely you are to be successful.
Validating your data requires asking the right questions and using the right data.
A peek into the clickstream analysis and production pipeline for processing tens of millions of daily clicks, for thousands of articles.
What data scientists need to know about production—and what production should expect from their data scientists.
Best practices and scalable workflows for reproducible data science.
Putting deep learning into practice with new tools, frameworks, and future developments.
Drew Paroski and Gary Orenstein on the rapid spread of machine learning and predictive analytics.
How bots, threat intelligence, adversarial machine learning, and deep learning are impacting the security landscape.
Evaluating the state and development of Scala from a data engineering perspective.
The telecommunication industry’s unique position for new revenue opportunities in big data, IoT, and VR.
Telcos must regain value from over-the-top services and develop new sources of revenue by leveraging their data and infrastructure.
The O’Reilly Podcast: John Thuma on how businesses can get more than “what happened” from their data.
The O’Reilly Podcast: Bob Montemurro on planning data systems to match needs.
Technical and policy considerations in combatting algorithmic bias.
Learning to act based on long-term payoffs.
Rather than hiring data scientists from outside, consider training your proto data scientists.
It's important in this age of big data to return to the original meaning of serendipity and talk about it as a skill.
Deeper neural nets often yield harder optimization problems.
O'Reilly Podcast: Working with databases that go beyond traditional models.
A look at the data pipeline architecture for five key NERSC projects.
Close the time gap between analysis and action to bring about the next wave of improvements in efficiency and reliability—and magic.
This report explores how political data science helps to drive everything from overall strategy and messaging to individual voter contacts and advertising.
Why cross-channel analytics are crucial to empowering business teams with a behavioral view of your customer.
Start planning now to reap the many benefits of connected manufacturing.
The anatomy of an architecture to bring data science into production.
Analytic Ops—DevOps for data science—makes data analysis into a continually evolving process to meet business needs.
Rohit Jain takes an in-depth look at the possibilities and the challenges for companies that long for a single query engine to rule them all.
Systems with weak consistency guarantees can be expensive in unexpected ways.
How combining data and applying time-series techniques can provide insights into a company’s operational strengths and weaknesses.
Techniques to address overfitting, hyperparameter tuning, and model interpretability.
Sean Patrick Murphy describes how data science is helping electric utilities make sense of a stochastic world filled with increasing uncertainty, and reviews several cutting-edge tools for storing and processing big data.
Specialized technical tools are great, but sometimes a general contractor is the best approach.
Predictive-maintenance modeling requires a lot of work, but some can be automated.
With a focus on engineering and infrastructure, this O’Reilly report examines the tools and best practices that leading financial firms are using to migrate data to the cloud, build customer event hubs, and adhere to new rules for governance and security.
This report dives into the IoT industry through a series of illuminating talks and case studies presented at the 2015 Strata + Hadoop World conferences in San Jose, New York, and Singapore.
The difference between failure and success may be the difference between making analytics possible and making it straightforward.
How decoupling, optimization, and specialization resemble connective systems in our bodies.
How well prepared is your organization to innovate using data science? In this report, two leading data scientists at Booz Allen Hamilton describe 10 characteristics of a mature data science capability.
Daniele Quercia discusses mapping city scents, computational social science, and using sharing economy data to help shape city regulations.
Measure your model’s business impact, not just its accuracy.