Oaths have their value, but checklists will help put principles into practice.
An overview of the challenges MLflow tackles and a primer on how to get started.
Get a basic overview of data engineering and then go deeper with recommended resources.
Data scientists, data engineers, AI and ML developers, and other data professionals need to live ethical values, not just talk about them.
The O’Reilly Data Show Podcast: Aurélie Pols on GDPR, ethics, and ePrivacy.
The importance of testing your tools, using multiple tools, and seeking consistency across various interpretability techniques.
The O’Reilly Data Show Podcast: Andrew Burt and Steven Touw on how companies can manage models they cannot fully explain.
Taking blockchain technology private for the enterprise.
The O’Reilly Data Show Podcast: Ashok Srivastava on the emergence of machine learning and AI for enterprise applications.
Considerations based on experience with Fortune 500 clients.
Why model development does not equal software development.
Martha Lane Fox considers the unintended consequences of technology.
Zubin Siganporia explains how the KISS principle (“Keep It Simple, Stupid”) applies to solving problems and convincing end-users to adopt data-driven solutions to their challenges.
Christine Foster discusses how today’s academic papers turn into tomorrow’s data science.
Having worked in both research and industry, Mikio Braun shares insights into what's the same, what's different, and how deep learning might change the game.
Louise Beaumont explores the five characteristics of companies that choose to succeed.
One of our goals is to bring Jupyter’s enterprise use cases and practices into one place.
The O’Reilly Data Show Podcast: A special episode to mark the 100th episode.
Eva Kaili outlines the fundamentals of GDPR and applications of blockchain.
May 25 is an important day for data protection in the EU and elsewhere. Alison Howard explains how Microsoft has prepared for May 25 and beyond.
Mick Hollison, Sven Löffler, and Robert Neumann explain how Deutsche Telekom is harnessing machine learning and analytics in the cloud to build Europe’s largest IoT data marketplace.
Jean-François Puget explains why human context should be embraced as a guide to building better and smarter systems.
Watch highlights covering machine learning, GDPR, data protection, and more. From the Strata Data Conference in London 2018.
Ben Lorica looks at the problems we’re facing as we collect and store data, particularly when our machine learning models require huge amounts of labeled data.
Pierre Romera explores the challenges in making 1.4 TB of data securely available to journalists all over the world.
Answers to the three most commonly asked questions about maintaining GDPR-compliant machine learning programs.
The O’Reilly Data Show Podcast: Jason Dai on the first year of BigDL and AI in China.
Privacy-preserving analytics is not only possible; with GDPR about to come online, incorporating privacy into your data products will become necessary.
The O’Reilly Data Show Podcast: Jerry Overton on organizing data teams, agile experimentation, and the importance of ethics in data science.
Both reproducible science and open source are necessary for collaboration at scale—the nexus for that intermingling is Jupyter.
Learn how Spark 2.3.0+ integrates with K8s clusters on Google Cloud and Azure.
A post-mortem of a failed analytics startup.
Discover how data-driven organizations are using Jupyter to analyze data, share insights, and foster practices for dynamic, reproducible data science.
The O’Reilly Data Show Podcast: Guillaume Chaslot on bias and extremism in content recommendations.
The two positions are not interchangeable—and misperceptions of their roles can hurt teams and compromise productivity.
In an era where fake news travels faster than the truth, our communities are at a critical juncture.
Strata Data London will introduce technologies and techniques; showcase use cases; and highlight the importance of ethics, privacy, and security.
The O’Reilly Data Show Podcast: Jesse Anderson and Paco Nathan on organizing data teams and next-generation messaging with Apache Pulsar.
A deep dive into model interpretation as a theoretical concept and a high-level overview of Skater.
Comcast’s system of storing schemas and metadata enables data scientists to find, understand, and join data of interest.
The O’Reilly Data Show Podcast: Ameet Talwalkar on large-scale machine learning.
Eric Colson explains why companies must now think very differently about the role and placement of data science in organizations.
Seth Stephens-Davidowitz explains how to use Google searches to uncover behaviors or attitudes that may be hidden from traditional surveys.
Ajey Gore explains why GO-JEK is focusing its attention beyond urban Indonesia to help people across the country’s rural areas.
William Vambenepe walks through an interesting use case of machine learning in action and discusses the central role AI will play in big data analysis moving forward.
Using silly data sets as examples, Janelle Shane talks about ways that algorithms fail.
Anoop Dawar shares principles successful companies are using to inspire an insight-driven ethos and build data-competent organizations.
How to find promising candidates for upskilling within your organization.
Nancy Lublin and Bob Filbin explore findings from crisis data.
Alex Smola shares lessons learned from AWS SageMaker, an integrated framework for handling all stages of analysis.
Tobias Ternstrom explains why you should objectively evaluate the problem you're trying to solve before choosing the tool to fix it.
Li Fan shows how Pinterest is using AI to predict what’s in an image, what a user wants, and what they’ll want next.
Watch highlights covering machine learning, business intelligence, data privacy, and more. From the Strata Data Conference in San Jose 2018.
Ben Lorica explores emerging security best practices for business intelligence, machine learning, and mobile computing products.
Natalie Evans Harris discusses the Community Principles on Ethical Data Practices (CPEDP), a code of ethics for data collection, sharing, and utilization.
Dinesh Nirmal explains how real-world machine learning reveals assumptions embedded in business processes that cause expensive misunderstandings.
The O’Reilly Data Show Podcast: Ofer Ronen on the current state of chatbots.
A product manager's guide to employing data as a feature.
The O’Reilly Data Show Podcast: Danny Lange on how reinforcement learning can accelerate software development and how it can be democratized.
A comparison of the accuracy and performance of Spark-NLP vs. spaCy, and some use case recommendations.