Answers to the three most commonly asked questions about maintaining GDPR-compliant machine learning programs.
Privacy-preserving analytics is not only possible; with GDPR about to come online, incorporating privacy into your data products will become necessary.
Learn how Spark 2.3.0+ integrates with Kubernetes clusters on Google Cloud and Azure.
The two positions are not interchangeable—and misperceptions of their roles can hurt teams and compromise productivity.
In an era where fake news travels faster than the truth, our communities are at a critical juncture.
Strata Data London will introduce technologies and techniques; showcase use cases; and highlight the importance of ethics, privacy, and security.
A deep dive into model interpretation as a theoretical concept and a high-level overview of Skater.
Comcast’s system of storing schemas and metadata enables data scientists to find, understand, and join data of interest.
How to find promising candidates for upskilling within your organization.
A comparison of the accuracy and performance of Spark-NLP vs. spaCy, and some use case recommendations.
A step-by-step guide to building and running a natural language processing pipeline.
A step-by-step guide to initializing the libraries, loading the data, and training a tokenizer model using Spark-NLP and spaCy.
A look at the new streaming SQL engine for Apache Kafka.
Ingest the data you need in an agile manner.
A glimpse into what lies ahead for response automation, model compliance, and repeatable experiments.
Decoding simple regex features to match complex text patterns.
A look at the rise of the deep learning library PyTorch and simultaneous advancements in recommender systems.
Without the proper cataloging, curation, and security that self-service data platforms allow, companies are left vulnerable to cybersecurity threats and misinformation.
O’Reilly Media Podcast: David Hsieh, of Qubole, in conversation with John Slocum, of MediaMath.
A survey of usage, access methods, projects, and skills.
Drawing parallels and distinctions around neural networks, data sets, and hardware.
Analyzing tweets and posts around Trump, Russia, and the NFL using information entropy, network analysis, and community detection algorithms.
Reduce troubleshooting time from days to seconds.
The convergence of big data, artificial intelligence, and business intelligence.
Solving challenges of data analytics to make data accessible to all.
Fast data and virtualization are shifting the way telcos approach the IoT.
The right AI solution is the one that fits the skill set of the users and solves the highest-priority problems for the business.
To become a “machine learning company,” you need tools and processes to overcome challenges in data, engineering, and models.
The O'Reilly Podcast: Han Yang on the importance of investment, innovation, and improvisation.
Applying methods from Agile software development to data science projects.
Untangling data pipelines with a streaming platform.
Become more agile with business intelligence and data analytics.
How human-in-the-loop data analytics is accelerating the discovery of insights.
The O’Reilly Podcast: Achieving greater reliability and security when integrating data.
The O'Reilly Podcast: Gary Orenstein on developing a data infrastructure that enables the latest applications in machine learning and AI.
Utilizing GPU power to improve performance and agility.
A deep dive into Uber's engineering effort to optimize geospatial queries in Presto.
The O'Reilly Podcast: Dave Cassel on building a unified enterprise database to store and query any type of data.
Six lessons learned for getting a quick start on productivity.
A look at the Layer API, TFLearn, and Keras.
Building a production-grade real-time image classification system.
Applications of CNNs for real-time image classification in the enterprise.
Why machine learning needs real-time data infrastructure.
Recent trends in practical use and a discussion of key bottlenecks in supervised machine learning.
The toughest part of machine learning with Spark isn't what you think it is.
Human-guided ML pipelines for data unification and cleaning might be the only way to provide complete and trustworthy data sets for effective analytics.
Using a single cloud provider is a thing of the past.
Practical questions to help you make a decision.
Tamr’s Eliot Knudsen on algorithms that work alongside human experts.
A multi-model approach to transforming data from a liability to an asset.
A framework for moving from data to wisdom.
Authors Julia Silge and David Robinson discuss the power of tidy data principles, sentiment lexicons, and what they're up to at Stack Overflow.
Recapping winners of the Strata San Jose Startup Showcase.
Stewart Rogers on building and managing products with embedded analytics.
A new architecture for today’s data-rich modern applications.
Integrate and access any form of data using a multi-model database.
Exploring a reference architecture solution.
Overcome three types of debt to ship quality machine learning code.
A new role focused on creating data products and making data science work in production.
The O'Reilly Podcast: Ken Krupa on the challenge of data integration, and a solution.