Highlights and use cases from companies that are building the technologies needed to sustain their use of analytics and machine learning.
Data science ideas and resources.
We need to do more than automate model building with autoML; we need to automate tasks at every stage of the data pipeline.
Getting DataOps right is crucial to your late-stage big data projects.
The deployment of big data tools is being held back by the lack of standards in a number of growth areas.
These studies provide a foundation for discussing ethical issues so we can better integrate data ethics in real life.
While models and algorithms garner most of the media coverage, this is a great time to be thinking about building data tools.
The importance of testing your tools, using multiple tools, and seeking consistency across various interpretability techniques.
Considerations based on experience with Fortune 500 clients.
Privacy-preserving analytics is not only possible, but with GDPR about to come online, it will become necessary to incorporate privacy in your data products.
Learn how Spark 2.3.0+ integrates with Kubernetes (K8s) clusters on Google Cloud and Azure.
Strata Data London will introduce technologies and techniques, showcase use cases, and highlight the importance of ethics, privacy, and security.
In an era where fake news travels faster than the truth, our communities are at a critical juncture.
A deep dive into model interpretation as a theoretical concept and a high-level overview of Skater.
Comcast’s system of storing schemas and metadata enables data scientists to find, understand, and join data of interest.
How to find promising candidates for upskilling within your organization.
A comparison of the accuracy and performance of Spark-NLP vs. spaCy, and some use case recommendations.
A step-by-step guide to building and running a natural language processing pipeline.
A step-by-step guide to initializing the libraries, loading the data, and training a tokenizer model using Spark-NLP and spaCy.
A look at the new streaming SQL engine for Apache Kafka.
Ingest the data you need in an agile manner.
A glimpse into what lies ahead for response automation, model compliance, and repeatable experiments.
A look at the rise of the deep learning library PyTorch and simultaneous advancements in recommender systems.
Without the proper cataloging, curation, and security that self-service data platforms allow, companies are left vulnerable to cybersecurity threats and misinformation.
O’Reilly Media Podcast: David Hsieh, of Qubole, in conversation with John Slocum, of MediaMath.