
Hands-On Unsupervised Learning Using Python

by Ankur A. Patel

March 2019

Intermediate to advanced

359 pages

8h 46m

English

O'Reilly Media, Inc.


A Brief History of Machine Learning; AI Is Back, but Why Now?; The Emergence of Applied AI; Major Milestones in Applied AI over the Past 20 Years; From Narrow AI to AGI; Objective and Approach; Prerequisites; Roadmap; Conventions Used in This Book; Using Code Examples; O’Reilly Online Learning; How to Contact Us; Acknowledgments
Basic Machine Learning Terminology; Rules-Based vs. Machine Learning; Supervised vs. Unsupervised; The Strengths and Weaknesses of Supervised Learning; The Strengths and Weaknesses of Unsupervised Learning; Using Unsupervised Learning to Improve Machine Learning Solutions; A Closer Look at Supervised Algorithms; Linear Methods; Neighborhood-Based Methods; Tree-Based Methods; Support Vector Machines; Neural Networks; A Closer Look at Unsupervised Algorithms; Dimensionality Reduction; Clustering; Feature Extraction; Unsupervised Deep Learning; Sequential Data Problems Using Unsupervised Learning; Reinforcement Learning Using Unsupervised Learning; Semisupervised Learning; Successful Applications of Unsupervised Learning; Anomaly Detection; Conclusion
Environment Setup; Version Control: Git; Clone the Hands-On Unsupervised Learning Git Repository; Scientific Libraries: Anaconda Distribution of Python; Neural Networks: TensorFlow and Keras; Gradient Boosting, Version One: XGBoost; Gradient Boosting, Version Two: LightGBM; Clustering Algorithms; Interactive Computing Environment: Jupyter Notebook; Overview of the Data; Data Preparation; Data Acquisition; Data Exploration; Generate Feature Matrix and Labels Array; Feature Engineering and Feature Selection; Data Visualization; Model Preparation; Split into Training and Test Sets; Select Cost Function; Create k-Fold Cross-Validation Sets; Machine Learning Models (Part I); Model #1: Logistic Regression; Evaluation Metrics; Confusion Matrix; Precision-Recall Curve; Receiver Operating Characteristic; Machine Learning Models (Part II); Model #2: Random Forests; Model #3: Gradient Boosting Machine (XGBoost); Model #4: Gradient Boosting Machine (LightGBM); Evaluation of the Four Models Using the Test Set; Ensembles; Stacking; Final Model Selection; Production Pipeline; Conclusion
The Motivation for Dimensionality Reduction; The MNIST Digits Database; Dimensionality Reduction Algorithms; Linear Projection vs. Manifold Learning; Principal Component Analysis; PCA, the Concept; PCA in Practice; Incremental PCA; Sparse PCA; Kernel PCA; Singular Value Decomposition; Random Projection; Gaussian Random Projection; Sparse Random Projection; Isomap; Multidimensional Scaling; Locally Linear Embedding; t-Distributed Stochastic Neighbor Embedding; Other Dimensionality Reduction Methods; Dictionary Learning; Independent Component Analysis; Conclusion
Credit Card Fraud Detection; Prepare the Data; Define Anomaly Score Function; Define Evaluation Metrics; Define Plotting Function; Normal PCA Anomaly Detection; PCA Components Equal Number of Original Dimensions; Search for the Optimal Number of Principal Components; Sparse PCA Anomaly Detection; Kernel PCA Anomaly Detection; Gaussian Random Projection Anomaly Detection; Sparse Random Projection Anomaly Detection; Nonlinear Anomaly Detection; Dictionary Learning Anomaly Detection; ICA Anomaly Detection; Fraud Detection on the Test Set; Normal PCA Anomaly Detection on the Test Set; ICA Anomaly Detection on the Test Set; Dictionary Learning Anomaly Detection on the Test Set; Conclusion
MNIST Digits Dataset; Data Preparation; Clustering Algorithms; k-Means; k-Means Inertia; Evaluating the Clustering Results; k-Means Accuracy; k-Means and the Number of Principal Components; k-Means on the Original Dataset; Hierarchical Clustering; Agglomerative Hierarchical Clustering; The Dendrogram; Evaluating the Clustering Results; DBSCAN; DBSCAN Algorithm; Applying DBSCAN to Our Dataset; HDBSCAN; Conclusion
Lending Club Data; Data Preparation; Transform String Format to Numerical Format; Impute Missing Values; Engineer Features; Select Final Set of Features and Perform Scaling; Designate Labels for Evaluation; Goodness of the Clusters; k-Means Application; Hierarchical Clustering Application; HDBSCAN Application; Conclusion

Neural Networks; TensorFlow; Keras; Autoencoder: The Encoder and the Decoder; Undercomplete Autoencoders; Overcomplete Autoencoders; Dense vs. Sparse Autoencoders; Denoising Autoencoder; Variational Autoencoder; Conclusion
Data Preparation; The Components of an Autoencoder; Activation Functions; Our First Autoencoder; Loss Function; Optimizer; Training the Model; Evaluating on the Test Set; Two-Layer Undercomplete Autoencoder with Linear Activation Function; Increasing the Number of Nodes; Adding More Hidden Layers; Nonlinear Autoencoder; Overcomplete Autoencoder with Linear Activation; Overcomplete Autoencoder with Linear Activation and Dropout; Sparse Overcomplete Autoencoder with Linear Activation; Sparse Overcomplete Autoencoder with Linear Activation and Dropout; Working with Noisy Datasets; Denoising Autoencoder; Two-Layer Denoising Undercomplete Autoencoder with Linear Activation; Two-Layer Denoising Overcomplete Autoencoder with Linear Activation; Two-Layer Denoising Overcomplete Autoencoder with ReLu Activation; Conclusion
Data Preparation; Supervised Model; Unsupervised Model; Semisupervised Model; The Power of Supervised and Unsupervised; Conclusion
Boltzmann Machines; Restricted Boltzmann Machines; Recommender Systems; Collaborative Filtering; The Netflix Prize; MovieLens Dataset; Data Preparation; Define the Cost Function: Mean Squared Error; Perform Baseline Experiments; Matrix Factorization; One Latent Factor; Three Latent Factors; Five Latent Factors; Collaborative Filtering Using RBMs; RBM Neural Network Architecture; Build the Components of the RBM Class; Train RBM Recommender System; Conclusion
Deep Belief Networks in Detail; MNIST Image Classification; Restricted Boltzmann Machines; Build the Components of the RBM Class; Generate Images Using the RBM Model; View the Intermediate Feature Detectors; Train the Three RBMs for the DBN; Examine Feature Detectors; View Generated Images; The Full DBN; How Training of a DBN Works; Train the DBN; How Unsupervised Learning Helps Supervised Learning; Generate Images to Build a Better Image Classifier; Image Classifier Using LightGBM; Supervised Only; Unsupervised and Supervised Solution; Conclusion
GANs, the Concept; The Power of GANs; Deep Convolutional GANs; Convolutional Neural Networks; DCGANs Revisited; Generator of the DCGAN; Discriminator of the DCGAN; Discriminator and Adversarial Models; DCGAN for the MNIST Dataset; MNIST DCGAN in Action; Synthetic Image Generation; Conclusion
ECG Data; Approach to Time Series Clustering; k-Shape; Time Series Clustering Using k-Shape on ECGFiveDays; Data Preparation; Training and Evaluation; Time Series Clustering Using k-Shape on ECG5000; Data Preparation; Training and Evaluation; Time Series Clustering Using k-Means on ECG5000; Time Series Clustering Using Hierarchical DBSCAN on ECG5000; Comparing the Time Series Clustering Algorithms; Full Run with k-Shape; Full Run with k-Means; Full Run with HDBSCAN; Comparing All Three Time Series Clustering Approaches; Conclusion
Supervised Learning; Unsupervised Learning; Scikit-Learn; TensorFlow and Keras; Reinforcement Learning; Most Promising Areas of Unsupervised Learning Today; The Future of Unsupervised Learning; Final Words

Content preview from Hands-On Unsupervised Learning Using Python

Preface

A Brief History of Machine Learning

Machine learning is a subfield of artificial intelligence (AI) in which computers learn from data, usually to improve their performance on some narrowly defined task, without being explicitly programmed. The term machine learning was coined as early as 1959 (by Arthur Samuel, a legend in the field of AI), but there were few major commercial successes in machine learning during the twentieth century. Instead, the field remained a niche research area for academics at universities.

Early on (in the 1960s) many in the AI community were too optimistic about its future. Researchers at the time, such as Herbert Simon and Marvin Minsky, claimed that AI would reach human-level intelligence within a matter of decades:¹

Machines will be capable, within twenty years, of doing any work a man can do.

Herbert Simon, 1965

In from three to eight years, we will have a machine with the general intelligence of an average human being.

Marvin Minsky, 1970

Blinded by their optimism, researchers focused on so-called strong AI, or artificial general intelligence (AGI), projects, attempting to build AI agents capable of problem solving, knowledge representation, learning and planning, natural language processing, perception, and motor control. This optimism helped attract significant funding into the nascent field from major players such as the Department of Defense, but the problems these researchers tackled were too ambitious and ultimately doomed to fail. ...