Hands-On Unsupervised Learning Using Python

by Ankur A. Patel

March 2019

Intermediate to advanced

359 pages

8h 46m

English

O'Reilly Media, Inc.

Content preview from Hands-On Unsupervised Learning Using Python

Chapter 11. Feature Detection Using Deep Belief Networks

In Chapter 10, we explored restricted Boltzmann machines and used them to build a recommender system for movie ratings. In this chapter, we will stack RBMs together to build deep belief networks (DBNs). DBNs were first introduced by Geoff Hinton at the University of Toronto in 2006.

RBMs have just two layers, a visible layer and a hidden layer; in other words, RBMs are shallow neural networks. DBNs are made up of multiple stacked RBMs: the hidden layer of one RBM serves as the visible layer of the next RBM. Because they involve many layers, DBNs are deep neural networks. In fact, they are the first type of deep unsupervised neural network we have introduced in this book.
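To make the stacking idea concrete, here is a minimal NumPy sketch, not the implementation used in this book. The `TinyRBM` class, the layer sizes, and the random weights are all hypothetical and chosen only for illustration; nothing is trained. The point is purely structural: each RBM's hidden layer has the same width as the next RBM's visible layer, so activations can be passed up the stack.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class TinyRBM:
    """Hypothetical, untrained RBM holding only what is needed for an upward pass."""
    def __init__(self, n_visible, n_hidden, rng):
        self.n_visible = n_visible
        self.n_hidden = n_hidden
        # One weight matrix between the visible and hidden layers.
        self.W = rng.normal(0.0, 0.01, size=(n_visible, n_hidden))
        self.b_hidden = np.zeros(n_hidden)

    def hidden_probs(self, v):
        # Probability that each hidden unit is on, given the visible units.
        return sigmoid(v @ self.W + self.b_hidden)

rng = np.random.default_rng(0)

# Assumed layer widths: an input layer followed by three hidden layers.
layer_sizes = [784, 700, 600, 500]
rbms = [TinyRBM(n_in, n_out, rng)
        for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]

# Pass a batch of five random "images" up the stack.
activations = rng.random((5, layer_sizes[0]))
shapes = []
for rbm in rbms:
    # The hidden activations of this RBM become the visible input of the next.
    activations = rbm.hidden_probs(activations)
    shapes.append(activations.shape)

print(shapes)
```

Because the stack is built from consecutive pairs of `layer_sizes`, the hidden width of each RBM always matches the visible width of the one above it.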

Shallow unsupervised neural networks, such as RBMs, cannot capture structure in complex data such as images, sound, and text, but DBNs can. DBNs have been used to recognize and cluster images, video, sound, and text, although other deep learning methods have surpassed DBNs in performance over the past decade.

Deep Belief Networks in Detail

Like RBMs, DBNs can learn the underlying structure of input and probabilistically reconstruct it. In other words, DBNs, like RBMs, are generative models. And, as with RBMs, units in a DBN are connected only to units in adjacent layers, never to other units within the same layer.
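The following NumPy sketch illustrates what probabilistic reconstruction looks like for a single pair of layers. It is an illustration under stated assumptions rather than this book's code: the function name `reconstruct`, the layer sizes, and the random, untrained weights are all hypothetical. Note that the only weights are in `W`, which links the visible layer to the hidden layer; there is no matrix connecting units within the same layer.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def reconstruct(v, W, b_visible, b_hidden, rng):
    """One upward and one downward pass between a visible and a hidden layer."""
    # Upward pass: probability that each hidden unit turns on.
    p_h = sigmoid(v @ W + b_hidden)
    # Sample binary hidden states from those probabilities.
    h = (rng.random(p_h.shape) < p_h).astype(float)
    # Downward pass: reuse the same weights (transposed) to reconstruct the input.
    p_v = sigmoid(h @ W.T + b_visible)
    return h, p_v

rng = np.random.default_rng(42)
n_visible, n_hidden = 6, 3  # assumed toy sizes

W = rng.normal(0.0, 0.1, size=(n_visible, n_hidden))
b_visible = np.zeros(n_visible)
b_hidden = np.zeros(n_hidden)

# Four binary input vectors.
v = (rng.random((4, n_visible)) < 0.5).astype(float)
h, p_v = reconstruct(v, W, b_visible, b_hidden, rng)

print(h.shape, p_v.shape)
```

Since the hidden states are sampled, repeated calls with the same input can yield different reconstructions, which is what makes the model generative rather than a deterministic encoder.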

In the DBN, one layer is trained at a time, starting with the very first hidden layer, which, along with the input layer, makes up the first RBM. Once this first ...