Mastering Java Machine Learning

Book Description

Become an advanced practitioner with this progressive set of master classes on application-oriented machine learning

About This Book

  • Comprehensive coverage of key topics in machine learning with an emphasis on both the theoretical and practical aspects.
  • More than 15 open source Java tools covering a wide range of techniques, with code and practical usage.
  • More than 10 real-world case studies in machine learning highlighting techniques ranging from data ingestion up to analyzing the results of experiments, all preparing the user for the practical, real-world use of tools and data analysis.

Who This Book Is For

This book will appeal to anyone with a serious interest in data science and to those already working in related areas: ideally, intermediate-level data analysts and data scientists with experience in Java. Preferably, you have experience with the fundamentals of machine learning, want to explore the area further, are ready to grapple with the mathematical complexities of its algorithms, and wish to learn the ins and outs of practical machine learning.

What You Will Learn

  • Master key Java machine learning libraries, and what kind of problem each can solve, with theory and practical guidance.
  • Explore powerful techniques in each major category of machine learning such as classification, clustering, anomaly detection, graph modeling, and text mining.
  • Apply machine learning to real-world data with methodologies, processes, applications, and analysis.
  • Explore techniques and experiments developed around the latest specializations in machine learning, such as deep learning, stream data mining, and active and semi-supervised learning.
  • Build high-performing, real-time, adaptive predictive models for batch- and stream-based big data learning using the latest tools and methodologies.
  • Get a deeper understanding of technologies leading towards a more powerful AI applicable in domains such as security, financial crime, the Internet of Things, and social networking.

In Detail

Java is one of the main languages used by practicing data scientists; much of the Hadoop ecosystem is Java-based, and it is certainly the language that most production systems in Data Science are written in. If you know Java, Mastering Java Machine Learning is your next step on the path to becoming an advanced practitioner in Data Science.

This book aims to introduce you to an array of advanced techniques in machine learning, including classification, clustering, anomaly detection, stream learning, active learning, semi-supervised learning, probabilistic graph modeling, text mining, deep learning, and big data batch and stream machine learning. Accompanying each chapter are illustrative examples and real-world case studies that show how to apply the newly learned techniques using sound methodologies and the best Java-based tools available today.

On completing this book, you will have an understanding of the tools and techniques for building powerful machine learning models to solve data science problems in just about any domain.

Style and approach

A practical guide to exploring machine learning and an array of Java-based tools and frameworks through hands-on examples and real-world use cases.

Downloading the example code for this book: you can download the example code files for all Packt books you have purchased from your account at http://www.PacktPub.com. If you purchased this book elsewhere, you can visit http://www.PacktPub.com/support and register to receive the code files.

Table of Contents

  1. Mastering Java Machine Learning
    1. Table of Contents
    2. Mastering Java Machine Learning
    3. Credits
    4. Foreword
    5. About the Authors
    6. About the Reviewers
    7. www.PacktPub.com
      1. eBooks, discount offers, and more
        1. Why subscribe?
    8. Customer Feedback
    9. Preface
      1. What this book covers
      2. What you need for this book
      3. Who this book is for
      4. Conventions
      5. Reader feedback
      6. Customer support
        1. Errata
        2. Piracy
        3. Questions
    10. 1. Machine Learning Review
      1. Machine learning – history and definition
      2. What is not machine learning?
      3. Machine learning – concepts and terminology
      4. Machine learning – types and subtypes
      5. Datasets used in machine learning
      6. Machine learning applications
      7. Practical issues in machine learning
      8. Machine learning – roles and process
        1. Roles
        2. Process
      9. Machine learning – tools and datasets
        1. Datasets
      10. Summary
    11. 2. Practical Approach to Real-World Supervised Learning
      1. Formal description and notation
        1. Data quality analysis
        2. Descriptive data analysis
          1. Basic label analysis
          2. Basic feature analysis
        3. Visualization analysis
          1. Univariate feature analysis
            1. Categorical features
            2. Continuous features
          2. Multivariate feature analysis
      2. Data transformation and preprocessing
        1. Feature construction
        2. Handling missing values
        3. Outliers
        4. Discretization
        5. Data sampling
          1. Is sampling needed?
          2. Undersampling and oversampling
            1. Stratified sampling
        6. Training, validation, and test set
      3. Feature relevance analysis and dimensionality reduction
        1. Feature search techniques
        2. Feature evaluation techniques
          1. Filter approach
            1. Univariate feature selection
              1. Information theoretic approach
              2. Statistical approach
            2. Multivariate feature selection
              1. Minimal redundancy maximal relevance (mRMR)
              2. Correlation-based feature selection (CFS)
          2. Wrapper approach
          3. Embedded approach
      4. Model building
        1. Linear models
          1. Linear Regression
            1. Algorithm input and output
            2. How does it work?
            3. Advantages and limitations
          2. Naïve Bayes
            1. Algorithm input and output
            2. How does it work?
            3. Advantages and limitations
          3. Logistic Regression
            1. Algorithm input and output
            2. How does it work?
            3. Advantages and limitations
        2. Non-linear models
          1. Decision Trees
            1. Algorithm inputs and outputs
            2. How does it work?
            3. Advantages and limitations
          2. K-Nearest Neighbors (KNN)
            1. Algorithm inputs and outputs
            2. How does it work?
            3. Advantages and limitations
          3. Support vector machines (SVM)
            1. Algorithm inputs and outputs
            2. How does it work?
            3. Advantages and limitations
        3. Ensemble learning and meta learners
          1. Bootstrap aggregating or bagging
            1. Algorithm inputs and outputs
            2. How does it work?
              1. Random Forest
            3. Advantages and limitations
          2. Boosting
            1. Algorithm inputs and outputs
            2. How does it work?
            3. Advantages and limitations
      5. Model assessment, evaluation, and comparisons
        1. Model assessment
        2. Model evaluation metrics
          1. Confusion matrix and related metrics
          2. ROC and PRC curves
          3. Gain charts and lift curves
        3. Model comparisons
          1. Comparing two algorithms
            1. McNemar's Test
              1. Paired-t test
            2. Wilcoxon signed-rank test
          2. Comparing multiple algorithms
            1. ANOVA test
            2. Friedman's test
      6. Case Study – Horse Colic Classification
        1. Business problem
        2. Machine learning mapping
        3. Data analysis
          1. Label analysis
            1. Features analysis
        4. Supervised learning experiments
          1. Weka experiments
            1. Sample end-to-end process in Java
            2. Weka experimenter and model selection
          2. RapidMiner experiments
            1. Visualization analysis
            2. Feature selection
            3. Model process flow
            4. Model evaluation metrics
              1. Evaluation on Confusion Metrics
                1. ROC Curves, Lift Curves, and Gain Charts
        5. Results, observations, and analysis
      7. Summary
      8. References
    12. 3. Unsupervised Machine Learning Techniques
      1. Issues in common with supervised learning
      2. Issues specific to unsupervised learning
      3. Feature analysis and dimensionality reduction
        1. Notation
        2. Linear methods
          1. Principal component analysis (PCA)
            1. Inputs and outputs
            2. How does it work?
            3. Advantages and limitations
          2. Random projections (RP)
            1. Inputs and outputs
            2. How does it work?
            3. Advantages and limitations
          3. Multidimensional Scaling (MDS)
            1. Inputs and outputs
            2. How does it work?
            3. Advantages and limitations
        3. Nonlinear methods
          1. Kernel Principal Component Analysis (KPCA)
            1. Inputs and outputs
            2. How does it work?
            3. Advantages and limitations
          2. Manifold learning
            1. Inputs and outputs
            2. How does it work?
            3. Advantages and limitations
      4. Clustering
        1. Clustering algorithms
          1. k-Means
            1. Inputs and outputs
            2. How does it work?
            3. Advantages and limitations
          2. DBSCAN
            1. Inputs and outputs
            2. How does it work?
            3. Advantages and limitations
          3. Mean shift
            1. Inputs and outputs
            2. How does it work?
            3. Advantages and limitations
          4. Expectation maximization (EM) or Gaussian mixture modeling (GMM)
            1. Input and output
            2. How does it work?
            3. Advantages and limitations
          5. Hierarchical clustering
            1. Input and output
            2. How does it work?
            3. Advantages and limitations
          6. Self-organizing maps (SOM)
            1. Inputs and outputs
            2. How does it work?
            3. Advantages and limitations
        2. Spectral clustering
            1. Inputs and outputs
            2. How does it work?
            3. Advantages and limitations
        3. Affinity propagation
            1. Inputs and outputs
            2. How does it work?
            3. Advantages and limitations
        4. Clustering validation and evaluation
          1. Internal evaluation measures
            1. Notation
            2. R-Squared
            3. Dunn's Indices
            4. Davies-Bouldin index
              1. Silhouette's index
          2. External evaluation measures
            1. Rand index
            2. F-Measure
            3. Normalized mutual information index
      5. Outlier or anomaly detection
        1. Outlier algorithms
          1. Statistical-based
            1. Inputs and outputs
            2. How does it work?
            3. Advantages and limitations
          2. Distance-based methods
            1. Inputs and outputs
            2. How does it work?
            3. Advantages and limitations
          3. Density-based methods
            1. Inputs and outputs
            2. How does it work?
            3. Advantages and limitations
          4. Clustering-based methods
            1. Inputs and outputs
            2. How does it work?
            3. Advantages and limitations
          5. High-dimensional-based methods
            1. Inputs and outputs
            2. How does it work?
            3. Advantages and limitations
          6. One-class SVM
            1. Inputs and outputs
            2. How does it work?
            3. Advantages and limitations
        2. Outlier evaluation techniques
          1. Supervised evaluation
          2. Unsupervised evaluation
      6. Real-world case study
        1. Tools and software
        2. Business problem
        3. Machine learning mapping
        4. Data collection
        5. Data quality analysis
        6. Data sampling and transformation
        7. Feature analysis and dimensionality reduction
          1. PCA
          2. Random projections
          3. ISOMAP
          4. Observations on feature analysis and dimensionality reduction
        8. Clustering models, results, and evaluation
          1. Observations and clustering analysis
        9. Outlier models, results, and evaluation
          1. Observations and analysis
      7. Summary
      8. References
    13. 4. Semi-Supervised and Active Learning
      1. Semi-supervised learning
        1. Representation, notation, and assumptions
        2. Semi-supervised learning techniques
          1. Self-training SSL
            1. Inputs and outputs
            2. How does it work?
            3. Advantages and limitations
          2. Co-training SSL or multi-view SSL
            1. Inputs and outputs
            2. How does it work?
            3. Advantages and limitations
          3. Cluster and label SSL
            1. Inputs and outputs
            2. How does it work?
            3. Advantages and limitations
          4. Transductive graph label propagation
            1. Inputs and outputs
            2. How does it work?
            3. Advantages and limitations
          5. Transductive SVM (TSVM)
            1. Inputs and outputs
            2. How does it work?
            3. Advantages and limitations
        3. Case study in semi-supervised learning
          1. Tools and software
          2. Business problem
          3. Machine learning mapping
          4. Data collection
            1. Data quality analysis
          5. Data sampling and transformation
          6. Datasets and analysis
            1. Feature analysis results
          7. Experiments and results
            1. Analysis of semi-supervised learning
      2. Active learning
        1. Representation and notation
        2. Active learning scenarios
        3. Active learning approaches
          1. Uncertainty sampling
            1. How does it work?
              1. Least confident sampling
              2. Smallest margin sampling
              3. Label entropy sampling
            2. Advantages and limitations
        4. Version space sampling
          1. Query by disagreement (QBD)
            1. How does it work?
              1. Query by Committee (QBC)
            2. How does it work?
        5. Advantages and limitations
        6. Data distribution sampling
          1. How does it work?
            1. Expected model change
            2. Expected error reduction
              1. Variance reduction
              2. Density weighted methods
        7. Advantages and limitations
      3. Case study in active learning
        1. Tools and software
        2. Business problem
        3. Machine learning mapping
        4. Data Collection
        5. Data sampling and transformation
        6. Feature analysis and dimensionality reduction
        7. Models, results, and evaluation
          1. Pool-based scenarios
          2. Stream-based scenarios
        8. Analysis of active learning results
      4. Summary
      5. References
    14. 5. Real-Time Stream Machine Learning
      1. Assumptions and mathematical notations
      2. Basic stream processing and computational techniques
        1. Stream computations
        2. Sliding windows
        3. Sampling
      3. Concept drift and drift detection
        1. Data management
        2. Partial memory
          1. Full memory
          2. Detection methods
            1. Monitoring model evolution
              1. Widmer and Kubat
              2. Drift Detection Method or DDM
              3. Early Drift Detection Method or EDDM
            2. Monitoring distribution changes
              1. Welch's t test
                1. Kolmogorov-Smirnov's test
                2. CUSUM and Page-Hinckley test
          3. Adaptation methods
            1. Explicit adaptation
            2. Implicit adaptation
      4. Incremental supervised learning
        1. Modeling techniques
          1. Linear algorithms
            1. Online linear models with loss functions
              1. Inputs and outputs
              2. How does it work?
              3. Advantages and limitations
            2. Online Naïve Bayes
              1. Inputs and outputs
              2. How does it work?
              3. Advantages and limitations
          2. Non-linear algorithms
            1. Hoeffding trees or very fast decision trees (VFDT)
              1. Inputs and outputs
              2. How does it work?
              3. Advantages and limitations
          3. Ensemble algorithms
            1. Weighted majority algorithm
              1. Inputs and outputs
              2. How does it work?
              3. Advantages and limitations
            2. Online Bagging algorithm
              1. Inputs and outputs
              2. How does it work?
              3. Advantages and limitations
            3. Online Boosting algorithm
              1. Inputs and outputs
              2. How does it work?
              3. Advantages and limitations
        2. Validation, evaluation, and comparisons in online setting
          1. Model validation techniques
            1. Prequential evaluation
            2. Holdout evaluation
            3. Controlled permutations
            4. Evaluation criteria
            5. Comparing algorithms and metrics
      5. Incremental unsupervised learning using clustering
        1. Modeling techniques
          1. Partition based
            1. Online k-Means
              1. Inputs and outputs
              2. How does it work?
              3. Advantages and limitations
          2. Hierarchical based and micro clustering
            1. Inputs and outputs
            2. How does it work?
            3. Advantages and limitations
            4. Inputs and outputs
            5. How does it work?
            6. Advantages and limitations
          3. Density based
            1. Inputs and outputs
            2. How does it work?
            3. Advantages and limitations
          4. Grid based
            1. Inputs and outputs
            2. How does it work?
            3. Advantages and limitations
          5. Validation and evaluation techniques
            1. Key issues in stream cluster evaluation
            2. Evaluation measures
              1. Cluster Mapping Measures (CMM)
              2. V-Measure
              3. Other external measures
      6. Unsupervised learning using outlier detection
        1. Partition-based clustering for outlier detection
          1. Inputs and outputs
          2. How does it work?
          3. Advantages and limitations
        2. Distance-based clustering for outlier detection
          1. Inputs and outputs
          2. How does it work?
            1. Exact Storm
            2. Abstract-C
            3. Direct Update of Events (DUE)
            4. Micro Clustering based Algorithm (MCOD)
            5. Approx Storm
              1. Advantages and limitations
          3. Validation and evaluation techniques
      7. Case study in stream learning
        1. Tools and software
        2. Business problem
        3. Machine learning mapping
        4. Data collection
        5. Data sampling and transformation
          1. Feature analysis and dimensionality reduction
        6. Models, results, and evaluation
          1. Supervised learning experiments
            1. Concept drift experiments
          2. Clustering experiments
          3. Outlier detection experiments
        7. Analysis of stream learning results
      8. Summary
      9. References
    15. 6. Probabilistic Graph Modeling
      1. Probability revisited
        1. Concepts in probability
          1. Conditional probability
          2. Chain rule and Bayes' theorem
          3. Random variables, joint, and marginal distributions
          4. Marginal independence and conditional independence
          5. Factors
            1. Factor types
          6. Distribution queries
            1. Probabilistic queries
            2. MAP queries and marginal MAP queries
      2. Graph concepts
        1. Graph structure and properties
        2. Subgraphs and cliques
        3. Path, trail, and cycles
      3. Bayesian networks
        1. Representation
          1. Definition
          2. Reasoning patterns
            1. Causal or predictive reasoning
            2. Evidential or diagnostic reasoning
            3. Intercausal reasoning
            4. Combined reasoning
          3. Independencies, flow of influence, D-Separation, I-Map
            1. Flow of influence
            2. D-Separation
            3. I-Map
        2. Inference
          1. Elimination-based inference
            1. Variable elimination algorithm
              1. Input and output
              2. How does it work?
              3. Advantages and limitations
            2. Clique tree or junction tree algorithm
              1. Input and output
              2. How does it work?
              3. Advantages and limitations
          2. Propagation-based techniques
            1. Belief propagation
              1. Factor graph
              2. Messaging in factor graph
              3. Input and output
              4. How does it work?
              5. Advantages and limitations
          3. Sampling-based techniques
            1. Forward sampling with rejection
              1. Input and output
              2. How does it work?
              3. Advantages and limitations
        3. Learning
          1. Learning parameters
            1. Maximum likelihood estimation for Bayesian networks
            2. Bayesian parameter estimation for Bayesian network
              1. Prior and posterior using the Dirichlet distribution
          2. Learning structures
            1. Measures to evaluate structures
            2. Methods for learning structures
              1. Constraint-based techniques
                1. Inputs and outputs
                2. How does it work?
                3. Advantages and limitations
              2. Search and score-based techniques
                1. Inputs and outputs
                2. How does it work?
                3. Advantages and limitations
      4. Markov networks and conditional random fields
        1. Representation
          1. Parameterization
            1. Gibbs parameterization
            2. Factor graphs
            3. Log-linear models
          2. Independencies
            1. Global
            2. Pairwise Markov
              1. Markov blanket
        2. Inference
        3. Learning
        4. Conditional random fields
      5. Specialized networks
        1. Tree augmented network
          1. Input and output
          2. How does it work?
          3. Advantages and limitations
        2. Markov chains
          1. Hidden Markov models
          2. Most probable path in HMM
          3. Posterior decoding in HMM
      6. Tools and usage
        1. OpenMarkov
        2. Weka Bayesian Network GUI
      7. Case study
        1. Business problem
        2. Machine learning mapping
        3. Data sampling and transformation
        4. Feature analysis
        5. Models, results, and evaluation
        6. Analysis of results
      8. Summary
      9. References
    16. 7. Deep Learning
      1. Multi-layer feed-forward neural network
        1. Inputs, neurons, activation function, and mathematical notation
        2. Multi-layered neural network
          1. Structure and mathematical notations
          2. Activation functions in NN
            1. Sigmoid function
            2. Hyperbolic tangent ("tanh") function
          3. Training neural network
            1. Empirical risk minimization
              1. Parameter initialization
              2. Loss function
              3. Gradients
                1. Gradient at the output layer
                2. Gradient at the Hidden Layer
                3. Parameter gradient
              4. Feed forward and backpropagation
              5. How does it work?
              6. Regularization
                1. L2 regularization
                2. L1 regularization
      2. Limitations of neural networks
        1. Vanishing gradients, local optimum, and slow training
      3. Deep learning
        1. Building blocks for deep learning
          1. Rectified linear activation function
          2. Restricted Boltzmann Machines
            1. Definition and mathematical notation
            2. Conditional distribution
            3. Free energy in RBM
            4. Training the RBM
            5. Sampling in RBM
            6. Contrastive divergence
              1. Inputs and outputs
              2. How does it work?
            7. Persistent contrastive divergence
          3. Autoencoders
            1. Definition and mathematical notations
            2. Loss function
            3. Limitations of Autoencoders
            4. Denoising Autoencoder
          4. Unsupervised pre-training and supervised fine-tuning
          5. Deep feed-forward NN
            1. Input and outputs
            2. How does it work?
          6. Deep Autoencoders
          7. Deep Belief Networks
            1. Inputs and outputs
            2. How does it work?
          8. Deep learning with dropouts
            1. Definition and mathematical notation
            2. Inputs and outputs
              1. How does it work?
            3. Learning Training and testing with dropouts
          9. Sparse coding
          10. Convolutional Neural Network
            1. Local connectivity
            2. Parameter sharing
            3. Discrete convolution
            4. Pooling or subsampling
            5. Normalization using ReLU
          11. CNN Layers
          12. Recurrent Neural Networks
            1. Structure of Recurrent Neural Networks
            2. Learning and associated problems in RNNs
            3. Long Short Term Memory
            4. Gated Recurrent Units
      4. Case study
        1. Tools and software
        2. Business problem
        3. Machine learning mapping
        4. Data sampling and transformation
        5. Feature analysis
        6. Models, results, and evaluation
          1. Basic data handling
          2. Multi-layer perceptron
            1. Parameters used for MLP
            2. Code for MLP
          3. Convolutional Network
            1. Parameters used for ConvNet
            2. Code for CNN
          4. Variational Autoencoder
            1. Parameters used for the Variational Autoencoder
            2. Code for Variational Autoencoder
          5. DBN
          6. Parameter search using Arbiter
          7. Results and analysis
      5. Summary
      6. References
    17. 8. Text Mining and Natural Language Processing
      1. NLP, subfields, and tasks
        1. Text categorization
        2. Part-of-speech tagging (POS tagging)
        3. Text clustering
        4. Information extraction and named entity recognition
        5. Sentiment analysis and opinion mining
        6. Coreference resolution
        7. Word sense disambiguation
        8. Machine translation
        9. Semantic reasoning and inferencing
        10. Text summarization
        11. Automating question and answers
      2. Issues with mining unstructured data
      3. Text processing components and transformations
        1. Document collection and standardization
          1. Inputs and outputs
          2. How does it work?
        2. Tokenization
          1. Inputs and outputs
          2. How does it work?
        3. Stop words removal
          1. Inputs and outputs
          2. How does it work?
        4. Stemming or lemmatization
          1. Inputs and outputs
          2. How does it work?
        5. Local/global dictionary or vocabulary?
        6. Feature extraction/generation
          1. Lexical features
            1. Character-based features
            2. Word-based features
            3. Part-of-speech tagging features
            4. Taxonomy features
          2. Syntactic features
          3. Semantic features
        7. Feature representation and similarity
          1. Vector space model
            1. Binary
            2. Term frequency (TF)
            3. Inverse document frequency (IDF)
            4. Term frequency-inverse document frequency (TF-IDF)
          2. Similarity measures
            1. Euclidean distance
            2. Cosine distance
            3. Pairwise-adaptive similarity
            4. Extended Jaccard coefficient
            5. Dice coefficient
        8. Feature selection and dimensionality reduction
          1. Feature selection
            1. Information theoretic techniques
            2. Statistical-based techniques
            3. Frequency-based techniques
          2. Dimensionality reduction
      4. Topics in text mining
        1. Text categorization/classification
        2. Topic modeling
          1. Probabilistic latent semantic analysis (PLSA)
            1. Input and output
            2. How does it work?
            3. Advantages and limitations
        3. Text clustering
          1. Feature transformation, selection, and reduction
          2. Clustering techniques
            1. Generative probabilistic models
              1. Input and output
              2. How does it work?
              3. Advantages and limitations
            2. Distance-based text clustering
            3. Non-negative matrix factorization (NMF)
              1. Input and output
              2. How does it work?
              3. Advantages and limitations
          3. Evaluation of text clustering
        4. Named entity recognition
          1. Hidden Markov models for NER
            1. Input and output
            2. How does it work?
            3. Advantages and limitations
          2. Maximum entropy Markov models for NER
            1. Input and output
            2. How does it work?
            3. Advantages and limitations
        5. Deep learning and NLP
      5. Tools and usage
        1. Mallet
        2. KNIME
        3. Topic modeling with Mallet
        4. Business problem
        5. Machine Learning mapping
        6. Data collection
        7. Data sampling and transformation
        8. Feature analysis and dimensionality reduction
        9. Models, results, and evaluation
        10. Analysis of text processing results
      6. Summary
      7. References
    18. 9. Big Data Machine Learning – The Final Frontier
      1. What are the characteristics of Big Data?
      2. Big Data Machine Learning
        1. General Big Data framework
          1. Big Data cluster deployment frameworks
            1. Hortonworks Data Platform
            2. Cloudera CDH
            3. Amazon Elastic MapReduce
            4. Microsoft Azure HDInsight
          2. Data acquisition
            1. Publish-subscribe frameworks
            2. Source-sink frameworks
            3. SQL frameworks
            4. Message queueing frameworks
            5. Custom frameworks
          3. Data storage
            1. HDFS
            2. NoSQL
              1. Key-value databases
              2. Document databases
              3. Columnar databases
              4. Graph databases
          4. Data processing and preparation
            1. Hive and HQL
            2. Spark SQL
            3. Amazon Redshift
            4. Real-time stream processing
          5. Machine Learning
          6. Visualization and analysis
      3. Batch Big Data Machine Learning
        1. H2O as Big Data Machine Learning platform
          1. H2O architecture
          2. Machine learning in H2O
          3. Tools and usage
      4. Case study
        1. Business problem
        2. Machine Learning mapping
        3. Data collection
        4. Data sampling and transformation
          1. Experiments, results, and analysis
            1. Feature relevance and analysis
            2. Evaluation on test data
            3. Analysis of results
        5. Spark MLlib as Big Data Machine Learning platform
          1. Spark architecture
          2. Machine Learning in MLlib
          3. Tools and usage
          4. Experiments, results, and analysis
            1. k-Means
            2. k-Means with PCA
            3. Bisecting k-Means (with PCA)
            4. Gaussian Mixture Model
            5. Random Forest
              1. Analysis of results
          5. Real-time Big Data Machine Learning
            1. SAMOA as a real-time Big Data Machine Learning framework
              1. SAMOA architecture
            2. Machine Learning algorithms
            3. Tools and usage
            4. Experiments, results, and analysis
              1. Analysis of results
          6. The future of Machine Learning
          7. Summary
          8. References
    19. A. Linear Algebra
      1. Vector
        1. Scalar product of vectors
      2. Matrix
        1. Transpose of a matrix
          1. Matrix addition
          2. Scalar multiplication
          3. Matrix multiplication
            1. Properties of matrix product
              1. Linear transformation
              2. Matrix inverse
              3. Eigendecomposition
              4. Positive definite matrix
          4. Singular value decomposition (SVD)
    20. B. Probability
      1. Axioms of probability
      2. Bayes' theorem
        1. Density estimation
        2. Mean
        3. Variance
        4. Standard deviation
        5. Gaussian standard deviation
        6. Covariance
        7. Correlation coefficient
        8. Binomial distribution
        9. Poisson distribution
        10. Gaussian distribution
        11. Central limit theorem
        12. Error propagation
    21. Index