Machine Learning for Finance

Book description

A guide to advances in machine learning for financial professionals, with working Python code

Key Features

  • Explore advances in machine learning and how to put them to work in the financial industry
  • Clear explanations and expert discussion of how machine learning works, with an emphasis on financial applications
  • Deep coverage of advanced machine learning approaches including neural networks, GANs, and reinforcement learning

Machine Learning for Finance explores new advances in machine learning and shows how they can be applied across the financial sector, including in insurance, transactions, and lending. It explains the concepts and algorithms behind the main machine learning techniques and provides example Python code for implementing the models yourself.

The book is based on Jannes Klaas' experience of running machine learning training courses for financial professionals. Rather than providing ready-made financial algorithms, the book focuses on the advanced ML concepts and ideas that can be applied in a wide variety of ways.

The book shows how machine learning works on structured data, text, images, and time series. It includes coverage of generative adversarial learning, reinforcement learning, debugging, and launching machine learning products. It discusses how to fight bias in machine learning and ends with an exploration of Bayesian inference and probabilistic programming.

What you will learn

  • Apply machine learning to structured data, natural language text, and images
  • Understand how machine learning can detect fraud, forecast financial trends, analyze customer sentiment, and more
  • Implement heuristic baselines, time series models, generative models, and reinforcement learning in Python, scikit-learn, Keras, and TensorFlow (a minimal Keras sketch follows this list)
  • Dig deep into neural networks and examine the uses of GANs and reinforcement learning
  • Debug machine learning applications and prepare them for launch
  • Address bias and privacy concerns in machine learning
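
To give a feel for the workflow these points describe, here is a minimal sketch of a two-layer Keras classifier in the spirit of Chapter 1's "A two-layer model in Keras" section. The layer sizes, the synthetic data, and the tensorflow.keras import path are illustrative assumptions, not code taken from the book.

    # Minimal sketch of a two-layer Keras classifier (assumed setup,
    # not the book's code). Requires TensorFlow 2.x.
    import numpy as np
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Input, Dense

    # Synthetic stand-in for a structured dataset: 1,000 samples,
    # 10 features each, with a binary target.
    X = np.random.rand(1000, 10)
    y = (X.sum(axis=1) > 5).astype(int)

    # Stack two fully connected layers; the final sigmoid yields a
    # probability for binary classification.
    model = Sequential([
        Input(shape=(10,)),
        Dense(32, activation='relu'),
        Dense(1, activation='sigmoid'),
    ])

    # Compile with a loss and an optimizer, then train.
    model.compile(optimizer='adam',
                  loss='binary_crossentropy',
                  metrics=['accuracy'])
    model.fit(X, y, epochs=5, batch_size=32, validation_split=0.2)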

Who this book is for

This book is ideal for readers who know Python and want to apply machine learning to financial problems. It assumes college-level knowledge of math and statistics.

Table of contents

  1. Machine Learning for Finance
    1. Table of Contents
    2. Machine Learning for Finance
      1. Why subscribe?
      2. Packt.com
    3. Contributors
      1. About the author
      2. About the reviewer
    4. Preface
      1. Who this book is for
      2. What this book covers
      3. To get the most out of this book
        1. Download the example code files
        2. Download the color images
        3. Conventions used
      4. Get in touch
        1. Reviews
    5. 1. Neural Networks and Gradient-Based Optimization
      1. Our journey in this book
      2. What is machine learning?
      3. Supervised learning
      4. Unsupervised learning
      5. Reinforcement learning
        1. The unreasonable effectiveness of data
        2. All models are wrong
      6. Setting up your workspace
      7. Using Kaggle kernels
        1. Running notebooks locally
          1. Installing TensorFlow
          2. Installing Keras
          3. Using data locally
      8. Using the AWS deep learning AMI
      9. Approximating functions
      10. A forward pass
      11. A logistic regressor
        1. Python version of our logistic regressor
      12. Optimizing model parameters
      13. Measuring model loss
        1. Gradient descent
        2. Backpropagation
        3. Parameter updates
        4. Putting it all together
      14. A deeper network
      15. A brief introduction to Keras
        1. Importing Keras
        2. A two-layer model in Keras
          1. Stacking layers
          2. Compiling the model
          3. Training the model
        3. Keras and TensorFlow
      16. Tensors and the computational graph
      17. Exercises
      18. Summary
    6. 2. Applying Machine Learning to Structured Data
      1. The data
      2. Heuristic, feature-based, and E2E models
      3. The machine learning software stack
      4. The heuristic approach
        1. Making predictions using the heuristic model
        2. The F1 score
        3. Evaluating with a confusion matrix
      5. The feature engineering approach
        1. A feature from intuition – fraudsters don't sleep
        2. Expert insight – transfer, then cash out
        3. Statistical quirks – errors in balances
      6. Preparing the data for the Keras library
        1. One-hot encoding
        2. Entity embeddings
          1. Tokenizing categories
          2. Creating input models
          3. Training the model
      7. Creating predictive models with Keras
        1. Extracting the target
        2. Creating a test set
        3. Creating a validation set
        4. Oversampling the training data
        5. Building the model
          1. Creating a simple baseline
          2. Building more complex models
      8. A brief primer on tree-based methods
        1. A simple decision tree
        2. A random forest
        3. XGBoost
      9. E2E modeling
      10. Exercises
      11. Summary
    7. 3. Utilizing Computer Vision
      1. Convolutional Neural Networks
        1. Filters on MNIST
        2. Adding a second filter
      2. Filters on color images
      3. The building blocks of ConvNets in Keras
        1. Conv2D
          1. Kernel size
          2. Stride size
          3. Padding
          4. Input shape
          5. Simplified Conv2D notation
          6. ReLU activation
        2. MaxPooling2D
        3. Flatten
        4. Dense
        5. Training MNIST
          1. The model
          2. Loading the data
          3. Compiling and training
      4. More bells and whistles for our neural network
        1. Momentum
        2. The Adam optimizer
        3. Regularization
          1. L2 regularization
          2. L1 regularization
          3. Regularization in Keras
        4. Dropout
        5. Batchnorm
      5. Working with big image datasets
      6. Working with pretrained models
        1. Modifying VGG-16
        2. Random image augmentation
          1. Augmentation with ImageDataGenerator
      7. The modularity tradeoff
      8. Computer vision beyond classification
        1. Facial recognition
        2. Bounding box prediction
      9. Exercises
      10. Summary
    8. 4. Understanding Time Series
      1. Visualization and preparation in pandas
        1. Aggregate global feature statistics
        2. Examining the sample time series
        3. Different kinds of stationarity
        4. Why stationarity matters
        5. Making a time series stationary
        6. When to ignore stationarity issues
      2. Fast Fourier transformations
      3. Autocorrelation
      4. Establishing a training and testing regime
      5. A note on backtesting
      6. Median forecasting
      7. ARIMA
      8. Kalman filters
      9. Forecasting with neural networks
        1. Data preparation
          1. Weekdays
      10. Conv1D
      11. Dilated and causal convolution
      12. Simple RNN
      13. LSTM
        1. The carry
      14. Recurrent dropout
      15. Bayesian deep learning
      16. Exercises
      17. Summary
    9. 5. Parsing Textual Data with Natural Language Processing
      1. An introductory guide to spaCy
      2. Named entity recognition
        1. Fine-tuning the NER
      3. Part-of-speech (POS) tagging
      4. Rule-based matching
        1. Adding custom functions to matchers
        2. Adding the matcher to the pipeline
        3. Combining rule-based and learning-based systems
      5. Regular expressions
        1. Using Python's regex module
        2. Regex in pandas
        3. When to use regexes and when not to
      6. A text classification task
      7. Preparing the data
        1. Sanitizing characters
        2. Lemmatization
        3. Preparing the target
        4. Preparing the training and test sets
      8. Bag-of-words
        1. TF-IDF
      9. Topic modeling
      10. Word embeddings
        1. Preprocessing for training with word vectors
        2. Loading pretrained word vectors
        3. Time series models with word vectors
      11. Document similarity with word embeddings
      12. A quick tour of the Keras functional API
      13. Attention
      14. Seq2seq models
        1. Seq2seq architecture overview
        2. The data
        3. Encoding characters
        4. Creating inference models
        5. Making translations
      15. Exercises
      16. Summary
    10. 6. Using Generative Models
      1. Understanding autoencoders
        1. Autoencoder for MNIST
        2. Autoencoder for credit cards
      2. Visualizing latent spaces with t-SNE
      3. Variational autoencoders
        1. MNIST example
        2. Using the Lambda layer
        3. Kullback–Leibler divergence
        4. Creating a custom loss
        5. Using a VAE to generate data
        6. VAEs for an end-to-end fraud detection system
      4. VAEs for time series
      5. GANs
        1. An MNIST GAN
        2. Understanding GAN latent vectors
        3. GAN training tricks
      6. Using less data – active learning
        1. Using labeling budgets efficiently
        2. Leveraging machines for human labeling
        3. Pseudo labeling for unlabeled data
        4. Using generative models
      7. SGANs for fraud detection
      8. Exercises
      9. Summary
    11. 7. Reinforcement Learning for Financial Markets
      1. Catch – a quick guide to reinforcement learning
        1. Q-learning turns RL into supervised learning
        2. Defining the Q-learning model
        3. Training to play Catch
      2. Markov processes and the Bellman equation – a more formal introduction to RL
        1. The Bellman equation in economics
      3. Advantage actor-critic models
        1. Learning to balance
        2. Learning to trade
      4. Evolutionary strategies and genetic algorithms
      5. Practical tips for RL engineering
        1. Designing good reward functions
          1. Careful, manual reward shaping
          2. Inverse reinforcement learning
          3. Learning from human preferences
        2. Robust RL
      6. Frontiers of RL
        1. Multi-agent RL
        2. Learning how to learn
        3. Understanding the brain through RL
      7. Exercises
      8. Summary
    12. 8. Privacy, Debugging, and Launching Your Products
      1. Debugging data
        1. How to find out whether your data is up to the task
        2. What to do if you don't have enough data
        3. Unit testing data
        4. Keeping data private and complying with regulations
        5. Preparing the data for training
        6. Understanding which inputs led to which predictions
      2. Debugging your model
        1. Hyperparameter search with Hyperas
        2. Efficient learning rate search
        3. Learning rate scheduling
        4. Monitoring training with TensorBoard
        5. Exploding and vanishing gradients
      3. Deployment
        1. Launching fast
        2. Understanding and monitoring metrics
        3. Understanding where your data comes from
      4. Performance tips
        1. Using the right hardware for your problem
        2. Making use of distributed training with TF estimators
        3. Using optimized layers such as CuDNNLSTM
        4. Optimizing your pipeline
        5. Speeding up your code with Cython
        6. Caching frequent requests
      5. Exercises
      6. Summary
    13. 9. Fighting Bias
      1. Sources of unfairness in machine learning
      2. Legal perspectives
      3. Observational fairness
      4. Training to be fair
      5. Causal learning
        1. Obtaining causal models
        2. Instrumental variables
        3. Non-linear causal models
      6. Interpreting models to ensure fairness
      7. Unfairness as complex system failure
        1. Complex systems are intrinsically hazardous systems
        2. Catastrophes are caused by multiple failures
        3. Complex systems run in degraded mode
        4. Human operators both cause and prevent accidents
        5. Accident-free operation requires experience with failure
      8. A checklist for developing fair models
        1. What is the goal of the model developers?
        2. Is the data biased?
        3. Are errors biased?
        4. How is feedback incorporated?
        5. Can the model be interpreted?
        6. What happens to models after deployment?
      9. Exercises
      10. Summary
    14. 10. Bayesian Inference and Probabilistic Programming
      1. An intuitive guide to Bayesian inference
        1. Flat prior
        2. <50% prior
        3. Prior and posterior
        4. Markov Chain Monte Carlo
        5. Metropolis-Hastings MCMC
        6. From probabilistic programming to deep probabilistic programming
      2. Summary
      3. Farewell
      4. Further reading
        1. General data analysis
        2. Sound science in machine learning
        3. General machine learning
        4. General deep learning
        5. Reinforcement learning
        6. Bayesian machine learning
    15. Other Books You May Enjoy
      1. Leave a review – let other readers know what you think
    16. Index

Product information

  • Title: Machine Learning for Finance
  • Author(s): Jannes Klaas
  • Release date: May 2019
  • Publisher(s): Packt Publishing
  • ISBN: 9781789136364