40 Algorithms Every Programmer Should Know

Book description

Learn algorithms for solving classic computer science problems with this concise guide covering everything from fundamental algorithms, such as sorting and searching, to modern algorithms used in machine learning and cryptography

Key Features

  • Learn the techniques you need to know to design algorithms for solving complex problems
  • Become familiar with neural networks and deep learning techniques
  • Explore different types of algorithms and choose the right data structures for their optimal implementation

Book Description

Algorithms have always played an important role in both the science and practice of computing. Beyond traditional computing, the ability to use algorithms to solve real-world problems is an important skill that any developer or programmer must have. This book will help you not only to develop the skills to select and use an algorithm to solve real-world problems but also to understand how it works.

You’ll start with an introduction to algorithms and discover various algorithm design techniques, before exploring how to implement different types of algorithms, such as searching and sorting, with the help of practical examples. As you advance to a more complex set of algorithms, you'll learn about linear programming, page ranking, and graphs, and even work with machine learning algorithms, understanding the math and logic behind them. Further on, case studies such as weather prediction, tweet clustering, and movie recommendation engines will show you how to apply these algorithms optimally. Finally, you’ll become well versed in techniques that enable parallel processing, giving you the ability to use these algorithms for compute-intensive tasks.

By the end of this book, you'll have become adept at solving real-world computational problems by using a wide range of algorithms.

What you will learn

  • Explore existing data structures and algorithms found in Python libraries
  • Implement graph algorithms for fraud detection using network analysis
  • Work with machine learning algorithms to cluster similar tweets and process Twitter data in real time
  • Predict the weather using supervised learning algorithms
  • Use neural networks for object detection
  • Create a recommendation engine that suggests relevant movies to subscribers
  • Implement foolproof security using symmetric and asymmetric encryption on Google Cloud Platform (GCP)

Who this book is for

This book is for programmers or developers who want to understand the use of algorithms for problem-solving and writing efficient code. Whether you are a beginner looking to learn the most commonly used algorithms in a clear and concise way or an experienced programmer looking to explore cutting-edge algorithms in data science, machine learning, and cryptography, you'll find this book useful. Although Python programming experience is a must, knowledge of data science will be helpful but not necessary.

Table of contents

  1. Title Page
  2. Copyright and Credits
    1. 40 Algorithms Every Programmer Should Know
  3. Dedication
  4. About Packt
    1. Why subscribe?
  5. Contributors
    1. About the author
    2. About the reviewer
    3. Packt is searching for authors like you
  6. Preface
    1. Who this book is for
    2. What this book covers
    3. To get the most out of this book
      1. Download the example code files
      2. Download the color images
      3. Conventions used
    4. Get in touch
      1. Reviews
  7. Section 1: Fundamentals and Core Algorithms
  8. Overview of Algorithms
    1. What is an algorithm?
      1. The phases of an algorithm
    2. Specifying the logic of an algorithm
      1. Understanding pseudocode
        1. A practical example of pseudocode
      2. Using snippets
      3. Creating an execution plan
    3. Introducing Python packages
      1. Python packages
        1. The SciPy ecosystem
      2. Implementing Python via the Jupyter Notebook
    4. Algorithm design techniques
      1. The data dimension
      2. Compute dimension
        1. A practical example
    5. Performance analysis
      1. Space complexity analysis
      2. Time complexity analysis
      3. Estimating the performance
        1. The best case
        2. The worst case
        3. The average case
      4. Selecting an algorithm
      5. Big O notation
        1. Constant time (O(1)) complexity
        2. Linear time (O(n)) complexity
        3. Quadratic time (O(n2)) complexity
        4. Logarithmic time (O(logn)) complexity
    6. Validating an algorithm
      1. Exact, approximate, and randomized algorithms
      2. Explainability
    7. Summary
  9. Data Structures Used in Algorithms
    1. Exploring data structures in Python
      1. List
        1. Using lists
        2. Lambda functions
        3. The range function
        4. The time complexity of lists
      2. Tuples
        1. The time complexity of tuples
      3. Dictionary
        1. The time complexity of a dictionary
      4. Sets
        1. Time complexity analysis for sets
      5. DataFrames
        1. Terminologies of DataFrames
        2. Creating a subset of a DataFrame
          1. Column selection
          2. Row selection
      6. Matrix
        1. Matrix operations
    2. Exploring abstract data types
      1. Vector
      2. Stacks
        1. The time complexity of stacks
        2. Practical example
      3. Queues
      4. The basic idea behind the use of stacks and queues
      5. Tree
        1. Terminology
        2. Types of trees
        3. Practical examples
    3. Summary
  10. Sorting and Searching Algorithms
    1. Introducing Sorting Algorithms
      1. Swapping Variables in Python
      2. Bubble Sort
        1. Understanding the Logic Behind Bubble Sort
        2. A Performance Analysis of Bubble Sort
      3. Insertion Sort
      4. Merge Sort
      5. Shell Sort
        1. A Performance Analysis of Shell Sort
      6. Selection Sort
        1. The performance of the selection sort algorithm
        2. Choosing a sorting algorithm
    2. Introduction to Searching Algorithms
      1. Linear Search
        1. The Performance of Linear Search
      2. Binary Search
        1. The Performance of Binary Search
      3. Interpolation Search
        1. The Performance of Interpolation Search
    3. Practical Applications
    4. Summary
  11. Designing Algorithms
    1. Introducing the basic concepts of designing an algorithm
      1. Concern 1 – Will the designed algorithm produce the result we expect?
      2. Concern 2 – Is this the optimal way to get these results?
        1. Characterizing the complexity of the problem
      3. Concern 3 – How is the algorithm going to perform on larger datasets?
    2. Understanding algorithmic strategies
      1. Understanding the divide-and-conquer strategy
        1. Practical example – divide-and-conquer applied to Apache Spark
      2. Understanding the dynamic programming strategy
      3. Understanding greedy algorithms
    3. Practical application – solving the TSP
      1. Using a brute-force strategy
      2. Using a greedy algorithm
    4. Presenting the PageRank algorithm
      1. Problem definition
      2. Implementing the PageRank algorithm
    5. Understanding linear programming
      1. Formulating a linear programming problem
        1. Defining the objective function
        2. Specifying constraints
    6. Practical application – capacity planning with linear programming
    7. Summary
  12. Graph Algorithms
    1. Representations of graphs
      1. Types of graphs
        1. Undirected graphs
        2. Directed graphs
        3. Undirected multigraphs
        4. Directed multigraphs
      2. Special types of edges
      3. Ego-centered networks
      4. Social network analysis
    2. Introducing network analysis theory
      1. Understanding the shortest path
      2. Creating a neighborhood
        1. Triangles
        2. Density
      3. Understanding centrality measures
        1. Degree
        2. Betweenness
        3. Fairness and closeness
        4. Eigenvector centrality
      4. Calculating centrality metrics using Python
    3. Understanding graph traversals
      1. Breadth-first search
        1. Initialization
        2. The main loop
      2. Depth-first search
    4. Case study – fraud analytics
      1. Conducting simple fraud analytics
      2. Presenting the watchtower fraud analytics methodology
        1. Scoring negative outcomes
        2. Degree of suspicion
    5. Summary
  13. Section 2: Machine Learning Algorithms
  14. Unsupervised Machine Learning Algorithms
    1. Introducing unsupervised learning
      1. Unsupervised learning in the data-mining life cycle
      2. Current research trends in unsupervised learning
      3. Practical examples
        1. Voice categorization
        2. Document categorization
    2. Understanding clustering algorithms
      1. Quantifying similarities
        1. Euclidean distance
        2. Manhattan distance
        3. Cosine distance
        4. K-means clustering algorithm
        5. The logic of k-means clustering
          1. Initialization
          2. The steps of the k-means algorithm
          3. Stop condition
        6. Coding the k-means algorithm
        7. Limitation of k-means clustering
      2. Hierarchical clustering
        1. Steps of hierarchical clustering
        2. Coding a hierarchical clustering algorithm
      3. Evaluating the clusters
      4. Application of clustering
    3. Dimensionality reduction
      1. Principal component analysis
      2. Limitations of PCA
    4. Association rules mining
      1. Examples of use
      2. Market basket analysis
      3. Association rules
        1. Types of rule
          1. Trivial rules
          2. Inexplicable rules
          3. Actionable rules
      4. Ranking rules
        1. Support
        2. Confidence
        3. Lift
      5. Algorithms for association analysis
        1. Apriori Algorithm
          1. Limitations of the apriori algorithm
        2. FP-growth algorithm
          1. Populating the FP-tree
          2. Mining Frequent Patterns
          3. Code for using FP-growth
    5. Practical application– clustering similar tweets together
      1. Topic modeling
      2. Clustering
    6. Anomaly-detection algorithms
      1. Using clustering
      2. Using density-based anomaly detection
      3. Using support vector machines
    7. Summary
  15. Traditional Supervised Learning Algorithms
    1. Understanding supervised machine learning
      1. Formulating supervised machine learning
      2. Understanding enabling conditions
      3. Differentiating between classifiers and regressors
    2. Understanding classification algorithms
      1. Presenting the classifiers challenge
        1. The problem statement
        2. Feature engineering using a data processing pipeline
          1. Importing data
          2. Feature selection
          3. One-hot encoding
          4. Specifying the features and label
          5. Dividing the dataset into testing and training portions
          6. Scaling the features
      2. Evaluating the classifiers
        1. Confusion matrix
        2. Performance metrics
        3. Understanding overfitting
          1. Bias
          2. Variance
          3. Bias-variance trade-off
      3. Specifying the phases of classifiers
      4. Decision tree classification algorithm
        1. Understanding the decision tree classification algorithm
        2. Using the decision tree classification algorithm for the classifiers challenge
        3. The strengths and weaknesses of decision tree classifiers
          1. Strengths
          2. Weaknesses
        4. Use cases
          1. Classifying records
          2. Feature selection
      5. Understanding the ensemble methods
        1. Implementing gradient boosting with the XGBoost algorithm
        2. Using the random forest algorithm
          1. Training a random forest algorithm
          2. Using random forest for predictions
          3. Differentiating the random forest algorithm from ensemble boosting
        3. Using the random forest algorithm for the classifiers challenge
      6. Logistic regression
        1. Assumptions
        2. Establishing the relationship
          1. The loss and cost functions
        3. When to use logistic regression
        4. Using the logistic regression algorithm for the classifiers challenge
      7. The SVM algorithm
        1. Using the SVM algorithm for the classifiers challenge
      8. Understanding the naive Bayes algorithm
        1. Bayes, theorem
        2. Calculating probabilities
        3. Multiplication rules for AND events
        4. The general multiplication rule
        5. Addition rules for OR events
        6. Using the naive Bayes algorithm for the classifiers challenge
      9. For classification algorithms, the winner is...
    3. Understanding regression algorithms
      1. Presenting the regressors challenge
        1. The problem statement of the regressors challenge
          1. Exploring the historical dataset
        2. Feature engineering using a data processing pipeline
      2. Linear regression
        1. Simple linear regression
        2. Evaluating the regressors
        3. Multiple regression
        4. Using the linear regression algorithm for the regressors challenge
        5. When is linear regression used?
        6. The weaknesses of linear regression
      3. The regression tree algorithm
        1. Using the regression tree algorithm for the regressors challenge
      4. The gradient boost regression algorithm
        1. Using gradient boost regression algorithm for the regressors challenge
      5. For regression algorithms, the winner is...
    4. Practical example – how to predict the weather
    5. Summary
  16. Neural Network Algorithms
    1. Understanding ANNs
    2. The Evolution of ANNs
    3. Training a Neural Network
      1. Understanding the Anatomy of a Neural Network
      2. Defining Gradient Descent
      3. Activation Functions
        1. Threshold Function
        2. Sigmoid
        3. Rectified linear unit (ReLU)
        4. Leaky ReLU
        5. Hyperbolic tangent (tanh)
        6. Softmax
    4. Tools and Frameworks
      1. Keras
        1. Backend Engines of Keras
        2. Low-level layers of the deep learning stack
        3. Defining hyperparameters
        4. Defining a Keras model
        5. Choosing sequential or functional model
      2. Understanding TensorFlow
        1. Presenting TensorFlow's Basic Concepts
          1. Understanding Tensor Mathematics
      3. Understanding the Types of Neural Networks
        1. Convolutional Neural Networks
          1. Convolution
          2. Pooling
        2. Recurrent Neural Networks
        3. Generative Adversarial Networks
    5. Transfer Learning
    6. Case study – using deep learning for fraud detection
      1. Methodology
    7. Summary
  17. Algorithms for Natural Language Processing
    1. Introducing NLP
      1. Understanding NLP terminology
        1. Normalization
        2. Corpus
        3. Tokenization
        4. Named entity recognition
        5. Stopwords
        6. Sentiment analysis
        7. Stemming and lemmatization
      2. NLTK
    2. BoW-based NLP
    3. Introduction to word embedding
      1. The neighborhood of a word
      2. Properties of word embeddings
    4. Using RNNs for NLP
    5. Using NLP for sentiment analysis
    6. Case study: movie review sentiment analysis
    7. Summary
  18. Recommendation Engines
    1. Introducing recommendation systems
    2. Types of recommendation engines
      1. Content-based recommendation engines
        1. Finding similarities between unstructured documents
        2. Using a co-occurrence matrix
      2. Collaborative filtering recommendation engines
      3. Hybrid recommendation engines
        1. Generating a similarity matrix of the items
        2. Generating reference vectors of the users
        3. Generating recommendations
    3. Understanding the limitations of recommender systems
      1. The cold start problem
      2. Metadata requirements
      3. The data sparsity problem
      4. Bias due to social influence
      5. Limited data
    4. Areas of practical applications
    5. Practical example – creating a recommendation engine
    6. Summary
  19. Section 3: Advanced Topics
  20. Data Algorithms
    1. Introduction to data algorithms
      1. Data classification
    2. Presenting data storage algorithms
      1. Understanding data storage strategies
        1. Presenting the CAP theorem
          1. CA systems
          2. AP systems
          3. CP systems
    3. Presenting streaming data algorithms
      1. Applications of streaming
    4. Presenting data compression algorithms
      1. Lossless compression algorithms
        1. Understanding the basic techniques of lossless compression
          1. Huffman coding
    5. A practical example – Twitter real-time sentiment analysis
    6. Summary
  21. Cryptography
    1. Introduction to Cryptography
      1. Understanding the Importance of the Weakest Link
      2. The Basic Terminology
      3. Understanding the Security Requirements
        1. Identifying the Entities
        2. Establishing the Security Goals
        3. Understanding the Sensitivity of the Data
      4. Understanding the Basic Design of Ciphers
        1. Presenting Substitution Ciphers
        2. Understanding Transposition Ciphers
    2. Understanding the Types of Cryptographic Techniques
      1. Using the Cryptographic Hash Function
        1. Implementing cryptographic hash functions
          1. Understanding MD5-tolerated
          2. Understanding SHA
        2. An Application of the Cryptographic Hash Function
      2. Using Symmetric Encryption
        1. Coding Symmetric Encryption
        2. The Advantages of Symmetric Encryption
        3. The Problems with Symmetric Encryption
      3. Asymmetric Encryption
        1. The SSL/TLS Handshaking Algorithm
        2. Public Key Infrastructure
    3. Example – Security Concerns When Deploying a Machine Learning Model
      1. MITM attacks
        1. How to prevent MITM attacks
      2. Avoiding Masquerading
      3. Data and Model Encrpytion
    4. Summary
  22. Large-Scale Algorithms
    1. Introduction to large-scale algorithms
      1. Defining a well-designed, large-scale algorithm
      2. Terminology
        1. Latency
        2. Throughput
        3. Network bisection bandwidth
        4. Elasticity
    2. The design of parallel algorithms
      1. Amdahl's law
        1. Conducting sequential process analysis
          1. Conducting parallel execution analysis
      2. Understanding task granularity
      3. Load balancing
      4. Locality issues
      5. Enabling concurrent processing in Python
    3. Strategizing multi-resource processing
      1. Introducing CUDA
        1. Designing parallel algorithms on CUDA
        2. Using GPUs for data processing in Python
      2. Cluster computing
        1. Implementing data processing in Apache Spark
      3. The hybrid strategy
    4. Summary
  23. Practical Considerations
    1. Introducing practical considerations
      1. The sad story of an AI Twitter Bot
    2. The explainability of an algorithm
      1. Machine learning algorithms and explainability
        1. Presenting strategies for explainability
          1. Implementing explainability
    3. Understanding ethics and algorithms
      1. Problems with learning algorithms
      2. Understanding ethical considerations
        1. Inconclusive evidence
        2. Traceability
        3. Misguided evidence
        4. Unfair outcomes
    4. Reducing bias in models
    5. Tackling NP-hard problems
      1. Simplifying the problem
        1. Example
      2. Customizing a well-known solution to a similar problem
        1. Example
      3. Using a probabilistic method
        1. Example
    6. When to use algorithms
      1. A practical example – black swan events
        1. Four criteria to classify an event as a black swan event
        2. Applying algorithms to black swan events
    7. Summary
  24. Other Books You May Enjoy
    1. Leave a review - let other readers know what you think

Product information

  • Title: 40 Algorithms Every Programmer Should Know
  • Author(s): Imran Ahmad
  • Release date: June 2020
  • Publisher(s): Packt Publishing
  • ISBN: 9781789801217