Intelligent Speech Signal Processing

Book description

Intelligent Speech Signal Processing investigates the use of speech analytics across a range of systems and real-world activities, including sharing data analytics, creating collaboration networks among participants, and implementing video-conferencing in different application areas. Chapters focus on the latest applications of speech data analysis and management tools across different recording systems. The book emphasizes the multidisciplinary nature of the field, presenting applications and challenges alongside extensive studies on the design, development and management of intelligent systems, neural networks and related machine learning techniques for speech signal processing.

  • Highlights different data analytics techniques in speech signal processing, including machine learning and data mining
  • Illustrates different applications and challenges across the design, implementation and management of intelligent systems and neural network techniques for speech signal processing
  • Includes coverage of bimodal speech recognition, voice activity detection, spoken language and speech disorder identification, automatic speech-to-speech summarization, and convolutional neural networks

Table of contents

  1. Cover image
  2. Title page
  3. Table of Contents
  4. Copyright
  5. Contributors
  6. About the Editor
  7. Preface
  8. Chapter 1: Speech Processing in Healthcare: Can We Integrate?
    1. Abstract
  9. Chapter 2: End-to-End Acoustic Modeling Using Convolutional Neural Networks
    1. Abstract
    2. 2.1 Introduction
    3. 2.2 Related Work
    4. 2.3 Various Architecture of ASR
    5. 2.4 Convolutional Neural Networks
    6. 2.5 CNN-Based End-to-End Approach
    7. 2.6 Experiments and Their Results
    8. 2.7 Conclusion
  10. Chapter 3: A Real-Time DSP-Based System for Voice Activity Detection and Background Noise Reduction
    1. Abstract
    2. 3.1 Introduction
    3. 3.2 Microchip dsPIC33 Digital Signal Controller
    4. 3.3 High Pass Filter
    5. 3.4 Fast Fourier Transform
    6. 3.5 Channel Energy Computation
    7. 3.6 Channel SNR Computation
    8. 3.7 VAD Decision
    9. 3.8 VAD Hangover
    10. 3.9 Computation of Scaling Factor
    11. 3.10 Scaling of Frequency Channels
    12. 3.11 Inverse Fourier Transform
    13. 3.12 Application Programming Interface
    14. 3.13 Resource Requirements
    15. 3.14 Microchip PIC Programmer
    16. 3.15 Audio Components
    17. 3.16 VAD and Background Noise Reduction Techniques
    18. 3.17 Results and Discussion
    19. 3.18 Conclusion and Discussion
  11. Chapter 4: Disambiguating Conflicting Classification Results in AVSR
    1. Abstract
    2. 4.1 Introduction
    3. 4.2 Detection of Conflicting Classes
    4. 4.3 Complementary Models for Classification
    5. 4.4 Proposed Cascade of Classifiers
    6. 4.5 Audio-Visual Databases
    7. 4.6 Experimental Results
    8. 4.7 Conclusions
  12. Chapter 5: A Deep Dive Into Deep Learning Techniques for Solving Spoken Language Identification Problems
    1. Abstract
    2. 5.1 Introduction
    3. 5.2 Spoken Language Identification
    4. 5.3 Cues for Spoken Language Identification
    5. 5.4 Stages in Spoken Language Identification
    6. 5.5 Deep Learning
    7. 5.6 Artificial and Deep Neural Network
    8. 5.7 Comparison of Spoken LID System Implementations with Deep Learning Techniques
    9. 5.8 Discussion
    10. 5.9 Conclusion
  13. Chapter 6: Voice Activity Detection-Based Home Automation System for People With Special Needs
    1. Abstract
    2. 6.1 Introduction
    3. 6.2 Conceptual Design of the System
    4. 6.3 System Implementation
    5. 6.4 Significance/Contribution
    6. 6.5 Conclusion
  14. Chapter 7: Speech Summarization for Tamil Language
    1. Abstract
    2. 7.1 Introduction
    3. 7.2 Extractive Summarization
    4. 7.3 Abstractive Summarization
    5. 7.4 Need for Speech Summarization
    6. 7.5 Issues in the Summarization of a Spoken Document
    7. 7.6 Tamil Language
    8. 7.7 System Design for Summarization of Speech Data in Tamil Language
    9. 7.8 Evaluation Metrics
    10. 7.9 Speech Corpora for Tamil Language
    11. 7.10 Conclusion
  15. Chapter 8: Classifying Recurrent Dynamics on Emotional Speech Signals
    1. Abstract
    2. 8.1 Introduction
    3. 8.2 Data Collection and Processing
    4. 8.3 Research Methodology
    5. 8.4 Numerical Experiments and Results
    6. 8.5 Conclusion
  16. Chapter 9: Intelligent Speech Processing in the Time-Frequency Domain
    1. Abstract
    2. 9.1 Wavelet Packet Decomposition
    3. 9.2 Empirical Mode Decomposition
    4. 9.3 Variational Mode Decomposition
    5. 9.4 Synchrosqueezing Wavelet Transform: EMD Like a Tool
    6. 9.5 Applications of the Decomposition Technique
    7. 9.6 Conclusion
  17. Chapter 10: A Framework for Artificially Intelligent Customized Voice Response System Design using Speech Synthesis Markup Language
    1. Abstract
    2. 10.1 Introduction
    3. 10.2 Literature Survey
    4. 10.3 AWS IoT
    5. 10.4 Amazon Voice Service (AVS)
    6. 10.5 AWS Lambda
    7. 10.6 Message Queuing Telemetry Transport (MQTT)
    8. 10.7 Proposed Architecture
    9. 10.8 Conclusion
  18. Index

Product information

  • Title: Intelligent Speech Signal Processing
  • Author(s): Nilanjan Dey
  • Release date: March 2019
  • Publisher(s): Academic Press
  • ISBN: 9780128181317