10 FPGA-Based Automatic Speech Emotion Recognition Using Deep Learning Algorithm

Rupali Kawade*, Triveni Dhamale and Dipali Dhake

PCET’s Pimpri Chinchwad College of Engineering & Research, Ravet, Pune, India

Abstract

Research in speech emotion recognition (SER) is growing because of its applicability to human-computer interfaces (HCI). The literature in this area proposes different systems for recognizing a person's emotional state from speech, and these studies focus on the use of appropriate databases and the selection of suitable features and classification techniques to improve recognition accuracy. Researchers have recently demonstrated deep learning techniques as an alternative to traditional SER methods, reducing the need to identify handcrafted features. However, the high-dimensional features of the proposed deep learning algorithm limit its implementation on standalone processing boards. This article presents the implementation of deep learning–based SER on the multicore programmable PYNQ-ZQ board, which accommodates the multidimensional deep features of speech signals. The proposed SER system is successfully implemented on the PYNQ-ZQ FPGA board and achieves an accuracy of 85.33%. The FPGA implementation also reduces the SER latency compared with a conventional central processing unit.

Keywords: Speech emotion recognition, human computer interface, deep learning

10.1 Introduction

To interact among humans ...
