20 Spoken Language Translation in Low-Resource Language
S. Shoba1, Sasithradevi A.1* and S. Deepa2
1Centre for Advanced Data Science, Vellore Institute of Technology (VIT), Chennai, India
2Department of CSE, SRM Institute of Science and Technology, Ramapuram Campus, Chennai, Tamil Nadu, India
Abstract
Language barriers are a real-world obstacle to communication between individuals. People in India communicate in many different languages, most of which are low-resource. Various studies have addressed translating a source language into a target language on devices. The technologies used are automatic speech recognition and machine translation (MT) systems. Multiple machine translation devices are available on the market and show considerable performance for high-resource languages (HRL), but very few devices are available for low-resource languages (LRL). The primary drawbacks in LRL are the lack of annotated datasets, complex grammar-based systems, missing deep analysis, and low production quality. This chapter develops an automatic MT system for LRL by boosting the dataset, improving quality by evaluating the samples through deep analysis, and increasing production quality by marketing the need for translation. Neural MT uses an artificial neural network to predict the target language, with the aim of increasing accuracy. Applying supervised, semi-supervised, and transfer learning models to the spoken word will translate ...