Automatic Speech Recognition

Automatic Speech Recognition

This book provides a comprehensive overview of the recent advancement in the field of automatic speech recognition with a focus on deep learning models including deep neural networks and many of their variants.

Author: Dong Yu

Publisher: Springer

ISBN: 9781447157793

Category: Technology & Engineering

Page: 321

View: 720

This book provides a comprehensive overview of the recent advancement in the field of automatic speech recognition with a focus on deep learning models including deep neural networks and many of their variants. This is the first automatic speech recognition book dedicated to the deep learning approach. In addition to the rigorous mathematical treatment of the subject, the book also presents insights and theoretical foundation of a series of highly successful deep learning models.
Categories: Technology & Engineering

Robust Adaptation to Non Native Accents in Automatic Speech Recognition

Robust Adaptation to Non Native Accents in Automatic Speech Recognition

A remaining problem however is the robustness of this technology to non-native accents, which still cause considerable difficulties for current systems. In this book, methods to overcome this problem are described.

Author: Silke Goronzy

Publisher: Springer

ISBN: 9783540362906

Category: Computers

Page: 146

View: 293

Speech recognition technology is being increasingly employed in human-machine interfaces. A remaining problem however is the robustness of this technology to non-native accents, which still cause considerable difficulties for current systems. In this book, methods to overcome this problem are described. A speaker adaptation algorithm that is capable of adapting to the current speaker with just a few words of speaker-specific data based on the MLLR principle is developed and combined with confidence measures that focus on phone durations as well as on acoustic features. Furthermore, a specific pronunciation modelling technique that allows the automatic derivation of non-native pronunciations without using non-native data is described and combined with the previous techniques to produce a robust adaptation to non-native accents in an automatic speech recognition system.
Categories: Computers

Robustness in Automatic Speech Recognition

Robustness in Automatic Speech Recognition

Foreword Looking back the past 30 years. we have seen steady progress made in the area of speech science and technology.

Author: Jean-Claude Junqua

Publisher: Springer Science & Business Media

ISBN: 9781461312970

Category: Technology & Engineering

Page: 440

View: 737

Foreword Looking back the past 30 years. we have seen steady progress made in the area of speech science and technology. I still remember the excitement in the late seventies when Texas Instruments came up with a toy named "Speak-and-Spell" which was based on a VLSI chip containing the state-of-the-art linear prediction synthesizer. This caused a speech technology fever among the electronics industry. Particularly. applications of automatic speech recognition were rigorously attempt ed by many companies. some of which were start-ups founded just for this purpose. Unfortunately. it did not take long before they realized that automatic speech rec ognition technology was not mature enough to satisfy the need of customers. The fever gradually faded away. In the meantime. constant efforts have been made by many researchers and engi neers to improve the automatic speech recognition technology. Hardware capabilities have advanced impressively since that time. In the past few years. we have been witnessing and experiencing the advent of the "Information Revolution." What might be called the second surge of interest to com mercialize speech technology as a natural interface for man-machine communication began in much better shape than the first one. With computers much more powerful and faster. many applications look realistic this time. However. there are still tremendous practical issues to be overcome in order for speech to be truly the most natural interface between humans and machines.
Categories: Technology & Engineering

Robust Automatic Speech Recognition

Robust Automatic Speech Recognition

The book covers noise-robust techniques designed for acoustic models which are based on both Gaussian mixture models and deep neural networks. In addition, a guide to selecting the best methods for practical applications is provided.

Author: Jinyu Li

Publisher: Academic Press

ISBN: 9780128026168

Category: Technology & Engineering

Page: 306

View: 411

Robust Automatic Speech Recognition: A Bridge to Practical Applications establishes a solid foundation for automatic speech recognition that is robust against acoustic environmental distortion. It provides a thorough overview of classical and modern noise-and reverberation robust techniques that have been developed over the past thirty years, with an emphasis on practical methods that have been proven to be successful and which are likely to be further developed for future applications. The strengths and weaknesses of robustness-enhancing speech recognition techniques are carefully analyzed. The book covers noise-robust techniques designed for acoustic models which are based on both Gaussian mixture models and deep neural networks. In addition, a guide to selecting the best methods for practical applications is provided. The reader will: Gain a unified, deep and systematic understanding of the state-of-the-art technologies for robust speech recognition Learn the links and relationship between alternative technologies for robust speech recognition Be able to use the technology analysis and categorization detailed in the book to guide future technology development Be able to develop new noise-robust methods in the current era of deep learning for acoustic modeling in speech recognition The first book that provides a comprehensive review on noise and reverberation robust speech recognition methods in the era of deep neural networks Connects robust speech recognition techniques to machine learning paradigms with rigorous mathematical treatment Provides elegant and structural ways to categorize and analyze noise-robust speech recognition techniques Written by leading researchers who have been actively working on the subject matter in both industrial and academic organizations for many years
Categories: Technology & Engineering

Automatic Speech Recognition

Automatic Speech Recognition

Speech Recognition has a long history of being one of the difficult problems in Artificial Intelligence and Computer Science.

Author: Kai-Fu Lee

Publisher: Springer Science & Business Media

ISBN: 9781461536505

Category: Technology & Engineering

Page: 207

View: 823

Speech Recognition has a long history of being one of the difficult problems in Artificial Intelligence and Computer Science. As one goes from problem solving tasks such as puzzles and chess to perceptual tasks such as speech and vision, the problem characteristics change dramatically: knowledge poor to knowledge rich; low data rates to high data rates; slow response time (minutes to hours) to instantaneous response time. These characteristics taken together increase the computational complexity of the problem by several orders of magnitude. Further, speech provides a challenging task domain which embodies many of the requirements of intelligent behavior: operate in real time; exploit vast amounts of knowledge, tolerate errorful, unexpected unknown input; use symbols and abstractions; communicate in natural language and learn from the environment. Voice input to computers offers a number of advantages. It provides a natural, fast, hands free, eyes free, location free input medium. However, there are many as yet unsolved problems that prevent routine use of speech as an input device by non-experts. These include cost, real time response, speaker independence, robustness to variations such as noise, microphone, speech rate and loudness, and the ability to handle non-grammatical speech. Satisfactory solutions to each of these problems can be expected within the next decade. Recognition of unrestricted spontaneous continuous speech appears unsolvable at present. However, by the addition of simple constraints, such as clarification dialog to resolve ambiguity, we believe it will be possible to develop systems capable of accepting very large vocabulary continuous speechdictation.
Categories: Technology & Engineering

Techniques for Noise Robustness in Automatic Speech Recognition

Techniques for Noise Robustness in Automatic Speech Recognition

Automatic. Speech. Recognition. Bhiksha Raj1, Tuomas Virtanen2, Rita Singh1 1Carnegie Mellon University, USA 2Tampere ... As mentioned earlier in Section 1.1, ASR systems often make errors in conditions in which a human listener could ...

Author: Tuomas Virtanen

Publisher: John Wiley & Sons

ISBN: 9781119970880

Category: Technology & Engineering

Page: 496

View: 787

Automatic speech recognition (ASR) systems are finding increasing use in everyday life. Many of the commonplace environments where the systems are used are noisy, for example users calling up a voice search system from a busy cafeteria or a street. This can result in degraded speech recordings and adversely affect the performance of speech recognition systems. As the use of ASR systems increases, knowledge of the state-of-the-art in techniques to deal with such problems becomes critical to system and application engineers and researchers who work with or on ASR technologies. This book presents a comprehensive survey of the state-of-the-art in techniques used to improve the robustness of speech recognition systems to these degrading external influences. Key features: Reviews all the main noise robust ASR approaches, including signal separation, voice activity detection, robust feature extraction, model compensation and adaptation, missing data techniques and recognition of reverberant speech. Acts as a timely exposition of the topic in light of more widespread use in the future of ASR technology in challenging environments. Addresses robustness issues and signal degradation which are both key requirements for practitioners of ASR. Includes contributions from top ASR researchers from leading research units in the field
Categories: Technology & Engineering

Acoustical and Environmental Robustness in Automatic Speech Recognition

Acoustical and Environmental Robustness in Automatic Speech Recognition

Unconstrained automatic speech recognition (ASR) is a very difficult problem. Early ASR systems obtained a reasonable performance by artificially constraining the problem. Those systems were speaker dependent, and dealt with isolated ...

Author: A. Acero

Publisher: Springer Science & Business Media

ISBN: 9781461531227

Category: Technology & Engineering

Page: 186

View: 879

The need for automatic speech recognition systems to be robust with respect to changes in their acoustical environment has become more widely appreciated in recent years, as more systems are finding their way into practical applications. Although the issue of environmental robustness has received only a small fraction of the attention devoted to speaker independence, even speech recognition systems that are designed to be speaker independent frequently perform very poorly when they are tested using a different type of microphone or acoustical environment from the one with which they were trained. The use of microphones other than a "close talking" headset also tends to severely degrade speech recognition -performance. Even in relatively quiet office environments, speech is degraded by additive noise from fans, slamming doors, and other conversations, as well as by the effects of unknown linear filtering arising reverberation from surface reflections in a room, or spectral shaping by microphones or the vocal tracts of individual speakers. Speech-recognition systems designed for long-distance telephone lines, or applications deployed in more adverse acoustical environments such as motor vehicles, factory floors, oroutdoors demand far greaterdegrees ofenvironmental robustness. There are several different ways of building acoustical robustness into speech recognition systems. Arrays of microphones can be used to develop a directionally-sensitive system that resists intelference from competing talkers and other noise sources that are spatially separated from the source of the desired speech signal.
Categories: Technology & Engineering

Automatic Speech Recognition on Mobile Devices and over Communication Networks

Automatic Speech Recognition on Mobile Devices and over Communication Networks

This book brings together academic researchers and industrial practitioners to address the issues in this emerging realm and presents the reader with a comprehensive introduction to the subject of speech recognition in devices and networks.

Author: Zheng-Hua Tan

Publisher: Springer Science & Business Media

ISBN: 9781848001435

Category: Technology & Engineering

Page: 402

View: 540

The advances in computing and networking have sparked an enormous interest in deploying automatic speech recognition on mobile devices and over communication networks. This book brings together academic researchers and industrial practitioners to address the issues in this emerging realm and presents the reader with a comprehensive introduction to the subject of speech recognition in devices and networks. It covers network, distributed and embedded speech recognition systems.
Categories: Technology & Engineering

Invariant Features and Enhanced Speaker Normalization for Automatic Speech Recognition

Invariant Features and Enhanced Speaker Normalization for Automatic Speech Recognition

Within the ASR system this functionality is realized with a filter bank that yields a time-frequency representation of the input speech signal. ... Automatic speech recognition processes discrete-time (speech) signals.

Author: Florian Müller

Publisher: Logos Verlag Berlin GmbH

ISBN: 9783832533199

Category: Technology & Engineering

Page: 247

View: 448

Automatic speech recognition systems have to handle various kinds of variabilities sufficiently well in order to achieve high recognition rates in practice. One of the variabilities that has a major impact on the performance is the vocal tract length of the speakers. Normalization of the features and adaptation of the acoustic models are commonly used methods in speech recognition systems. In contrast to that, a third approach follows the idea of extracting features with transforms that are invariant to vocal tract lengths changes. This work presents several approaches for extracting invariant features for automatic speech recognition systems. The robustness of these features under various training-test conditions is evaluated and it is described how the robustness of the features to noise can be increased. Furthermore, it is shown how the spectral effects due to different vocal tract lengths can be estimated with a registration method and how this can be used for speaker normalization.
Categories: Technology & Engineering

Automatic Speech Recognition of Arabic Phonemes with Neural Networks

Automatic Speech Recognition of Arabic Phonemes with Neural Networks

Focusing on voicing and nasalization will have the following: • Voicing and nasalization • [+voice] = voiced [−voice] ... This survey is under the heading “Automatic Speech Recognition of Arabic Phonemes with Neural Networks: A ...

Author: Mohammed Dib

Publisher: Springer

ISBN: 9783319977102

Category: Technology & Engineering

Page: 144

View: 836

This book presents a contrastive linguistics study of Arabic and English for the dual purposes of improved language teaching and speech processing of Arabic via spectral analysis and neural networks. Contrastive linguistics is a field of linguistics which aims to compare the linguistic systems of two or more languages in order to ease the tasks of teaching, learning, and translation. The main focus of the present study is to treat the Arabic minimal syllable automatically to facilitate automatic speech processing in Arabic. It represents important reading for language learners and for linguists with an interest in Arabic and computational approaches.
Categories: Technology & Engineering