A new Georgia Tech and Stanford study shows that automatic speech recognition (ASR) models, used in voice assistants like Amazon Alexa, may be less accurate when transcribing English speakers who use minority dialects.
Specifically, the study found that the models' transcription of Standard American English (SAE) “significantly outperformed” their transcription of three minority dialects: Spanglish, Chicano English and African American Vernacular English.
On Thursday’s edition of “Closer Look,” Camille Harris, a Ph.D. candidate in computer science at the Georgia Institute of Technology and lead author of the study, discussed some of its key findings. Harris also talked about the impact of code-switching on ASR technology and the need for more diverse representation in tech.