Entry Date:
January 22, 2019

Small-Footprint Automatic Speech Recognition Circuit

Principal Investigator Anantha Chandrakasan


Advances in speech and natural language processing have made spoken language a feasible means of human-machine interaction. Because articulated speech signals are highly complex, automatic speech recognition (ASR) generally requires intensive computation and a large memory footprint to achieve good performance. However, given its widespread applications in robots, wearables, and mobile devices, it is desirable to design circuits that implement ASR locally in resource-limited environments, particularly those in which power consumption is a critical concern.

In this work, we first scrutinize the software speech recognition procedure, evaluate the memory and computational resources needed when transferring it to hardware, and take advantage of circuit design to minimize size and power usage. We design a small-footprint ASR system with a state-of-the-art neural network that performs acoustic modeling under memory restrictions, combined with weight truncation and quantization. Dedicated arithmetic unit design, parallelization, and resource dispatching further reduce latency. We implement a weighted finite-state transducer (WFST) that combines the phonetic probabilities with a language model to select the best word transcription. Model compression, caching, and lattice truncation are adopted to adapt the ASR to the circuit and optimize the design.
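As an illustration of how weight truncation and quantization can shrink a neural network's memory footprint, the following is a minimal sketch, not the scheme used in this project. It assumes that "truncation" means zeroing weights below a magnitude threshold and that quantization is symmetric and uniform with a single scale per weight group; the function names, the 8-bit default, and the threshold are all hypothetical.

```python
def quantize_weights(weights, num_bits=8, prune_threshold=0.0):
    """Zero out small weights, then map the rest to signed integer codes.

    Assumed scheme (not taken from the project description): symmetric
    uniform quantization with one shared scale for the whole weight list.
    Returns (codes, scale) such that weight is approximately code * scale.
    """
    if num_bits < 2:
        raise ValueError("num_bits must be at least 2 for signed codes")

    # Truncation step: weights with small magnitude are set to zero.
    pruned = [0.0 if abs(w) < prune_threshold else w for w in weights]

    max_abs = max((abs(w) for w in pruned), default=0.0)
    q_max = 2 ** (num_bits - 1) - 1  # e.g. 127 for 8 bits
    if max_abs == 0.0:
        return [0] * len(pruned), 1.0

    # Quantization step: round to the nearest integer code and clamp.
    scale = max_abs / q_max
    codes = [max(-q_max, min(q_max, round(w / scale))) for w in pruned]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate floating-point weights from integer codes."""
    return [c * scale for c in codes]


if __name__ == "__main__":
    example = [0.8, -1.0, 0.01, 0.3]  # made-up weights for illustration
    codes, scale = quantize_weights(example, num_bits=8, prune_threshold=0.05)
    print(codes, scale)
    print(dequantize(codes, scale))
```

Under these assumptions, each weight is stored as a `num_bits`-wide integer instead of a 32-bit float, and the rounding error for every retained weight is at most half of `scale`. A hardware design would typically also choose the bit width and threshold per layer based on measured accuracy loss.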

By leveraging the power and robustness of neural networks in a hybrid ASR model, the design outperforms conventional models in recognition accuracy, while running ASR tasks on-chip greatly reduces power consumption compared to a CPU. We show a 2.4X reduction in neural network weight size compared to a previous hardware design. This work demonstrates the feasibility of operating ASR in a small-footprint environment for applications with a small vocabulary and an optimized model.