Efficient Speech Recognition Engine (ESE)

Principal Investigator Song Han

ESE takes the approach of EIE one step further to address not only feedforward neural networks but also recurrent neural networks (RNN and LSTM). The recurrent nature produces complicated data dependency, which is more challenging than feedforward neural nets. To deal with this problem, we designed a data flow that can effectively schedule the complex LSTM operations using multiple EIE cores. ESE also present an effective model compression algorithm for LSTM with hardware efficiency considerations, compressed the LSTM by 20x without hurting accuracy. Implemented on Xilinx XCKU060 FPGA running at 200MHz, ESE has a processing power of 282 GOPS/s working directly on a compressed sparse LSTM network, corresponding to 2.52 TOPS/s on an uncompressed dense network.