Artificial intelligence powered by deep neural networks (DNNs) has shown great potential across a wide range of industry sectors. Due to the high computational complexity of DNNs, energy efficiency is of ever-increasing importance in the design of future DNN processing systems. However, there is currently no standard for DNN processing; the fast-moving pace of new DNN algorithm and application development also requires the hardware to remain highly flexible across different configurations. These factors open up a large design space of potential solutions with optimized efficiency, making a systematic approach crucial.
To address this problem, we co-optimize the three most important pillars in the design of DNN processing systems: architecture, algorithm, and implementation. First, we present Eyeriss, a fabricated chip that implements a novel dataflow architecture targeting energy-efficient data movement in DNN processing. Second, we develop energy-aware pruning (EAP), a new strategy for removing weights from the network to reduce computation, making the network more hardware-friendly and yielding higher energy efficiency. Finally, we present a tool that enables fast exploration of the architecture design space under different implementation and algorithmic constraints.