Entry Date:
August 6, 2018

Pruning and Sparse NN

Principal Investigator: Song Han


Neural networks are both computationally intensive and memory intensive, making them difficult to deploy on embedded systems with limited hardware resources. Conventional networks fix the architecture before training starts; as a result, training cannot improve the architecture. To address these limitations, we describe a method to reduce the storage and computation required by neural networks by an order of magnitude by learning only the important connections. This method reduced the number of parameters of AlexNet by a factor of 9× and of VGGNet by 13×, without affecting their accuracy.
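
To illustrate the idea of learning only the important connections, the sketch below prunes the smallest-magnitude weights of a layer and masks them during subsequent updates. The NumPy weight matrix, the 90% sparsity target, and the single masked gradient step are illustrative assumptions for this notebook, not the exact training pipeline used in the experiments.

# Minimal sketch of magnitude-based connection pruning (assumptions noted above).
import numpy as np

def prune_by_magnitude(weights, sparsity):
    """Zero out the smallest-magnitude weights so that `sparsity` fraction are removed.

    Returns the pruned weights and a binary mask marking surviving connections.
    """
    threshold = np.quantile(np.abs(weights), sparsity)          # magnitude cutoff
    mask = (np.abs(weights) > threshold).astype(weights.dtype)  # 1 = keep, 0 = prune
    return weights * mask, mask

# Example: prune 90% of a fully connected layer's weights.
rng = np.random.default_rng(0)
W = rng.normal(size=(256, 512)).astype(np.float32)
W_pruned, mask = prune_by_magnitude(W, sparsity=0.9)
print("remaining connections:", int(mask.sum()), "of", mask.size)

# During retraining, gradients of pruned connections are masked so that
# removed weights stay at zero (sketch of one SGD update step):
grad = rng.normal(size=W.shape).astype(np.float32)  # placeholder gradient
lr = 0.01
W_pruned -= lr * grad * mask

The mask is what makes the network sparse: only the surviving connections are stored and updated, which is where the reduction in parameters and computation comes from.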