Improving the efficiency of DNN hardware accelerator by replacing digital feature extractor with an imprecise neuromorphic hardware

Mixed-signal in-memory computation can drastically improve the efficiency of hardware implementing machine learning (ML) algorithms by (i) removing the need to fetch neural network parameters from internal or external memory and (ii) performing a large number of multiply-accumulate operations in parallel. However, this boost in efficiency comes with some disadvantages. Among them, the inability to precisely program nonvolatile memory (NVM) devices with neural network parameters and the sensitivity to noise prevent the mixed-signal hardware from performing precise and deterministic computation. Unfortunately, these hardware-specific errors can be magnified as they propagate through the layers of a deep neural network. In this paper, we show that the inability to implement the parameters of an already trained network with sufficient precision can completely prevent the network from performing any meaningful operation. However, even at this level of degradation, the feature extractor section of the network still extracts enough information that an acceptable level of performance can be recovered by retraining only the last classification layers of the network. Our results suggest that, instead of blindly trying to implement software algorithms in hardware as precisely as possible, it may be more efficient to implement neural networks with imperfect devices and circuits and let the network itself compensate for the imprecise computation by retraining only a few layers.
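The retraining idea described above can be sketched in a few lines of PyTorch. The snippet below is only an illustration, not the authors' code: the toy network, the multiplicative Gaussian noise used to mimic imprecise NVM weight programming, and the dummy data are all assumptions made for the example. It perturbs the feature-extractor weights, freezes them, and retrains only the final classification layer.

# Minimal sketch (assumed noise model and toy network, not the paper's setup).
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy "feature extractor" followed by a linear classifier head.
feature_extractor = nn.Sequential(
    nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
)
classifier = nn.Linear(16, 10)

# 1) Emulate imprecise weight programming: each stored weight deviates from its
#    target value by a random multiplicative factor (assumed error model).
with torch.no_grad():
    for p in feature_extractor.parameters():
        p.mul_(1.0 + 0.3 * torch.randn_like(p))

# 2) Freeze the (now imprecise) feature extractor; only the head stays trainable.
for p in feature_extractor.parameters():
    p.requires_grad_(False)

# 3) Retrain the classification layer on labelled data (dummy tensors here).
optimizer = torch.optim.Adam(classifier.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()
images = torch.randn(64, 1, 28, 28)          # stand-in for a real dataset
labels = torch.randint(0, 10, (64,))

for step in range(100):
    features = feature_extractor(images)      # fixed, noisy features
    loss = loss_fn(classifier(features), labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

In a real experiment the perturbed weights would come from measured device-programming statistics rather than a fixed 30% Gaussian factor, and the retraining loop would run over an actual dataset; only the last step, updating the classifier while the noisy feature extractor stays fixed, is the point being illustrated.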
