Predictive Coding for training deep neural networks
Predictive coding is an influential theory in neuroscience built on the idea that the brain constantly generates and updates an internal model of its environment. The model predicts sensory input, and each prediction is compared with the signals actually arriving from the senses. In deep learning, this means the network learns to predict the training data rather than simply receiving it at the input layer. The other distinctive feature of the paradigm is that learning is local: weights are updated from locally available prediction errors, without backpropagating a global error signal as in traditional models. Through this process the algorithm builds a model of the world from training examples and can anticipate future inputs. Given its promising applications, predictive coding has the potential both to deepen our understanding of how the brain works and to suggest new ways of training artificial intelligence models.
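To make the idea of predicting the input and learning from local errors concrete, the following is a minimal sketch of a generic single-layer predictive coding model in Python with NumPy. It is not the DPC algorithm described in this article: the function names, the linear generative model, the step sizes and the toy data are all assumptions chosen for illustration. A latent state r is relaxed until its prediction W r matches the input x, and the weights are then updated using only the prediction error and the latent activity at that layer.

```python
import numpy as np

def infer_latent(x, W, n_steps=50, step_size=0.1):
    """Relax the latent state r so that the prediction W @ r approaches x."""
    r = np.zeros(W.shape[1])
    for _ in range(n_steps):
        error = x - W @ r                # prediction error at the input layer
        r += step_size * (W.T @ error)   # uses only this layer's error and weights
    return r

def train_predictive_coding(X, n_latent=4, n_epochs=30, lr=0.01, seed=0):
    """Train W with a local rule; returns W and the mean squared error per epoch."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=0.1, size=(X.shape[1], n_latent))
    history = []
    for _ in range(n_epochs):
        total = 0.0
        for x in X:
            r = infer_latent(x, W)
            error = x - W @ r
            W += lr * np.outer(error, r)  # local update: error times latent activity
            total += float(error @ error)
        history.append(total / len(X))
    return W, history

# Toy data (assumed for illustration): 8-dimensional inputs from 4 hidden causes.
rng = np.random.default_rng(1)
X = rng.normal(size=(64, 4)) @ rng.normal(size=(4, 8))
X = X / X.std()
W, history = train_predictive_coding(X)
print("mean squared prediction error, first vs last epoch:", history[0], history[-1])
```

No error signal is propagated through other layers in this sketch; each weight change depends only on quantities available at the layer itself, which is the property the paragraph above refers to as local learning.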
The ISRC group at Ulster University has proposed a novel Deep Predictive Coding (DPC) algorithm for training large-scale networks on high-performance computing resources. Experimental results show that the algorithm trains the network using only locally available information and performs both classification and reconstruction with fewer parameters than traditional algorithms. DPC thus offers an alternative learning method that supports massively parallel computation and reduces computational cost, as biological brains do. This also opens the possibility of deploying these models on mobile or edge devices with limited computing capability.
Figure 1 shows an architecture in which weights are shared locally between the forward and backward propagation stages. Because calculations and learning are performed locally, the approach is well suited to large-scale parallel implementations. In deep learning models trained with backpropagation, by contrast, parallelization and computation are more costly because error information has to be maintained and shared among distant units. A second advantage is shown in Table 1, which compares the method, applied to a convolutional neural network (CNN) architecture, against current backpropagation approaches. The results indicate that the DPC method achieves performance comparable to the benchmark methods with far fewer parameters, which benefits large-scale applications [6].
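The effect of sharing weights between the two propagation directions on the parameter count can be illustrated with a small Python sketch. This is an illustration under stated assumptions rather than the architecture in Figure 1: the class name SharedWeightLayer, the fully connected (non-convolutional) layer, the ReLU activation, the layer sizes and the omission of biases are all hypothetical choices made for brevity.

```python
import numpy as np

class SharedWeightLayer:
    """Hypothetical layer whose single weight matrix serves both directions."""

    def __init__(self, n_lower, n_upper, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(scale=0.1, size=(n_lower, n_upper))

    def feedforward(self, lower):
        # Lower layer to upper layer, for example towards a class prediction.
        return np.maximum(0.0, lower @ self.W)

    def feedback(self, upper):
        # Upper layer to lower layer, reusing the same weights transposed,
        # for example towards a reconstruction of the input.
        return upper @ self.W.T

    def n_parameters(self):
        return self.W.size

layer = SharedWeightLayer(n_lower=784, n_upper=128)
x = np.random.default_rng(1).random(784)   # stand-in for a flattened 28x28 image
h = layer.feedforward(x)
x_hat = layer.feedback(h)

shared = layer.n_parameters()
separate = 2 * 784 * 128                   # separate forward and backward matrices
print("shared:", shared, "separate:", separate)
```

With these assumed sizes, a layer that kept separate forward and backward weight matrices would need twice as many weights as the shared version, which is one simple way in which a network that both classifies and reconstructs can use fewer parameters.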
Figure 1: Deep predictive coding (DPC) algorithm to emulate learning characteristics and capabilities of biological brains.
Table 1: Performance comparison of DBPC with other methods on MNIST.
This research was carried out on the Northern Ireland High Performance Computing (NI-HPC) cluster, in particular its NVIDIA A100 GPUs. The algorithm was tested on the MNIST digits and Fashion-MNIST datasets, and its settings were tuned according to classification performance. Access to the NI-HPC cluster was critical to this research.
References:
[1] W. Sun and J. Orchard, “A predictive-coding network that is both discriminative and generative,” Neural Computation, vol. 32, no. 10, pp. 1836–1862, 2020.
[2] B. Millidge, A. Tschantz, A. Seth, and C. L. Buckley, “Relaxing the constraints on predictive coding models,” arXiv preprint arXiv:2010.01047, 2020.
[3] Z. Song, J. Zhang, G. Shi, and J. Liu, “Fast inference predictive coding: A novel model for constructing deep neural networks,” IEEE Transactions on Neural Networks and Learning Systems, vol. 30, no. 4, pp. 1150–1165, 2018.
[4] H. Wen, K. Han, J. Shi, Y. Zhang, E. Culurciello, and Z. Liu, “Deep predictive coding network for object recognition,” in International Conference on Machine Learning. PMLR, 2018, pp. 5266–5275.
[5] L. M. Seng, B. B. C. Chiang, Z. A. A. Salam, G. Y. Tan, and H. T. Chai, “MNIST handwritten digit recognition with different CNN architectures,” Journal of Applied Technology and Innovation (e-ISSN: 2600-7304), vol. 5, no. 1, p. 7, 2021.
[6] S. Qiu, S. Bhattacharyya, D. Coyle, and S. Dora, “Deep predictive coding with bi-directional propagation for classification and reconstruction,” arXiv preprint arXiv:2305.18472, 2023.
Contact information:
Mr Senhui Qiu
Email: Qiu-S2@ulster.ac.uk
This project is funded by the Vice-Chancellor’s Research Scholarship (VCRS) at Ulster University. We are grateful for access to the Tier 2 High Performance Computing resources provided by the Northern Ireland High Performance Computing (NI-HPC) facility, funded by the UK Engineering and Physical Sciences Research Council (EPSRC), Grant No. EP/T022175, and for the support of the UKRI Turing AI Fellowship 2021-2025, funded by the EPSRC (grant number EP/V025724/1).