ejeai Open Access Journal

European Journal of Emerging Artificial Intelligence

eISSN: Applied
Publication Frequency: 2 issues per year

  • Peer Reviewed & International Journal

Open Access

ARTICLE

A Residual Learning Framework for Advanced Visual Recognition

1 Department of Computer Vision and Pattern Analysis, Caspian Institute of Technology, Baku, Azerbaijan
2 Visual Computing and Artificial Intelligence Laboratory, Skagen Technical University, Denmark

https://doi.org/10.64917/


Abstract

Progress in deep learning for computer vision has been driven largely by increasingly deep neural networks. A fundamental obstacle known as the "degradation problem" has limited this progress: adding more layers to an already suitably deep model leads to higher training error, so the network cannot benefit from the added depth. Because the error rises on the training data itself, this phenomenon is caused not by overfitting but by optimization difficulties inherent in training very deep architectures. To address this, we introduce a deep residual learning framework. Instead of expecting stacked layers to learn an underlying mapping directly, we reformulate them to learn a residual function with respect to the layer inputs. This is achieved with "shortcut connections" that perform identity mapping, adding the input of a block of layers to its output. The reformulation simplifies optimization, because it is easier for the network to learn perturbations of an identity mapping than to learn the full mapping from scratch.
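The reformulation described above can be illustrated with a minimal sketch. It is not the authors' implementation: it replaces convolutions with small dense layers, and the names `relu`, `linear`, `plain_block`, `residual_block`, and the symbols F(x) and x are assumptions introduced here for illustration. The point it shows is that when the stacked layers output zero, a residual block still passes its input through unchanged, whereas a plain block loses it.

```python
# Minimal sketch (assumption: dense layers stand in for the convolutional
# layers used in real residual networks; all names here are hypothetical).

def relu(v):
    """Element-wise ReLU on a list of floats."""
    return [max(a, 0.0) for a in v]

def linear(v, w):
    """Multiply row vector v by matrix w (given as a list of rows)."""
    return [sum(v[i] * w[i][j] for i in range(len(v)))
            for j in range(len(w[0]))]

def plain_block(x, w1, w2):
    """Two stacked layers that must learn the whole mapping directly."""
    return relu(linear(relu(linear(x, w1)), w2))

def residual_block(x, w1, w2):
    """The same two layers learn a residual F(x); the shortcut adds x back."""
    f = linear(relu(linear(x, w1)), w2)               # residual function F(x)
    return relu([fi + xi for fi, xi in zip(f, x)])    # F(x) + x, then ReLU

x = [0.5, 1.0, 2.0]                       # non-negative input, so ReLU keeps it
zeros = [[0.0] * 3 for _ in range(3)]     # weights driven to zero

plain_out = plain_block(x, zeros, zeros)
residual_out = residual_block(x, zeros, zeros)

print(plain_out)      # [0.0, 0.0, 0.0]  the input is lost
print(residual_out)   # [0.5, 1.0, 2.0]  the identity mapping is recovered
```

With zero weights the residual block reduces to the identity, which is one way to see why extra residual layers need not raise training error: the optimizer only has to push the residual toward zero rather than fit an identity mapping through nonlinear layers.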

We provide comprehensive empirical evidence on several benchmark datasets. On the ImageNet 2012 classification dataset, our residual networks (ResNets) are substantially deeper than previous models, with up to 152 layers, yet have lower complexity than shallower networks such as VGG. These ResNets readily overcome the degradation problem: accuracy improves consistently as depth increases, and an ensemble achieves a 3.57% top-5 error on the ImageNet test set. We also conducted experiments on the CIFAR-10 dataset and successfully trained networks with more than 1,000 layers, which indicates that the framework resolves the core optimization issues. Furthermore, the representations learned by deep residual networks generalize well to other computer vision tasks. When used as a backbone for object detection on PASCAL VOC and MS COCO, our models achieve state-of-the-art results, including a 28% relative improvement on the COCO detection metric. These findings establish deep residual learning as a fundamental and effective technique for training extremely deep neural networks for image recognition.


Keywords

Deep Learning, Residual Learning, Image Recognition, Convolutional Neural Networks, Degradation Problem


How to Cite

A Residual Learning Framework for Advanced Visual Recognition. (2025). European Journal of Emerging Artificial Intelligence, 2(01), 11-20. https://doi.org/10.64917/
