Deep Residual Learning for Image Recognition

Kaiming He    Xiangyu Zhang    Shaoqing Ren    Jian Sun

Microsoft Research

{kahe, v-xiangz, v-shren, jiansun}@microsoft.com
arXiv:1512.03385v1 [cs.CV] 10 Dec 2015

Abstract
Deeper neural networks are more difficult to train. We present a residual learning framework to ease the training of networks that are substantially deeper than those used previously. We explicitly reformulate the layers as learning residual functions with reference to the layer inputs, instead of learning unreferenced functions. We provide comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth. On the ImageNet dataset we evaluate residual nets with a depth of up to 152 layers—8

[Figure 1: Training error (left) and test error (right) on CIFAR-10 with 20-layer and 56-layer “plain” networks. The deeper network has higher training error, and thus test error. Similar phenomena on ImageNet are presented in Fig. 4. Both panels plot error (%) against iter. (1e4).]
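To make the reformulation concrete: if H(x) denotes a desired underlying mapping, the residual framework has the stacked layers fit F(x) = H(x) - x and adds the input back, so a block outputs F(x) + x. The sketch below is an illustrative PyTorch rendering of such a block, not code from the paper; the two-convolution residual branch and the batch-normalization placement are assumptions chosen for the example.

import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Illustrative residual block: output = F(x) + x.

    F(x) is the learned residual function (assumed here to be two 3x3
    convolutions with batch normalization); x is carried by an identity
    shortcut, so the stacked layers only need to learn the residual.
    """

    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))  # F(x): the learned residual
        return self.relu(out + x)        # F(x) + x: identity shortcut added back

# Usage: deeper networks are built by composing many such blocks.
block = ResidualBlock(channels=64)
y = block(torch.randn(1, 64, 32, 32))  # shape preserved: (1, 64, 32, 32)

Because the shortcut is an identity, a block can recover the identity mapping simply by driving F(x) toward zero, which is one intuition for why these networks remain easy to optimize as depth grows.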