As you go deeper into the network, the height and width of the activations tend to go down while the number of channels tends to go up
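This trend can be checked with plain shape arithmetic. A minimal sketch, assuming a LeNet-5-style stack (5x5 valid convs with 6 and 16 filters, 2x2 stride-2 pooling) on a 32x32x1 input — the layer sizes here are illustrative assumptions, not part of the notes:

```python
def conv_out(size, kernel, stride=1, pad=0):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * pad - kernel) // stride + 1

def trace(shape, layers):
    """Apply (kind, kernel, stride, out_channels) specs to an (H, W, C) shape."""
    h, w, c = shape
    history = [(h, w, c)]
    for kind, k, s, out_c in layers:
        h, w = conv_out(h, k, s), conv_out(w, k, s)
        c = out_c if kind == "conv" else c  # pooling keeps the channel count
        history.append((h, w, c))
    return history

layers = [
    ("conv", 5, 1, 6),     # 5x5 conv, 6 filters
    ("pool", 2, 2, None),  # 2x2 pool, stride 2
    ("conv", 5, 1, 16),    # 5x5 conv, 16 filters
    ("pool", 2, 2, None),
]
for h, w, c in trace((32, 32, 1), layers):
    print(f"{h:>2} x {w:>2} x {c}")
```

The printed shapes go 32x32x1 → 28x28x6 → 14x14x6 → 10x10x16 → 5x5x16: height and width shrink at every layer while channels grow.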
LeNet-5: Conv ⇒ Avg Pool ⇒ Conv ⇒ Avg Pool ⇒ FC ⇒ FC ⇒ Output
60k parameters
Uses sigmoid/tanh activation functions
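The ~60k figure can be reproduced by summing weights and biases per layer. A sketch assuming the standard LeNet-5 shapes (5x5 convs with 6 and 16 filters on a 32x32x1 input, then FC layers 400 → 120 → 84 → 10); these specific sizes are assumptions drawn from the classic paper, not stated in the notes:

```python
def conv_params(k, c_in, c_out):
    """Weights (k*k*c_in per filter) plus one bias per filter."""
    return k * k * c_in * c_out + c_out

def fc_params(n_in, n_out):
    """Weight matrix plus biases for a fully connected layer."""
    return n_in * n_out + n_out

total = (
    conv_params(5, 1, 6)          # conv1
    + conv_params(5, 6, 16)       # conv2
    + fc_params(5 * 5 * 16, 120)  # flatten 5x5x16 = 400, then fc1
    + fc_params(120, 84)          # fc2
    + fc_params(84, 10)           # output
)
print(total)  # 61706, i.e. roughly 60k
```

Note the fully connected layers dominate even here: fc1 alone contributes about 48k of the ~62k total.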
AlexNet: 11x11 Conv ⇒ Max Pool ⇒ 5x5 Same Conv ⇒ Max Pool ⇒ 3x3 Same Conv ⇒ 3x3 Same Conv ⇒ 3x3 Same Conv ⇒ Max Pool ⇒ FC ⇒ FC ⇒ Softmax
60M parameters
Uses ReLU activation function
Uses Local Response Normalization (LRN), which later work found to help little and is rarely used today
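The ~60M count can be checked the same way. A sketch assuming the single-stream AlexNet layer shapes commonly quoted (filter counts 96/256/384/384/256 and FC layers 9216 → 4096 → 4096 → 1000); the paper's two-GPU split changes the exact figure slightly, so treat this as an approximation:

```python
def conv_params(k, c_in, c_out):
    """Weights plus one bias per filter."""
    return k * k * c_in * c_out + c_out

def fc_params(n_in, n_out):
    """Weight matrix plus biases for a fully connected layer."""
    return n_in * n_out + n_out

total = (
    conv_params(11, 3, 96)        # conv1: 11x11, 96 filters
    + conv_params(5, 96, 256)     # conv2: 5x5 same
    + conv_params(3, 256, 384)    # conv3: 3x3 same
    + conv_params(3, 384, 384)    # conv4
    + conv_params(3, 384, 256)    # conv5
    + fc_params(6 * 6 * 256, 4096)  # flatten 6x6x256 = 9216
    + fc_params(4096, 4096)
    + fc_params(4096, 1000)         # softmax over 1000 classes
)
print(total)  # 62378344, i.e. roughly 60M
```

The two 4096-wide FC layers account for over 50M of the ~62M parameters, which is why later architectures pushed work into the conv layers instead.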
VGG-16: 138M parameters
The architecture is quite uniform and simple: 3x3 same convolutions with stride 1 and 2x2 max pooling with stride 2 throughout
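That uniformity makes the 138M count easy to script: every conv is 3x3 same, every pool halves height and width. A sketch assuming the standard VGG-16 channel progression (64 → 128 → 256 → 512 over 13 conv layers, then FC 25088 → 4096 → 4096 → 1000 on a 224x224x3 input) — these sizes come from the original paper, not from the notes:

```python
def conv_params(c_in, c_out, k=3):
    """3x3 conv weights plus one bias per filter."""
    return k * k * c_in * c_out + c_out

def fc_params(n_in, n_out):
    """Weight matrix plus biases for a fully connected layer."""
    return n_in * n_out + n_out

# (in_channels, out_channels) for the 13 conv layers, block by block
convs = [(3, 64), (64, 64),                       # block 1
         (64, 128), (128, 128),                   # block 2
         (128, 256), (256, 256), (256, 256),      # block 3
         (256, 512), (512, 512), (512, 512),      # block 4
         (512, 512), (512, 512), (512, 512)]      # block 5

total = sum(conv_params(i, o) for i, o in convs)
total += fc_params(7 * 7 * 512, 4096)  # 224 halved five times -> 7x7x512
total += fc_params(4096, 4096)
total += fc_params(4096, 1000)
print(total)  # 138357544, i.e. roughly 138M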