Deep Learning Study - CS231n (Lecture 5)


One last review before studying! A miracle morning with my friend Sanghyun, in a place with no temperature control.
The sunlight is bright, our skin is pale, and we have dark circles under our eyes......
The Miracle morning has begun, all for the sake of staying healthy. Oh dear, oh dear...

Lecture 5 - Convolutional Neural Networks


(1) History of Neural Networks


1. Perceptron (1957)


2. Adaline/Madaline (1960)


3. Backpropagation (1986)


4. Reinvigorated research in Deep Learning (2006)


5. First Strong Results of NNs (2012)

  • Speech Recognition, Image Recognition, ... - these introduced the first strong convolutional NNs and dramatically reduced errors & loss
(2) History of Image Processing


    1. Hubel & Wiesel


    Topographical mapping in the cortex : nearby cells in the cortex represent nearby regions in the visual field
    Discovery of Hierarchical Organization : simple cells respond to oriented edges, and complex cells higher up (which pool over simple cells) respond to orientation and movement

    2. Neocognitron - Fukushima (1980)


    first example of a network architecture/model that used the idea of simple and complex cells (Hubel & Wiesel).
    Alternating layers of simple and complex cells - simple cells (with modifiable parameters) and complex cells on top (pooling, making the response invariant to minor variations from the simple cells)

    3. Gradient-based learning applied to Document Recognition - LeCun et al., (1998)


    applied backpropagation and gradient-based learning to train convolutional NNs, which did well at recognizing documents, especially zip codes (it was actually used by postal services!)
    However, it could not yet handle more complex data.

    4. "Alexnet" ImageNet Classification with Deep Concolutional NNs (2012)


    5. Today

  • ConvNets are used everywhere! Detecting images, and segmentation
    (labeling and outlining every pixel)
  • ex) face recognition, video classification, pose recognition (joints, ...), street sign recognition, aerial maps (segmenting streets, buildings), image captioning (writing a sentence description of an image), artwork created by NNs

    (3) Convolutional Layer


    1. Fully Connected Layer



    input - a 32x32x3 image stretched out into a 3072x1 vector
    weight matrix - 10x3072 weights
    activation/output layer - 1x10 (see the sketch below)
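
    A minimal NumPy sketch of this fully connected layer; the image, weights, and bias below are random placeholders, used only to show the shapes:

        import numpy as np

        # hypothetical 32x32x3 input image (CIFAR-10 sized); random values as a stand-in
        x_image = np.random.rand(32, 32, 3)

        # stretch the image out into a 3072x1 column vector (32*32*3 = 3072)
        x = x_image.reshape(3072, 1)

        # 10x3072 weight matrix and one bias per output neuron (random placeholders)
        W = np.random.randn(10, 3072)
        b = np.random.randn(10, 1)

        # fully connected layer: each of the 10 outputs is a dot product with the whole image
        activation = W.dot(x) + b
        print(activation.shape)   # (10, 1) -> the 10 output activations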

    2. Convolutional Layer


    1. instead of stretching the image out into one long vector, we keep the spatial dimensions.


    Which one is better? Isn't the difference just structural?
  • more efficient
  • input - dimensions unchanged, a 32x32x3 image (is the 3 for RGB?)
  • filter - filters always extend the full depth of the input volume, e.g. 5x5x3
  • 2. Dot product of this filter and a chunk of the image



  • first - overlay the filter on top of a spatial location of the image

  • second - do the dot product: multiply each element of the filter with the corresponding element at that spatial location of the image

  • number of multiplications : 5 x 5 x 3

  • (W transpose * X) + bias, i.e. w^T x + b (see the small NumPy sketch after the questions below)
    		Questions
    1. When we do the dot product, do we turn the 5x5x3 filter into a vector?
    Yes, you can think of it as plopping the filter on and doing the element-wise multiplication at each location, but stretching out the filter and stretching out the chunk of the input volume gives you the same result.

      2. Any intuition for why this is a W transpose?
      No deep intuition. It is just notation that makes the math work out as a dot product between 1D vectors.

      3. Is W not 5x5x3 but 75x1?
      Yes, the filter is stretched out before the dot-product multiplication.
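
    A small NumPy sketch of one such dot product, with a made-up 5x5x3 filter, bias, and image chunk; it checks that the element-wise multiply-and-sum equals the stretched-out w^T x + b form discussed above:

        import numpy as np

        # hypothetical 5x5x3 filter, bias, and 5x5x3 chunk of the image (random placeholders)
        w_filter = np.random.randn(5, 5, 3)
        chunk = np.random.rand(5, 5, 3)
        bias = 0.1  # made-up bias value

        # option 1: element-wise multiply and sum over the 5*5*3 = 75 positions
        out1 = np.sum(w_filter * chunk) + bias

        # option 2: stretch both into 75-dim vectors and take a dot product (w^T x + b)
        w = w_filter.reshape(-1)   # shape (75,)
        x = chunk.reshape(-1)      # shape (75,)
        out2 = w.dot(x) + bias

        print(np.isclose(out1, out2))  # True - both give the same single number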
      
  • 3. Overlaying the filter on top of the image - convolve (slide) it over all spatial locations


  • start from the upper-left corner and center the filter on top of every pixel in this input volume
  • each filter is looking for a certain type of template or concept in the input volume
  • Is this filter something like an activation function? Or otherwise just a linear layer? It is not a layer by itself - one layer is made up of multiple filters.
    Then is there a filter per class to be classified? ex) a filter for the probability the image is one class, a filter for the probability it is another class, ...
    Is the activation map 28x28x1 because the filter only slides where it fits inside the image, so 32 - 5 + 1 = 28? Is that right? Is that okay? Every element has a weight, but are some elements covered fewer times? Or is my understanding wrong... haha
    Do this for multiple filters! (see the sketch after this list)
  • number of activation maps = number of filters !
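
    A naive (slow but readable) sketch of this sliding, assuming stride 1 and no padding; with a 32x32x3 input and 5x5x3 filters each activation map comes out 28x28, and stacking 6 filters gives 28x28x6 (all values are random placeholders, and convolve_naive is just my own helper name):

        import numpy as np

        def convolve_naive(image, filters, biases):
            """Slide each filter over every spatial location (stride 1, no padding)."""
            H, W, _ = image.shape
            num_filters, F, _, _ = filters.shape
            out_h, out_w = H - F + 1, W - F + 1           # e.g. 32 - 5 + 1 = 28
            maps = np.zeros((out_h, out_w, num_filters))  # one activation map per filter
            for k in range(num_filters):
                for i in range(out_h):
                    for j in range(out_w):
                        chunk = image[i:i+F, j:j+F, :]
                        maps[i, j, k] = np.sum(filters[k] * chunk) + biases[k]
            return maps

        image = np.random.rand(32, 32, 3)      # placeholder input volume
        filters = np.random.randn(6, 5, 5, 3)  # six 5x5x3 filters
        biases = np.random.randn(6)
        activation_maps = convolve_naive(image, filters, biases)
        print(activation_maps.shape)           # (28, 28, 6): number of maps = number of filters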
  • 4. Preview of Convnet



    Example of Activation Maps


  • top : a row of 5x5 filters
  • the filter in the red box is an oriented-edge template. Sliding it over an image gives white (high) values at the locations of edges (where this template is more strongly present in the image)
  • the filter slides over the image and computes a dot product at each location
  • Example of car image processing



    3. Spatial Dimensions - A Closer Look at Sliding


    7x7 image with 3x3 filter



    stride n - maybe easiest to think of as sliding the filter n cells at a time
    stride 1 : 5x5 output size
    stride 2 : 3x3 output size
    stride 3 : doesn't fit! asymmetrical outputs - not all designs are possible

    Output Sizes
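
    The content under this heading is presumably the standard output-size formula, output = (N - F) / stride + 1 (plus 2 x padding inside the parentheses when padding is used); a tiny helper (my own naming) that reproduces the stride examples above:

        def conv_output_size(N, F, stride, pad=0):
            """(N - F + 2*pad) / stride + 1; None if the filter doesn't fit evenly."""
            size = (N - F + 2 * pad) / stride + 1
            return int(size) if size == int(size) else None

        print(conv_output_size(7, 3, 1))  # 5  -> 5x5 output
        print(conv_output_size(7, 3, 2))  # 3  -> 3x3 output
        print(conv_output_size(7, 3, 3))  # None -> doesn't fit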



    Zero Padding


    if you pad your pixels - draw a border around the image to adjust the output! (I was curious about this earlier!)
    Maintains the output size and lets the filter be applied at the edges

    After padding it becomes 9x9,
    and applying a 3x3 filter to it outputs 7x7.
    	Question
        1. What's the actual number of outputs?
        In this case, 7x7x(number of filters)

        (wait, where did the depth go..)

        2. How does this connect to an input with depth?
        This example was drawn in 2D to keep it easy to see; with depth you just multiply in the depth as well.

        3. Do other ways than zero-padding exist?

        4. What about non-square images?
        
  • Shrinking the size quickly means we are losing information. That's not good :( (it's a bit like lowering the image quality).
    Padding is what solves this - it acts like a protective border that keeps the image from shrinking (there's a small np.pad check after these bullets).
  • padding size? whatever fits your model / image size & filter size
  • with padding - the filter can even be bigger than the image
  • Can the activation map be bigger than the image? Yes, that works - think of it like a puzzle!!!
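
    A quick NumPy check of the 7x7 example from above: padding a border of one zero pixel makes it 9x9, so a 3x3 filter at stride 1 keeps the output at 7x7 (the image values are placeholders):

        import numpy as np

        image = np.random.rand(7, 7)          # 2D example, as in the notes
        padded = np.pad(image, pad_width=1)   # zero-pad 1 pixel on each side -> 9x9
        print(padded.shape)                   # (9, 9)

        F, stride = 3, 1
        out = (padded.shape[0] - F) // stride + 1   # (9 - 3)/1 + 1 = 7
        print(out)                                  # 7 -> the 7x7 output size is maintained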
  • Calculating Output Volume Size



    Output size : 32x32 (spatial size maintained by zero-padding) x 10 (filters)
    Number of Parameters : ((5x5x3) + 1) x 10 = 760
    (5x5x3 weights + 1 bias) x 10 filters
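
    The same parameter count spelled out in code (just the arithmetic from the numbers above):

        filter_h, filter_w, in_depth = 5, 5, 3
        num_filters = 10

        params_per_filter = filter_h * filter_w * in_depth + 1   # 75 weights + 1 bias = 76
        total_params = params_per_filter * num_filters           # 76 * 10
        print(total_params)                                      # 760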

    1 x 1 Convolution Layers? YES!



    Does a 1x1 convolution still mean extracting features from multiple pixels? In 2D alone it would be meaningless.
    But it is meaningful in 3D, because of the depth!
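
    A sketch of why a 1x1 convolution is still meaningful: at every spatial position it takes a dot product across the full depth, mixing channels without looking at neighboring pixels. The 56x56x64 input and the 32 filters below are made-up placeholders:

        import numpy as np

        x = np.random.rand(56, 56, 64)     # placeholder input volume
        filters = np.random.randn(32, 64)  # 32 filters, each 1x1x64 (just a 64-vector)

        # at every spatial location, each 1x1 filter dots the 64-dim depth column
        out = np.einsum('hwc,kc->hwk', x, filters)
        print(out.shape)                   # (56, 56, 32): spatial size unchanged, depth 64 -> 32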

    (4) ConvNet


    Pooling = downsampling (e.g. max pooling)
  • using only a pooling filter, the output becomes smaller but still effectively contains the meaningful information.
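
    A minimal max-pooling sketch (2x2 window, stride 2) on a small example activation map, showing how the output shrinks while keeping the strongest responses (the values are my own example):

        import numpy as np

        amap = np.array([[1, 1, 2, 4],
                         [5, 6, 7, 8],
                         [3, 2, 1, 0],
                         [1, 2, 3, 4]], dtype=float)   # example activation map

        # 2x2 max pooling with stride 2: max of each non-overlapping 2x2 block
        pooled = amap.reshape(2, 2, 2, 2).max(axis=(1, 3))
        print(pooled)
        # [[6. 8.]
        #  [3. 4.]]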
    Advantage of convolutional NNs - they go from 2D to 2D, keeping the spatial structure, instead of flattening everything into a 1D context
    Things to think about before Lecture 6:
    Why do we use activation functions?