Lecture 8: Introduction to Neural Networks

TRANSCRIPT

Page 1:

Lecture 8: Introduction to Neural Networks

Alireza Akhavan Pour

Wednesday, 16 Esfand 1396

CLASS.VISION

Page 2:

○ Neurons and Activation Functions

○ Cost Functions

○ Gradient Descent

○ Backpropagation

Page 3:

Introduction to the Perceptron

Page 4:

● Artificial Neural Networks (ANN) actually have a basis in biology!

● Let’s see how we can attempt to mimic biological neurons with an artificial neuron, known as a perceptron!

Page 5:

● The biological neuron:

[Diagram: biological neuron with dendrites, cell body, and axon]

Page 6:

● The artificial neuron also has inputs and outputs!

[Diagram: artificial neuron with Input 0, Input 1, and Output]

Page 7:

● This simple model is known as a perceptron.

Page 8:

● Simple example of how it can work.

Page 9:

● We have two inputs and an output

Page 10:

● Inputs will be values of features

[Diagram: the two inputs take the values 12 and 4]

Page 11:

● Inputs are multiplied by a weight

Page 12:

● Weights initially start off as random

Page 13:

● Weights initially start off as random

Page 14:

● Inputs are now multiplied by weights

Page 15:

● Inputs are now multiplied by weights

Page 16:

● Then these results are passed to an activation function.

[Diagram: the weighted inputs feed an Activation Function, which produces the Output]

Page 17:

● There are many activation functions to choose from; we’ll cover these in more detail later!

Page 18:

● For now our activation function will be very simple...

Page 19:

● If the weighted sum of the inputs is positive, output 1; if it is negative, output 0.

Page 20:

● In this case 6 − 4 = 2, so the activation function returns 1.
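
A minimal sketch of this forward pass in Python; the weights are not preserved in the transcript, so the values 0.5 and -1.0 below are assumptions chosen only to reproduce the 6 − 4 = 2 example:

    # Hypothetical weights (not from the slides) so that 12 * 0.5 = 6 and 4 * (-1.0) = -4
    inputs = [12, 4]
    weights = [0.5, -1.0]

    # Weighted sum of the inputs
    z = sum(x * w for x, w in zip(inputs, weights))   # 6 - 4 = 2

    # Step activation: 1 if the sum is positive, 0 otherwise
    output = 1 if z > 0 else 0
    print(z, output)   # 2 1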

Page 21:

● There is a possible issue. What if the original inputs started off as zero?

Page 22:

● Then any weight multiplied by the input would still result in zero!

Page 23:

● We fix this by adding in a bias term; in this case we choose 1.

[Diagram: a Bias input is added alongside Input 0 and Input 1]

Page 24:

● So what does this look like mathematically?

Page 25:

● Let’s quickly think about how we can represent this perceptron model mathematically:
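
The formula on the original slide did not survive extraction; a standard way to write it, consistent with the w*x + b = z notation used later in the deck, is:

    z = w0·x0 + w1·x1 + b        (in general: z = Σ wᵢ·xᵢ + b)
    output = 1 if z > 0, else 0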

Page 26:

● Once we have many perceptrons in a network we’ll see how we can easily extend this to a matrix form!

Page 27:

● Review

○ Biological Neuron

○ Perceptron Model

○ Mathematical Representation

Page 28:

Introduction to Neural Networks

Page 29:

● We’ve seen how a single perceptron behaves; now let’s expand this concept to the idea of a neural network!

● Let’s see how to connect many perceptrons together and then how to represent this mathematically!

Page 30:

● Multiple Perceptrons Network

Page 31:

● Input Layer. 2 hidden layers. Output Layer

Page 32:

● Input Layer

○ Real values from the data

● Hidden Layers

○ Layers in between the input and output

○ 3 or more layers is a “deep network” (see the sketch after this list)

● Output Layer

○ Final estimate of the output
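
The slides contain no code; this is a minimal sketch of such a network in tf.keras, assuming a toy input of 25 features and 10 output classes (the layer sizes are illustrative, not from the lecture):

    import tensorflow as tf

    # Input layer -> 2 hidden layers -> output layer
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(25,)),              # real values from the data
        tf.keras.layers.Dense(32, activation="relu"),    # hidden layer 1
        tf.keras.layers.Dense(32, activation="relu"),    # hidden layer 2
        tf.keras.layers.Dense(10, activation="softmax"), # final estimate of the output
    ])
    model.summary()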

Page 33:

● As you go forward through more layers, the level of abstraction increases.

● Let’s now discuss the activation function in a little more detail!

Page 34:

● Previously our activation function was just a simple function that outputs 0 or 1.

[Graph: step-function output (0 or 1) plotted against z = wx + b]

Page 35:

● This is a pretty dramatic function, since small changes aren’t reflected.

Page 36:

● It would be nice if we could have a more dynamic function, for example the red line!

[Graph: a smooth red curve overlaid on the step function]

Page 37:

● Lucky for us, this is the sigmoid function!
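
A minimal NumPy sketch of the sigmoid, σ(z) = 1 / (1 + e^(−z)), which squashes z = wx + b into the range (0, 1):

    import numpy as np

    def sigmoid(z):
        # Smoothly maps any real z into (0, 1)
        return 1.0 / (1.0 + np.exp(-z))

    print(sigmoid(np.array([-5.0, 0.0, 5.0])))   # approx. [0.007, 0.5, 0.993]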

Page 38:

● Changing the activation function used can be beneficial depending on the task!

Page 39:

● Let’s discuss a few more activation functions that we’ll encounter!

Page 40:

● Hyperbolic Tangent: tanh(z)

[Graph: tanh output, ranging from -1 to 1, plotted against z = wx + b]

Page 41:

● Rectified Linear Unit (ReLU): this is actually a relatively simple function: max(0, z)

[Graph: ReLU output plotted against z = wx + b]
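
A minimal NumPy sketch of these two activations (the sample inputs are arbitrary):

    import numpy as np

    def tanh(z):
        # Hyperbolic tangent: squashes z into (-1, 1)
        return np.tanh(z)

    def relu(z):
        # Rectified Linear Unit: max(0, z), applied elementwise
        return np.maximum(0, z)

    z = np.array([-2.0, 0.0, 3.0])
    print(tanh(z))   # approx. [-0.964, 0.0, 0.995]
    print(relu(z))   # [0.0, 0.0, 3.0]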

Page 42:

● ReLU and tanh tend to have the best performance, so we will focus on these two.

● Deep Learning libraries have these built in for us, so we don’t need to worry about having to implement them manually!

Page 43:

Cost Functions

Page 44:

● Let’s now explore how we can evaluate the performance of a neuron!

● We can use a cost function to measure how far off we are from the expected value.

Page 45:

● We’ll use the following variables:

○ y to represent the true value

○ a to represent the neuron’s prediction

● In terms of weights and bias:

○ w*x + b = z

○ Pass z into the activation function: σ(z) = a

Page 46:

● Quadratic Cost

○ C = Σ(y − a)² / n (see the sketch below)

● We can see that larger errors are more prominent due to the squaring.

● Unfortunately this calculation can cause a slowdown in our learning speed.
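
A minimal sketch of the quadratic cost; the y and a values are made-up examples:

    import numpy as np

    def quadratic_cost(y, a):
        # C = sum((y - a)^2) / n
        return np.sum((y - a) ** 2) / len(y)

    y = np.array([1.0, 0.0, 1.0])   # true values (assumed)
    a = np.array([0.9, 0.2, 0.6])   # neuron predictions (assumed)
    print(quadratic_cost(y, a))     # 0.07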

Page 47:

● Cross Entropy

○ C = (−1/n) Σ (y⋅ln(a) + (1−y)⋅ln(1−a)) (see the sketch below)

● This cost function allows for faster learning.

● The larger the difference, the faster the neuron can learn.
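
A minimal sketch of the cross-entropy cost for the same made-up example values:

    import numpy as np

    def cross_entropy(y, a):
        # C = (-1/n) * sum(y*ln(a) + (1 - y)*ln(1 - a))
        return -np.mean(y * np.log(a) + (1 - y) * np.log(1 - a))

    y = np.array([1.0, 0.0, 1.0])   # true labels (assumed)
    a = np.array([0.9, 0.2, 0.6])   # predictions in (0, 1) (assumed)
    print(cross_entropy(y, a))      # approx. 0.28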

Page 48:

● We now have 2 key aspects of learning with neural networks: the neurons with their activation function, and the cost function.

● We’re still missing a key step, actually “learning”!

Page 49:

● We need to figure out how we can use our neurons and the measurement of error (our cost function) and then attempt to correct our prediction, in other words, “learn”!

Page 50:

Gradient Descent and Backpropagation

Page 51:

● Gradient descent is an optimization algorithm for finding the minimum of a function.

● To find a local minimum, we take steps proportional to the negative of the gradient.
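
A minimal sketch of gradient descent in one dimension; the cost C(w) = (w − 3)², its derivative, and the learning rate are assumptions chosen for illustration:

    def grad(w):
        # Derivative of the assumed cost C(w) = (w - 3)**2
        return 2 * (w - 3)

    w = 0.0              # arbitrary starting point
    learning_rate = 0.1
    for _ in range(50):
        w -= learning_rate * grad(w)   # step proportional to the negative gradient
    print(w)   # close to 3.0, the minimum of the cost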

Page 52:

● Gradient Descent (in 1 dimension)

[Graph: cost C plotted against weight w]

Page 53:

● Gradient Descent (in 1 dimension)

Page 54:

● Gradient Descent (in 1 dimension)

Page 55:

● Gradient Descent (in 1 dimension)

Page 56:

● Visually we can see what parameter value to choose to minimize our Cost!

Page 57:

● Finding this minimum is simple for 1 dimension, but our cases will have many more parameters, meaning we’ll need to use the built-in linear algebra that our Deep Learning library will provide!

Page 58:

● Using gradient descent we can figure out the best parameters for minimizing our cost, for example, finding the best values for the weights of the neuron inputs.
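
A minimal sketch of that idea for a single linear neuron with the quadratic cost; the toy data and learning rate are assumptions:

    import numpy as np

    # Toy data (assumed): targets generated from weights [2, -1] and bias 0.5
    X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 3.0], [0.0, 1.0]])
    y = X @ np.array([2.0, -1.0]) + 0.5

    w = np.zeros(2)   # weights of the neuron inputs
    b = 0.0
    lr = 0.05
    for _ in range(2000):
        a = X @ w + b                          # linear neuron's predictions
        grad_w = 2 * X.T @ (a - y) / len(y)    # gradient of the quadratic cost w.r.t. w
        grad_b = 2 * np.mean(a - y)            # gradient w.r.t. b
        w -= lr * grad_w
        b -= lr * grad_b
    print(w, b)   # approaches [2, -1] and 0.5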

Page 59:

● We now just have one issue to solve: how can we quickly adjust the optimal parameters or weights across our entire network?

● This is where backpropagation comes in!

Page 60:

● Backpropagation is used to calculate the error contribution of each neuron after a batch of data is processed.

● It relies heavily on the chain rule to go back through the network and calculate these errors.

Page 61:

● Backpropagation works by calculating the error at the output and then distributing it back through the network layers.

● It requires a known desired output for each input value (supervised learning).
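
A minimal sketch of the chain rule for a single sigmoid neuron with the quadratic cost, using one made-up training example; the three factors dC/da, da/dz, and dz/dw multiply to give dC/dw:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    # One training example and current parameters (assumed values)
    x, y = 1.5, 1.0
    w, b = 0.2, 0.0

    # Forward pass
    z = w * x + b
    a = sigmoid(z)
    C = (y - a) ** 2          # quadratic cost for this single example

    # Backward pass: chain rule
    dC_da = -2 * (y - a)
    da_dz = a * (1 - a)       # derivative of the sigmoid
    dz_dw = x
    dC_dw = dC_da * da_dz * dz_dw
    dC_db = dC_da * da_dz     # dz/db = 1
    print(dC_dw, dC_db)       # gradients used to update w and b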

Page 62:

TensorFlow Playground

Page 63:

● Go to:

○ playground.tensorflow.org

Page 64:

Page 65:

Y = Softmax(X·W + b) (see the sketch below)

○ Y [n, 10]: predictions

○ X [n, 25]: images

○ W [25, 10]: weights

○ b [10]: bias

○ X·W is a matrix multiplication

https://fa.wikipedia.org/wiki/هموار_بیشینه (Persian Wikipedia article on softmax)
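
A minimal NumPy sketch of this layer using the shapes on the slide ([n, 25] inputs, [25, 10] weights); the random values are only for illustration:

    import numpy as np

    def softmax(z):
        # Subtract the row-wise max for numerical stability, then normalize
        e = np.exp(z - z.max(axis=1, keepdims=True))
        return e / e.sum(axis=1, keepdims=True)

    n = 4                               # batch size (assumed)
    X = np.random.rand(n, 25)           # images, flattened to 25 values each
    W = np.random.randn(25, 10) * 0.01  # weights
    b = np.zeros(10)                    # bias

    Y = softmax(X @ W + b)              # predictions, shape [n, 10]
    print(Y.shape, Y.sum(axis=1))       # each row sums to 1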

Page 66:

Cross entropy: C = −Σᵢ Y′ᵢ ⋅ log(Yᵢ)

○ Y′ᵢ: actual probabilities, “one-hot” encoded

○ Yᵢ: computed probabilities
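
A minimal sketch of this cost with one-hot labels, keeping the slide's notation (Y′ = actual, Y = computed); the example values are assumptions, and the result is averaged over the batch:

    import numpy as np

    def cross_entropy(Y_true, Y_pred):
        # -sum_i Y'_i * log(Y_i), averaged over the batch
        return -np.mean(np.sum(Y_true * np.log(Y_pred), axis=1))

    Y_true = np.array([[0.0, 0.0, 1.0], [1.0, 0.0, 0.0]])    # one-hot labels
    Y_pred = np.array([[0.1, 0.2, 0.7], [0.8, 0.1, 0.1]])    # softmax outputs
    print(cross_entropy(Y_true, Y_pred))   # approx. 0.29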

Page 67:

References

https://docs.google.com/presentation/d/1GN2P8Kztjp_nSoNquaNRIEkYERJ_FmZ80ZP8T50bDKc/edit

https://www.slideshare.net/Alirezaakhavanpour/tensorflow-71395844
