Multi-layer perceptron¶

In a nutshell¶

Hopfield vs MLP¶

Hopfield
- Unsupervised
  - Information retrieval
- Learning and retrieval
MLP
- Supervised
  - Classification
  - Regression
- Training and test

The neural units¶

The activation function¶

https://deepai.org/machine-learning-glossary-and-terms/sigmoid-function
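
As a minimal sketch, assuming the logistic sigmoid is the activation meant here (it is the one used in the XOR code later in this notebook), the function and its derivative could be written as follows. The names `sigmoid` and `sigmoid_derivative` are illustrative and do not come from the original notebook.

```python
import numpy as np

def sigmoid(x):
    """Logistic sigmoid: squashes any real input into the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_derivative(x):
    """Derivative of the sigmoid, expressed through the sigmoid's own output."""
    s = sigmoid(x)
    return s * (1.0 - s)

print(sigmoid(0.0))             # 0.5
print(sigmoid_derivative(0.0))  # 0.25
```

The derivative takes the form `s * (1 - s)`, which is why the factors `r_o*(1-r_o)` and `r_h*(1-r_h)` appear in the backpropagation code for the XOR problem.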

Popular activation functions¶
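
The figure for this slide did not survive, so which functions it listed is unknown. The selection below (sigmoid, tanh, ReLU) is an assumption based on what is commonly used, sketched with NumPy only.

```python
import numpy as np

def sigmoid(x):
    """Output in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    """Output in (-1, 1), zero-centred."""
    return np.tanh(x)

def relu(x):
    """Zero for negative inputs, identity for positive inputs."""
    return np.maximum(0.0, x)

# Evaluate each function on a few sample inputs
x = np.array([-2.0, 0.0, 2.0])
for f in (sigmoid, tanh, relu):
    print(f.__name__, f(x))
```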

The network structure¶

https://becominghuman.ai/multi-layer-perceptron-mlp-models-on-real-world-banking-data-f6dd3d7e998f

Feed forward the information¶

Output of the first layer

Output of the nth layer
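
The equations for these two outputs are missing from this export. A plausible form, assuming an activation function $f$ (the sigmoid in the XOR code) and no bias terms, with $x$ the input vector and $W^{(n)}$ the weight matrix of layer $n$ (notation chosen here for illustration), is:

$r^{(1)} = f\left(W^{(1)} x\right)$

$r^{(n)} = f\left(W^{(n)} r^{(n-1)}\right)$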

Measure the error between the output and the target

$E_{total} = \sum \frac{1}{2}(\text{target} - \text{output})^{2}$
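
The feed-forward pass and the error measure can be sketched in a few lines. This is an illustration under stated assumptions, not the notebook's own implementation: it assumes sigmoid units and no biases, and the names `feed_forward` and `total_error` as well as the 2-3-1 layout with all-zero weights are hypothetical, chosen so the result is easy to check by hand.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def feed_forward(x, weights):
    """Propagate input x through every layer and return each layer's output."""
    outputs = []
    r = x
    for w in weights:
        r = sigmoid(w @ r)  # output of this layer feeds the next one
        outputs.append(r)
    return outputs

def total_error(target, output):
    """E_total: sum over output units of 0.5 * (target - output)^2."""
    return np.sum(0.5 * (target - output) ** 2)

# Hypothetical 2-3-1 network with all-zero weights (every unit outputs 0.5)
weights = [np.zeros((3, 2)), np.zeros((1, 3))]
x = np.array([1.0, 0.0])
outputs = feed_forward(x, weights)
error = total_error(np.array([1.0]), outputs[-1])
print(error)  # 0.125, i.e. 0.5 * (1 - 0.5)^2
```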

The learning algorithm (backpropagation)¶

https://mattmazur.com/2015/03/17/a-step-by-step-backpropagation-example/
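
For a two-layer network with sigmoid units, the update rules follow from applying the chain rule to $E_{total}$. The derivation below is a sketch using the variable names of the XOR code ($r_i$ input, $r_h$ hidden output, $r_o$ network output, $r_d$ target, $w_h$ and $w_o$ weights). The symbols $\delta_o$, $\delta_h$ and the learning rate $\eta$ are notation assumed here, and $\odot$ denotes the element-wise product.

$\delta_o = r_o \odot (1 - r_o) \odot (r_d - r_o)$

$\delta_h = r_h \odot (1 - r_h) \odot (w_o^{T} \delta_o)$

$w_o \leftarrow w_o + \eta \, \delta_o \, r_h^{T}$

$w_h \leftarrow w_h + \eta \, \delta_h \, r_i^{T}$

The factor $r(1 - r)$ is the derivative of the sigmoid, and $(r_d - r_o)$ comes from differentiating $\frac{1}{2}(r_d - r_o)^{2}$ with respect to $r_o$, with the minus sign absorbed into the direction of the update.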

Backpropagation pseudocode¶
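
The pseudocode for this slide is missing from the export. As a stand-in, here is a minimal runnable sketch of one backpropagation step, assuming a two-layer sigmoid network without biases and the squared-error measure defined earlier. The function names `backprop_step` and `error`, the random seed, and the sample pattern are illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def backprop_step(w_h, w_o, x, target, learning_rate=0.7):
    """One gradient-descent update for a two-layer sigmoid network (no biases)."""
    # 1. Forward pass
    r_h = sigmoid(w_h @ x)
    r_o = sigmoid(w_o @ r_h)
    # 2. Backward pass: deltas of the output and hidden layers
    d_o = r_o * (1 - r_o) * (target - r_o)
    d_h = r_h * (1 - r_h) * (w_o.T @ d_o)
    # 3. Weight updates: outer product of each delta with its layer's input
    w_o_new = w_o + learning_rate * np.outer(d_o, r_h)
    w_h_new = w_h + learning_rate * np.outer(d_h, x)
    return w_h_new, w_o_new

def error(w_h, w_o, x, target):
    """Squared error of the network output for a single pattern."""
    r_o = sigmoid(w_o @ sigmoid(w_h @ x))
    return np.sum(0.5 * (target - r_o) ** 2)

# Try one step on a single pattern with small random weights
rng = np.random.default_rng(0)
w_h = rng.random((4, 2)) - 0.5
w_o = rng.random((1, 4)) - 0.5
x, target = np.array([1.0, 0.0]), np.array([1.0])

before = error(w_h, w_o, x, target)
w_h, w_o = backprop_step(w_h, w_o, x, target)
after = error(w_h, w_o, x, target)
print(before, after)  # the error on this pattern is expected to shrink
```

Training repeats this step over many randomly chosen patterns until the error over all patterns is small enough.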

The XOR problem¶

https://dev.to/jbahire/demystifying-the-xor-problem-1blk
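
XOR is the classic example of a problem that a single threshold unit cannot solve, because its two classes are not linearly separable. The sketch below illustrates this (it is not a proof) by searching a coarse grid of weights and biases for a single unit that reproduces AND and XOR. The grid range and the helper name `linearly_solvable` are assumptions made for this illustration.

```python
import itertools
import numpy as np

# The four binary input patterns and two target functions
X = np.array([[0, 0], [1, 0], [0, 1], [1, 1]])
targets = {"AND": np.array([0, 0, 0, 1]), "XOR": np.array([0, 1, 1, 0])}

def linearly_solvable(y):
    """True if some unit step(w1*x1 + w2*x2 + b) on the grid reproduces y."""
    grid = np.linspace(-2, 2, 9)  # -2, -1.5, ..., 2
    for w1, w2, b in itertools.product(grid, repeat=3):
        prediction = (X @ np.array([w1, w2]) + b > 0).astype(int)
        if np.array_equal(prediction, y):
            return True
    return False

for name, y in targets.items():
    print(name, linearly_solvable(y))
# AND True
# XOR False
```

Adding a hidden layer, as in the code that follows, lets the network combine several such linear boundaries.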

In [5]:

import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import mean_squared_error

# Number of neurons in the input, hidden, and output layers
N_i = 2; N_h = 4; N_o = 1

# XOR inputs (one pattern per column)
r_i = np.matrix('0 1 0 1; 0 0 1 1')

# XOR target outputs (one per pattern)
r_d = np.matrix('0 1 1 0')

r_i.T, r_d.T

Out[5]:

(matrix([[0, 0],
         [1, 0],
         [0, 1],
         [1, 1]]),
 matrix([[0],
         [1],
         [1],
         [0]]))

# Randomly initialize the weights in [-0.5, 0.5)
# Hidden layer (N_h x N_i)
w_h = np.random.rand(N_h, N_i) - 0.5
# Output layer (N_o x N_h)
w_o = np.random.rand(N_o, N_h) - 0.5
training_steps = 10000
learning_rate = 0.7
mse = []

for step in range(training_steps):
    # Select a training pattern at random
    i = np.random.randint(r_i.shape[1])
    # Feed-forward the input to the hidden layer
    r_h = 1 / (1 + np.exp(-w_h*r_i[:,i]))
    # Feed-forward the hidden activity to the output layer
    r_o = 1 / (1 + np.exp(-w_o*r_h))
    # Output delta: sigmoid derivative times the output error (element-wise)
    d_o = np.multiply(np.multiply(r_o, (1-r_o)), (r_d[:,i] - r_o))
    # Hidden delta: responsibility of the hidden layer for the error
    d_h = np.multiply(np.multiply(r_h, (1-r_h)), (w_o.T*d_o))
    # Update the weights
    w_o = w_o + learning_rate*(r_h*d_o.T).T
    w_h = w_h + learning_rate*(r_i[:,i]*d_h.T).T
    # Test all patterns
    r_o_test = 1 / (1 + np.exp(-w_o*(1/(1+np.exp(-w_h*r_i)))))
    # Convert to plain 1-D arrays; scikit-learn may not accept np.matrix input
    mse.append(mean_squared_error(np.asarray(r_d).ravel(),
                                  np.asarray(r_o_test).ravel()))

plt.plot(mse)
plt.xlabel('training step')
plt.ylabel('MSE over all four patterns')

Sources and resources¶