The Perceptron Model
These notes have not yet been finalized for Winter, 1999.
Readings
Required reading: Chapter 8.
History of neural networks
- 1940's: Neural network models
- 1943, McCulloch & Pitts publish their paper "A Logical Calculus of the
Ideas Immanent in Nervous Activity" on their model neuron as a binary
threshold logic device
- 1949, Hebb publishes his book "The Organization of Behavior" in which he
describes what is now known as "Hebbian cell assemblies"
- 1950's: Learning in neural networks
- the Hebbian learning rule
- 1951, Minsky publishes work on a reinforcement learning machine
- 1960's: The age of the Perceptron (a period of massive enthusiasm)
- 1962, Rosenblatt publishes Principles of Neurodynamics, in which he
describes the Perceptron learning procedure
- Many wild claims are made by Rosenblatt and others about the potential
of Perceptrons as all-powerful learning devices
- 1970's: Limitations of Perceptrons are realized (the dark ages)
- 1969: Minsky and Papert's book "Perceptrons" is published, in which it
is shown that Perceptrons are only capable of learning a very limited class
of functions.
- Minsky & Papert predict that there will be no fruitful or interesting
extensions of Perceptrons even if multi-layer learning procedures are
developed
- The flow of funding into neural networks temporarily ceases
- 1980's: The discovery of back-propagation (the Renaissance)
- Back-propagation and other learning procedures for multi-layer neural
networks are invented
- The power of neural networks begins to be realized
- The hype cranks up again ...
Perceptrons
- The Perceptron learning procedure: an example of a supervised,
error-correcting learning procedure
- Perceptron convergence theorem
- Linearly separable problems: can be separated by a hyperplane
- Problems that are not linearly separable:
- XOR (no single line can separate the inputs mapping to 1, (0,1) and (1,0),
from those mapping to 0, (0,0) and (1,1)), connectedness
- any recognition problem that is invariant under a class of linear
transformations, e.g. translation-invariant pattern recognition
- Matlab demo
- encoding the thresholds as "bias weights" on an input line with a
constant activation of 1
- classification networks with linear threshold units can be viewed as
fitting a hyperplane to the decision boundary between classes (see figure below)
- Matlab code for a perceptron learning model:
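(A minimal sketch in Matlab/Octave; the data set, learning rate, and variable
names are illustrative assumptions rather than the course's original listing.
It trains a single linear threshold unit, with the threshold encoded as a bias
weight on a constant input of 1.)

    % Perceptron learning for a single linear threshold unit.
    % The threshold is encoded as a bias weight on a constant input of 1.
    % Training data: each row of X is an input pattern (x1, x2, x3);
    % t holds the 0/1 target classes (a linearly separable example).
    X = [0 0 0;
         0 1 0;
         1 0 1;
         1 1 1];
    t = [0; 0; 1; 1];

    [nPatterns, nInputs] = size(X);
    Xb  = [X ones(nPatterns, 1)];     % append the constant bias input
    w   = zeros(nInputs + 1, 1);      % weights; the last entry is the bias weight
    eta = 0.25;                       % learning rate

    for epoch = 1:100
        nErrors = 0;
        for p = 1:nPatterns
            y   = Xb(p,:) * w > 0;            % output of the linear threshold unit
            err = t(p) - y;                   % error-correcting signal
            if err ~= 0
                w = w + eta * err * Xb(p,:)'; % Perceptron learning rule
                nErrors = nErrors + 1;
            end
        end
        if nErrors == 0    % every pattern classified correctly: converged
            break
        end
    end
    disp(w')               % learned weights (w1 w2 w3 bias)

On a linearly separable data set such as this one, the loop eventually makes a
complete pass with no errors and training stops, as guaranteed by the
Perceptron convergence theorem.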
This graph shows the decision surface learned by the simple Perceptron defined
in the Matlab code. The surface is defined by the equation
w1 x1 + w2 x2 + w3 x3 + bias = 0, where the w's are the weights and the x's are
the inputs. The planar surface indicates which input points (x1, x2, x3) give a
total input of zero to the unit, and it is orthogonal to the weight vector.
Points in the input space above this plane have a positive total input (the dot
product with the weight vector plus the bias) and therefore cause the unit to
become active. Points below the surface have a negative total input and
therefore do not activate the unit. Thus, this plane defines the classification
boundary learned by the Perceptron. For the linearly separable data set shown
in the graph, the Perceptron was able to classify the data perfectly.
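To make this geometric picture concrete, an input point can be classified by
checking the sign of its total input, w1 x1 + w2 x2 + w3 x3 + bias. A short
sketch, reusing the weight vector w from the listing above (the test point is
illustrative):

    % Classify an input point by which side of the learned plane it lies on.
    % w is the weight vector from the sketch above; its last entry is the bias weight.
    x   = [1 0 1];               % illustrative test point (x1, x2, x3)
    net = x * w(1:3) + w(4);     % total input: w1*x1 + w2*x2 + w3*x3 + bias
    if net > 0
        disp('above the plane: positive total input, so the unit becomes active')
    else
        disp('on or below the plane: the unit stays inactive')
    end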