Nonlinear associative memory models

These notes have not yet been finalized for Winter, 1999.

Readings

Required reading: Chapters 15, 16 and 17

The Brain-State-in-a-Box (BSB) model

Iterative learning algorithms. Algorithms like back-propagation, competitive learning, the SOFM and the Boltzmann machine are very powerful. Multilayer nonlinear networks can represent virtually any mapping (in theory; in practice, actually learning the mapping may take a very long time, if it is possible at all). SOFMs and competitive networks can be thought of as finding "clusters" in the data, and in an approximate sense they discover a representation of the probability density of the data. That is, if there are enough units in the network, then in each region of the input space where the density is high (there are many patterns), the network is likely to allocate a unit whose weight vector represents the centre of that region. However, all of the learning procedures mentioned above are inherently iterative: they require many passes through a training set. In contrast, humans, at least to some degree, can learn new material after a single exposure.
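To make the "iterative" point concrete, here is a minimal sketch (in Python/NumPy, not from the readings) of a winner-take-all competitive learning rule. Notice that it must loop over the training set many times before the weight vectors settle near the cluster centres; the function name, learning rate and epoch count are illustrative choices only.

```python
import numpy as np

def competitive_learning(patterns, n_units=4, lr=0.1, n_epochs=50, seed=0):
    """Toy winner-take-all competitive learning: each unit's weight
    vector gradually drifts toward the centre of a dense region of
    the inputs. `patterns` is an (n_patterns, dim) float array."""
    rng = np.random.default_rng(seed)
    # initialise each unit's weight vector to a randomly chosen input pattern
    W = patterns[rng.choice(len(patterns), n_units, replace=False)].astype(float)
    for _ in range(n_epochs):                      # many passes over the training set
        for x in patterns:
            winner = np.argmin(np.linalg.norm(W - x, axis=1))
            W[winner] += lr * (x - W[winner])      # move only the winning unit toward x
    return W
```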

Associative memory models are different; they can perform one-shot learning. So far we have seen two examples of one-shot learning devices: the linear associator and the Hopfield network.
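By contrast with the iterative procedures above, a Hebbian (outer-product) associative memory stores each pattern in a single pass. The sketch below, assuming +1/-1 pattern vectors and a Hopfield-style thresholded recall, is only illustrative; the function names and the choice to zero the diagonal are my own, not something specified in the text.

```python
import numpy as np

def hebbian_store(patterns):
    """One-shot Hebbian storage: each pattern contributes its outer
    product to the weight matrix after a single exposure."""
    n = patterns.shape[1]
    W = np.zeros((n, n))
    for x in patterns:                 # one pass per pattern, no iteration
        W += np.outer(x, x)
    np.fill_diagonal(W, 0)             # conventional for Hopfield-style nets
    return W

def hopfield_recall(W, x, n_steps=10):
    """Binary (+1/-1) recall: repeatedly threshold the weighted input."""
    s = x.astype(float).copy()
    for _ in range(n_steps):
        s = np.sign(W @ s)
        s[s == 0] = 1
    return s
```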

Although associative memory models are considered "weak learning devices" (i.e. they have low capacity), they have a number of advantages:

The BSB model is very similar to a Hopfield network, in that it is a nonlinear associative memory model with recurrent connections. Like the Hopfield model, it can be trained with a simple Hebbian learning rule. (It can also be trained using a Widrow-Hoff or delta rule, but we didn't discuss that in class.) However, Anderson generalizes the simple binary Hopfield network in an important way: the units have continuous real-valued states. The units' states are computed according to a "LIMIT function" (see text, pp. 504-506) of the total weighted summed input. This function is roughly sigmoidal in shape. States are updated gradually, so that for each unit the old state is gradually replaced by the influence of the incoming activations. As in the case of the linear associator, the state dynamics tend to make the network converge to eigenvectors of the weight matrix. So if the initial state is close to an eigenvector with a large eigenvalue, the network will converge rapidly to something close to that eigenvector. However, because of the LIMIT function, the final state will actually consist entirely of 1's and -1's. If there were only 3 units, the final state would lie at a corner of a cube in state-space; for higher dimensions, the network states are bounded by the corners of a high-dimensional hypercube. That is why the model is called brain-state-in-a-box.
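A rough sketch of these dynamics follows, assuming the LIMIT function is implemented as a hard clip to [-1, +1] (the text describes it as roughly sigmoidal) and that training uses the simple Hebbian outer-product rule. The decay and feedback parameters (gamma, alpha) and the function names are illustrative choices, not Anderson's exact formulation.

```python
import numpy as np

def limit(x):
    """A stand-in for the LIMIT function: clip activations to the box [-1, +1]."""
    return np.clip(x, -1.0, 1.0)

def bsb_train(patterns, eta=0.1):
    """Simple Hebbian (outer-product) training, as in the Hopfield case."""
    n = patterns.shape[1]
    W = np.zeros((n, n))
    for p in patterns:
        W += eta * np.outer(p, p)
    return W

def bsb_update(W, x, alpha=0.2, gamma=1.0, n_steps=50):
    """Iterate the BSB state dynamics: the old state (scaled by gamma) is
    gradually combined with the recurrent feedback alpha * W @ x, which
    pushes the state toward a dominant eigenvector of W; LIMIT keeps the
    state inside the box, so it ends up at a corner (all +1/-1)."""
    x = x.astype(float).copy()
    for _ in range(n_steps):
        x = limit(gamma * x + alpha * (W @ x))
    return x
```

Starting bsb_update from a noisy version of a stored pattern typically drives the state out to the nearest corner of the hypercube, i.e. a cleaned-up +1/-1 version of that pattern.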

Applications of the BSB model:

