Supervised Learning in Neural Networks

Perceptrons

There are several different models of supervised learning that have been implemented in artificial neural networks. Perhaps the simplest of these aims to train the Threshold Logic Unit (TLU). Consider a single TLU with two inputs connected to a single output. Using the mathematical notation already introduced, the inputs are x1 and x2, their weights are w1 and w2, the threshold is theta and the output is y. The initial weights are set randomly.

Now, suppose the network is attempting to learn to perform the AND operation on its two inputs. One way to think of this is as a classification decision. For example, imagine that people can be identified as happy or sad by their sex (man or woman) and marital status (married or single). We can present this information to our network as a 2-dimensional binary input vector where the first element of the vector indicates sex (man = 0, woman = 1) and the second element indicates marital status (married = 0, single = 1). At the output, happy people = 1 and sad people = 0. By applying the AND operator to the inputs, we classify an individual as happy only when they are a woman AND single, that is, the output is 1 only when both inputs are 1.
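The two-input unit described above can be sketched in a few lines of Python. This is a minimal illustration, not the original notation: it assumes the unit outputs 1 when the weighted sum is at least theta, the function name tlu_output is hypothetical, and the weights are hand-picked (not learned) values that happen to implement AND.

```python
def tlu_output(x1, x2, w1, w2, theta):
    """Return 1 if the weighted input sum reaches the threshold, else 0.

    Assumption: the unit fires when the sum is >= theta.
    """
    activation = w1 * x1 + w2 * x2
    return 1 if activation >= theta else 0


# Hand-picked illustrative values (an assumption, not taken from the text).
w1, w2, theta = 1.0, 1.0, 1.5

# Inputs follow the example: (sex, marital status), output 1 = happy.
for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print((x1, x2), "->", tlu_output(x1, x2, w1, w2, theta))
```

Only the input (1, 1), a single woman in the example, is classified as 1.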
We can combine the above with the network equations already described to express four inequalities that must be satisfied in order to "solve" the problem:
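As a hedged reconstruction, assume the unit outputs 1 when the weighted sum w1 x1 + w2 x2 is at least theta and 0 otherwise. Each of the four AND input patterns then gives one condition on the weights:

```latex
\begin{aligned}
(x_1, x_2) = (0, 0),\ t = 0 &: \quad 0 < \theta \\
(x_1, x_2) = (0, 1),\ t = 0 &: \quad w_2 < \theta \\
(x_1, x_2) = (1, 0),\ t = 0 &: \quad w_1 < \theta \\
(x_1, x_2) = (1, 1),\ t = 1 &: \quad w_1 + w_2 \ge \theta
\end{aligned}
```

For instance, w1 = w2 = 1 with theta = 1.5 satisfies all four conditions, so a solution exists.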
In order to learn the above, the TLU must overcome two difficulties. First, it is necessary to measure error. Second, it is necessary to define a procedure that reduces error by adjusting the weights. This in turn requires a learning rule that assesses the relative contribution of each weight to the total error. The TLU learning rule proposes a solution to these issues. The error is calculated as the difference between the actual output of the network and the target output. If the input is correctly classified, then the weights are left unchanged and the next input is presented. If, on the other hand, the input vector is incorrectly classified, there are two cases to consider. If the actual output was lower than desired (for example, if x1 = 1 and x2 = 1 but y = 0) then the weights of all active inputs that contributed to the output should be increased (w1 and w2 are increased). However, we don't want to make too drastic a change, as this might upset previous learning, so we modify the weights by only a small increment. If the output was higher than desired (for example, if x1 = 0 and x2 = 1 but y = 1) then the weights of all active inputs should be decreased incrementally (only w2 is decreased, as x1 is not active). In mathematical notation:
If x1 = 1, x2 = 1, y = 0 but t = 1 then: new w1 = w1 + alpha and new w2 = w2 + alpha
If x1 = 0, x2 = 1, y = 1 but t = 0 then: new w2 = w2 - alpha (w1 is unchanged)

Here t is the target output and alpha denotes the small positive increment.

Rosenblatt introduced an enhancement of the TLU outlined above, known as the Perceptron. It consists of a TLU whose inputs come from a series of pre-processing units. These units can be assigned any arbitrary functionality but are fixed and cannot learn. The rest of the neurone functions exactly as described. It can be shown mathematically that the above learning rule will always discover a set of weights that correctly classifies its inputs, provided that such a set of weights exists. This is called the "Perceptron Convergence Theorem" and was responsible for much of the early interest in artificial neural networks.

Unfortunately, there are limits to the Perceptron's use, as documented by Minsky and Papert in their book on the subject. They showed that many functions cannot be computed by the Perceptron or any other two-layer (input-output) network. For example, a simple Perceptron with two inputs connected to one output (as discussed above) cannot learn an exclusive-or (XOR) problem. Put simply, the Perceptron must learn to turn on the output if either of the inputs is turned on, but not when both are turned on.
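The learning rule and its limit on XOR can be illustrated with a short Python sketch. Several details are assumptions made for the illustration rather than taken from the text: the unit fires when the weighted sum is at least theta, theta is held fixed at 1.0, the increment is 0.25, and the weights start at zero instead of random values so that the run is reproducible. The names tlu_output and train_tlu are hypothetical.

```python
def tlu_output(inputs, weights, theta):
    """Return 1 if the weighted sum of the inputs is >= theta, else 0."""
    activation = sum(w * x for w, x in zip(weights, inputs))
    return 1 if activation >= theta else 0


def train_tlu(samples, weights, theta=1.0, increment=0.25, max_epochs=100):
    """Apply the TLU learning rule until one full pass makes no errors.

    Returns (weights, epoch) on success, or (weights, None) if no
    error-free pass occurred within max_epochs.
    """
    weights = list(weights)  # work on a copy
    for epoch in range(1, max_epochs + 1):
        errors = 0
        for inputs, target in samples:
            y = tlu_output(inputs, weights, theta)
            if y != target:
                errors += 1
                # (target - y) is +1 if the output was too low and -1 if
                # too high; multiplying by x leaves inactive inputs alone.
                for i, x in enumerate(inputs):
                    weights[i] += increment * (target - y) * x
        if errors == 0:
            return weights, epoch
    return weights, None


AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR_SAMPLES = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

for name, samples in [("AND", AND_SAMPLES), ("XOR", XOR_SAMPLES)]:
    weights, epoch = train_tlu(samples, [0.0, 0.0])
    status = f"error-free in epoch {epoch}" if epoch else "no error-free epoch"
    print(name, weights, status)
```

On AND the loop stops as soon as a full pass produces no errors. On XOR it runs until max_epochs is exhausted, because no pair of weights classifies all four patterns correctly.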
We can combine the above with the network equations already described to determine four inequalities that must be satisfied by any network solution:
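As a hedged reconstruction, assume the unit outputs 1 when the weighted sum w1 x1 + w2 x2 is at least theta and 0 otherwise. The four XOR input patterns then require:

```latex
\begin{aligned}
(x_1, x_2) = (0, 0),\ t = 0 &: \quad 0 < \theta \\
(x_1, x_2) = (0, 1),\ t = 1 &: \quad w_2 \ge \theta \\
(x_1, x_2) = (1, 0),\ t = 1 &: \quad w_1 \ge \theta \\
(x_1, x_2) = (1, 1),\ t = 0 &: \quad w_1 + w_2 < \theta
\end{aligned}
```

Adding the second and third conditions gives w1 + w2 >= 2 theta, and since theta > 0 this is greater than theta, which conflicts with the fourth condition.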
However, this is not possible. The all-zero input must give an output of 0, so theta cannot be negative. Both w1 and w2 must each be at least theta, so that a single active input turns the output on. Their sum w1 + w2 is therefore also at least theta, yet it would have to be less than theta for the output to stay off when both inputs are on. So, there are some kinds of problems that Perceptrons simply cannot be trained to do.

References:
F. Rosenblatt. Principles of Neurodynamics. Spartan Books, 1962.
M. Minsky and S. Papert. Perceptrons. MIT Press, 1969.

© 2008 Marcus bros