Models of Neurons

In this video I'm going to describe some relatively simple models of neurons. I'll describe a number of different models, starting with simple linear and threshold neurons and then describing slightly more complicated ones. These are much simpler than real neurons, but they're still complicated enough to allow us to make neural nets that do some very interesting kinds of machine learning.

Idealized neurons

In order to understand anything complicated, we have to idealize it. That is, we have to make simplifications that allow us to get a handle on how it might work. With atoms, for example, we simplify them as behaving like little solar systems. Idealization removes the complicated details that are not essential for understanding the main principles. It allows us to apply mathematics and to make analogies to other familiar systems, and once we understand the basic principles, it's easy to add complexity and make the model more faithful to reality. Of course, we have to be careful when we idealize something not to remove the thing that's giving it its main properties.

It's often worth understanding models that are known to be wrong, as long as we don't forget they're wrong. For example, a lot of work on neural networks uses neurons that communicate real values rather than discrete spikes of activity. We know cortical neurons don't behave like that, but it's still worth understanding systems like that, and in practice they can be very useful for machine learning.

The first kind of neuron I want to tell you about is the simplest: the linear neuron. It's simple and computationally limited in what it can do. It may allow us to get insights into more complicated neurons, but it may be somewhat misleading. In a linear neuron, the output y is a function of a bias of the neuron, b, plus the sum over all its incoming connections of the activity xᵢ on an input line times the weight wᵢ on that line (the synaptic weight on the input line): y = b + Σᵢ xᵢwᵢ. If we plot the output against the bias plus the weighted activities on the input lines, we get a straight line that goes through zero.
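As a concrete illustration of that formula, here is a minimal sketch of a linear neuron in Python (the lecture gives no code; the function name linear_neuron and the use of NumPy are my choices):

```python
import numpy as np

def linear_neuron(x, w, b):
    """Linear neuron: y = b + sum_i x_i * w_i."""
    return b + np.dot(w, x)

# Example: two input lines with activities 2.0 and 1.0,
# synaptic weights 0.5 and -1.0, and a bias of 0.1.
x = np.array([2.0, 1.0])
w = np.array([0.5, -1.0])
print(linear_neuron(x, w, 0.1))  # 0.1 + 1.0 - 1.0 = 0.1
```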

Binary threshold neurons

Very different from linear neurons are binary threshold neurons, which were introduced by McCulloch and Pitts. They actually influenced von Neumann when he was thinking about how to design a universal computer. In a binary threshold neuron, you first compute a weighted sum of the inputs, and then you send out a spike of activity if that weighted sum exceeds a threshold. McCulloch and Pitts thought that the spikes were like the truth values of propositions, so each neuron is combining the truth values it gets from other neurons to produce a truth value of its own. That's like combining some propositions to compute the truth value of another proposition. At the time, in the 1940s, logic was the main paradigm for how the mind might work. Since then, people thinking about how the brain computes have become much more interested in the idea that the brain is combining lots of different sources of unreliable evidence, so logic isn't such a good paradigm for what the brain's up to.

For a binary threshold neuron, you can think of its input-output function like this: if the weighted input is above the threshold, it gives an output of one; otherwise it gives an output of zero.

There are actually two equivalent ways to write the equations for a binary threshold neuron. We can say that the total input z is just the activities on the input lines times the weights, z = Σᵢ xᵢwᵢ, and the output y is 1 if z is above the threshold θ and 0 otherwise. Alternatively, we can say that the total input includes a bias term, z = b + Σᵢ xᵢwᵢ, and the output is 1 if that total input is above zero and 0 otherwise. The equivalence is simply that the threshold in the first formulation is equal to the negative of the bias in the second formulation.
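To make the equivalence of the two formulations concrete, here is a small sketch in Python (function names are illustrative, and I follow the convention that reaching the threshold counts as exceeding it):

```python
import numpy as np

def threshold_neuron(x, w, theta):
    """First formulation: spike (output 1) when the weighted sum reaches the threshold."""
    z = np.dot(w, x)
    return 1 if z >= theta else 0

def threshold_neuron_bias(x, w, b):
    """Second formulation: fold the threshold into a bias and compare against zero."""
    z = b + np.dot(w, x)
    return 1 if z >= 0 else 0

# The two formulations agree whenever b = -theta.
x = np.array([1.0, 0.0, 1.0])
w = np.array([0.6, 0.6, 0.6])
theta = 1.0
assert threshold_neuron(x, w, theta) == threshold_neuron_bias(x, w, -theta) == 1
```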

Rectified Linear Neurons (sometimes called linear threshold neurons)

A kind of neuron that combines the properties of both linear neurons and binary threshold neurons is the rectified linear neuron. It first computes a linear weighted sum of its inputs, but then it gives an output that's a nonlinear function of this weighted sum. We compute z in the same way as before. If z is below zero, the output is zero; otherwise the output is equal to z, so y = max(0, z). The input-output curve is definitely not linear overall, but above zero it is linear, and at zero it makes a hard decision. So with a neuron like this we can get a lot of the nice properties of linear systems when the input is above zero, and we also get the ability to make decisions at zero.
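Here is a minimal sketch of a rectified linear neuron (names again my own):

```python
import numpy as np

def relu_neuron(x, w, b):
    """Rectified linear neuron: y = max(0, z) with z = b + w . x."""
    z = b + np.dot(w, x)
    return max(0.0, z)

x = np.array([1.0, 2.0])
w = np.array([0.5, 0.25])
print(relu_neuron(x, w, -2.0))  # z = -1.0, so the output is clipped to 0.0
print(relu_neuron(x, w, 0.0))   # z =  1.0, so the output is linear in z: 1.0
```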

Sigmoid neurons

The neurons that we'll use a lot in this course, and that are probably the commonest kind of neuron to use in artificial neural nets, are sigmoid neurons. They give a real-valued output that is a smooth and bounded function of their total input. It's typical to use the logistic function, where the total input is computed as before: a bias plus what comes in on the input lines, weighted. The output of a logistic neuron is y = 1/(1 + e^(−z)). If you think about that: when the total input is big and positive, e to the minus a big positive number is close to zero, so the output will be close to one. When the total input is big and negative, e to the minus a big negative number is a large number, so the output will be close to zero. When the total input is zero, e^(−0) is one, so the output is one half. The nice thing about a sigmoid is that it has smooth derivatives: the derivatives change continuously, so they're nicely behaved and they make it easy to do learning, as we'll see in lecture three.
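A minimal sketch of a logistic neuron, checking the three cases just described (function names are my own):

```python
import numpy as np

def logistic_neuron(x, w, b):
    """Sigmoid neuron: y = 1 / (1 + exp(-z)) with z = b + w . x."""
    z = b + np.dot(w, x)
    return 1.0 / (1.0 + np.exp(-z))

w = np.array([1.0])
print(logistic_neuron(np.array([10.0]), w, 0.0))   # big positive z -> close to 1
print(logistic_neuron(np.array([-10.0]), w, 0.0))  # big negative z -> close to 0
print(logistic_neuron(np.array([0.0]), w, 0.0))    # z = 0 -> exactly 0.5
```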

Stochastic binary neurons

Finally, there are stochastic binary neurons. They use just the same equations as logistic units: they compute their total input the same way, and they use the logistic function to compute a real value, which is the probability that they will output a spike. But then, instead of outputting that probability as a real number, they actually make a probabilistic decision, so what they actually output is either a one or a zero. They're intrinsically random: they treat the logistic output p as the probability of producing a one, not as a real number. Of course, if the input is very big and positive they will almost always produce a one, and if the input is big and negative they will almost always produce a zero.

We can do a similar trick with rectified linear units. We can say that the real value that comes out of a rectified linear unit, when its input is above zero, is the rate of producing spikes. That rate is deterministic, but once we've figured out the rate of producing spikes, the actual times at which spikes are produced are a random process: it's a Poisson process. So the rectified linear unit determines the rate, but intrinsic randomness in the unit determines when the spikes are actually produced.
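As a sketch of both kinds of randomness (the lecture gives no code; the names, and the unit-length time window for the Poisson draw, are my assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def stochastic_binary_neuron(x, w, b):
    """Treat the logistic output p as the probability of emitting a spike (1)."""
    p = 1.0 / (1.0 + np.exp(-(b + np.dot(w, x))))
    return 1 if rng.random() < p else 0

def poisson_relu_spikes(x, w, b, window=1.0):
    """The rectified linear unit deterministically sets the spike rate max(0, z);
    the number of spikes in a time window is then a Poisson random variable."""
    rate = max(0.0, b + np.dot(w, x))
    return rng.poisson(rate * window)

x = np.array([1.0, 1.0])
w = np.array([2.0, 1.0])
print(stochastic_binary_neuron(x, w, -1.5))  # 1 with probability logistic(1.5)
print(poisson_relu_spikes(x, w, 0.0))        # random count with mean rate 3.0
```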
