Types of Learning
In this video I'm going to talk about three different types of machine learning: supervised learning, reinforcement learning, and unsupervised learning. Broadly speaking, the first half of the course will be about supervised learning. The second half of the course will be mainly about unsupervised learning, and reinforcement learning will not be covered in the course, because we can't cover everything.
Types of learning task
Learning can be divided into three broad groups of algorithms. In supervised learning, you're trying to predict an output when given an input vector, so it's fairly clear what the point of supervised learning is. In reinforcement learning, you're trying to select actions or sequences of actions to maximize the rewards you get, and the rewards may only occur occasionally. In unsupervised learning, you're trying to discover a good internal representation of the input, and we'll come later to what that might mean.
Two types of supervised learning
Supervised learning itself comes in two different flavors. In regression, the target output is a real number or a whole vector of real numbers, such as the price of a stock in six months' time or the temperature at noon tomorrow, and the aim is to get as close as you can to the correct real number. In classification, the target output is a class label. The simplest case is the choice between one and zero, between positive and negative cases, but obviously we can have multiple alternative labels, as when we're classifying handwritten digits.
How supervised learning typically works
Supervised learning works by initially selecting a model class, that is, a whole set of models that we're prepared to consider as candidates. You can think of a model class as a function that takes an input vector and some parameters and gives you an output y. So a model class is simply a way of mapping an input to an output using some numerical parameters W, and then we adjust these numerical parameters to make the mapping fit the supervised training data.

What we mean by fit is minimizing a discrepancy between the target output on each training case and the actual output produced by the machine learning system. An obvious measure of that discrepancy, if we're using real values as outputs, is the squared difference between the output from our system, y, and the correct output, t: 1/2 (y - t)^2. We put in that half so it cancels the two when we differentiate. For classification, you could use that measure, but there are other, more sensible measures, which we'll come to later, and these more sensible measures typically work better as well.
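As a rough illustration of fitting a model class by minimizing squared error, here is a minimal sketch. The model class (a straight line with parameters `w` and `b`), the plain gradient-descent loop, the learning rate, and the toy data are all assumptions made for this example; the lecture does not specify any of them.

```python
# Hypothetical model class: y = w * x + b, with numerical parameters w and b.
def predict(x, w, b):
    return w * x + b


def squared_error(y, t):
    # The factor 1/2 cancels the 2 that appears when we differentiate.
    return 0.5 * (y - t) ** 2


def fit(data, learning_rate=0.1, epochs=1000):
    """Adjust w and b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, t in data:
            y = predict(x, w, b)
            # The derivative of 0.5 * (y - t)**2 with respect to y is (y - t).
            grad_w += (y - t) * x
            grad_b += (y - t)
        w -= learning_rate * grad_w / len(data)
        b -= learning_rate * grad_b / len(data)
    return w, b


# Toy training cases (input, target) generated from t = 2x + 1.
training_data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
w, b = fit(training_data)
total_error = sum(squared_error(predict(x, w, b), t) for x, t in training_data)
```

On this toy data the fitted parameters should end up close to `w = 2` and `b = 1`, with the total squared error close to zero.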
Reinforcement learning
In reinforcement learning, the output is an action or sequence of actions, and you have to decide on those actions based on occasional rewards. The goal in selecting each action is to maximize the expected sum of the future rewards, and we typically use a discount factor so that you don't have to look too far into the future. We say that rewards far in the future don't count for as much as rewards that you get fairly quickly.

Reinforcement learning is difficult. It's difficult because the rewards are typically delayed, so it's hard to know exactly which action was the wrong one in a long sequence of actions. It's also difficult because a scalar reward, especially one that only occurs occasionally, does not supply much information on which to base the changes in parameters. So typically you can't learn millions of parameters using reinforcement learning, whereas with supervised learning and unsupervised learning you can. Typically in reinforcement learning you're trying to learn dozens of parameters, or maybe a thousand parameters, but not millions. In this course we can't cover everything, and so we're not going to cover reinforcement learning, even though it's an important topic.

Unsupervised learning is going to be covered in the second half of the course. For about 40 years, the machine learning community basically ignored unsupervised learning, except for one very limited form called clustering. In fact, they used definitions of machine learning that excluded it: some textbooks defined machine learning as mapping from inputs to outputs, and many researchers thought that clustering was the only form of unsupervised learning.

One reason for this is that it's hard to say what the aim of unsupervised learning is. One major aim is to create an internal representation of the input that is useful for subsequent supervised or reinforcement learning. The reason we might want to do that in two stages is that we don't want to use, for example, the payoffs from reinforcement learning in order to set the parameters for our visual system. So you can compute the distance to a surface by using the disparity between the images you get in your two eyes, but you don't want to learn to do that computation of distance by repeatedly stubbing your toe and adjusting the parameters in your visual system every time you stub your toe. That would involve stubbing your toe a very large number of times, and there are much better ways to learn to fuse two images based purely on the information in the inputs.
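Returning briefly to the discount factor mentioned in the reinforcement learning discussion above, the following is a minimal sketch of how a discounted sum of rewards can be computed for a single observed sequence. The function name, the example rewards, and the discount values are assumptions for illustration, and the sketch does not compute the expectation over possible futures.

```python
def discounted_return(rewards, discount=0.9):
    """Sum rewards so that a reward k steps ahead is weighted by discount**k."""
    total = 0.0
    # Work backwards so each earlier step sees the later total discounted once more.
    for reward in reversed(rewards):
        total = reward + discount * total
    return total


# A hypothetical episode in which the only reward arrives at the fourth step.
delayed = discounted_return([0.0, 0.0, 0.0, 1.0], discount=0.5)
# The same reward arriving immediately counts for more.
immediate = discounted_return([1.0, 0.0, 0.0, 0.0], discount=0.5)
```

With a discount of 0.5, the delayed reward is weighted by 0.5 three times, so the same reward is worth less the later it arrives.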
Other goals for unsupervised learning
One of these goals is to provide a compact, low-dimensional representation of the input. High-dimensional inputs like images typically live on or near a low-dimensional manifold, or several such manifolds in the case of the handwritten digits. What that means is that even if you have a million pixels, there aren't really a million degrees of freedom in what can happen; there may only be a few hundred degrees of freedom. So what we want to do is move from a million pixels to a representation of those few hundred degrees of freedom, which would be equivalent to saying where we are on a manifold. We also need to know which manifold we're on. A very limited form of this is principal components analysis, which is linear: it assumes that there's one manifold, and that the manifold is a plane in the high-dimensional space.

Another goal for unsupervised learning is to provide an economical representation of the input in terms of learned features. If, for example, we can represent the input in terms of binary features, that's typically economical, because it only takes one bit to say the state of a binary feature. Alternatively, we could use a large number of real-valued features but insist that, for each input, almost all of those features are exactly zero. In that case, for each input we only need to represent a few real numbers, and that's economical.

As I mentioned before, another goal for unsupervised learning is to find clusters in the input, and clustering could be viewed as a very sparse code. That is, we have one feature per cluster, and we insist that all of the features except one are zero and that one feature has a value of one. So clustering is really just an extreme case of finding sparse features.
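To make the view of clustering as a very sparse code concrete, here is a minimal sketch. It assumes the cluster centers have already been learned somehow and assigns each input to its nearest center, as k-means would; both the centers and the nearest-center rule are assumptions for illustration, not something the lecture specifies.

```python
def squared_distance(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def one_hot_cluster_code(x, centers):
    """Return one feature per cluster: 1 for the nearest center, 0 elsewhere."""
    nearest = min(range(len(centers)), key=lambda k: squared_distance(x, centers[k]))
    return [1 if k == nearest else 0 for k in range(len(centers))]


# Hypothetical, already-learned cluster centers in a 2-D input space.
centers = [(0.0, 0.0), (5.0, 5.0), (0.0, 5.0)]
code = one_hot_cluster_code((4.6, 5.2), centers)
```

Here the input lies closest to the second center, so only the second feature is switched on and every other feature is exactly zero.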