Paul's Online Math Notes
Calculus III - Notes

## Directional Derivatives

To this point we’ve only looked at the two partial derivatives $f_x(x,y)$ and $f_y(x,y)$.  Recall that these derivatives represent the rate of change of $f$ as we vary $x$ (holding $y$ fixed) and as we vary $y$ (holding $x$ fixed) respectively.  We now need to discuss how to find the rate of change of $f$ if we allow both $x$ and $y$ to change simultaneously.  The problem here is that there are many ways to allow both $x$ and $y$ to change.  For instance, one could be changing faster than the other, and then there is also the issue of whether each is increasing or decreasing.  So, before we get into finding the rate of change we need to take care of a couple of preliminary ideas.  The main idea that we need to look at is just how we are going to define the changing of $x$ and/or $y$.

Let’s start off by supposing that we wanted the rate of change of $f$ at a particular point, say $(x_0, y_0)$.  Let’s also suppose that both $x$ and $y$ are increasing and that, in this case, $x$ is increasing twice as fast as $y$ is increasing.  So, as $y$ increases one unit of measure $x$ will increase two units of measure.

To help us see how we’re going to define this change let’s suppose that a particle is sitting at $(x_0, y_0)$ and the particle will move in the direction given by the changing $x$ and $y$.  Therefore, the particle will move off in a direction of increasing $x$ and $y$ and the $x$ coordinate of the point will increase twice as fast as the $y$ coordinate.  Now that we’re thinking of this changing $x$ and $y$ as a direction of movement we can get a way of defining the change.  We know from Calculus II that vectors can be used to define a direction and so the particle, at this point, can be said to be moving in the direction,

$$\vec v = \langle 2, 1 \rangle$$

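Since this vector can be used to define how a particle at a point is changing we can also use it to describe how $x$ and/or $y$ is changing at a point.  For our example we will say that we want the rate of change of $f$ in the direction of $\vec v = \langle 2, 1 \rangle$.  In this way we will know that $x$ is increasing twice as fast as $y$ is.  There is still a small problem with this however.  There are many vectors that point in the same direction.  For instance all of the following vectors point in the same direction as $\vec v = \langle 2, 1 \rangle$.

$$\vec v = \left\langle \frac{1}{5}, \frac{1}{10} \right\rangle \qquad \vec v = \langle 6, 3 \rangle \qquad \vec v = \left\langle \frac{2}{\sqrt 5}, \frac{1}{\sqrt 5} \right\rangle$$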

We need a way to consistently find the rate of change of a function in a given direction.  We will do this by insisting that the vector that defines the direction of change be a unit vector.  Recall that a unit vector is a vector with length, or magnitude, of 1.  This means that for the example that we started off thinking about we would want to use

$$\vec v = \left\langle \frac{2}{\sqrt 5}, \frac{1}{\sqrt 5} \right\rangle$$

since this is the unit vector that points in the direction of change.

For reference purposes recall that the magnitude or length of the vector $\vec v = \langle a, b, c \rangle$ is given by,

$$\|\vec v\| = \sqrt{a^2 + b^2 + c^2}$$

For two-dimensional vectors we drop the $c$ from the formula.

Sometimes we will give the direction of changing $x$ and $y$ as an angle.  For instance, we may say that we want the rate of change of $f$ in the direction of $\theta = \frac{\pi}{3}$.  The unit vector that points in this direction is given by,

$$\vec u = \langle \cos\theta, \sin\theta \rangle$$
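As a quick check on these preliminaries, here is a minimal Python sketch that converts a direction vector into a unit vector and builds a unit vector from an angle.  The function names `magnitude`, `unit_vector` and `unit_from_angle` are made up for this illustration, and the angle $\pi/3$ is only a sample value.

```python
import math

def magnitude(v):
    """Euclidean length of a vector given as a sequence of components."""
    return math.sqrt(sum(c * c for c in v))

def unit_vector(v):
    """Scale v to length 1 while keeping its direction."""
    length = magnitude(v)
    if length == 0:
        raise ValueError("the zero vector does not define a direction")
    return tuple(c / length for c in v)

def unit_from_angle(theta):
    """Unit vector in the plane at angle theta (radians) from the positive x-axis."""
    return (math.cos(theta), math.sin(theta))

# x increasing twice as fast as y: direction <2, 1>
u = unit_vector((2, 1))

# a direction given as an angle (sample value)
w = unit_from_angle(math.pi / 3)
```

Any positive multiple of $\langle 2, 1 \rangle$, such as $\langle 6, 3 \rangle$, should normalize to the same `u`, which is exactly why we insist on unit vectors.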

Okay, now that we know how to define the direction of changing $x$ and $y$, it’s time to start talking about finding the rate of change of $f$ in this direction.  Let’s start off with the official definition.

Definition

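The rate of change of $f(x,y)$ in the direction of the unit vector $\vec u = \langle a, b \rangle$ is called the directional derivative and is denoted by $D_{\vec u} f(x,y)$.  The definition of the directional derivative is,

$$D_{\vec u} f(x,y) = \lim_{h \to 0} \frac{f(x + ah,\, y + bh) - f(x,y)}{h}$$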

So, the definition of the directional derivative is very similar to the definition of partial derivatives.  However, in practice this can be a very difficult limit to compute so we need an easier way of taking directional derivatives.  It’s actually fairly simple to derive an equivalent formula for taking directional derivatives.
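To see the limit definition in action, the sketch below evaluates the difference quotient for smaller and smaller $h$.  The function $f(x,y) = x^2 y$, the point $(1, 2)$ and the helper name `directional_quotient` are assumptions chosen only for illustration.

```python
import math

def directional_quotient(f, x, y, u, h):
    """Difference quotient (f(x + a h, y + b h) - f(x, y)) / h for the unit vector u = (a, b)."""
    a, b = u
    return (f(x + a * h, y + b * h) - f(x, y)) / h

def f(x, y):
    # Hypothetical sample function, not taken from the notes.
    return x * x * y

# Unit vector in the direction of <2, 1>.
u = (2 / math.sqrt(5), 1 / math.sqrt(5))

# Quotients for h = 0.1, 0.01, ..., 0.000001 at the point (1, 2).
estimates = [directional_quotient(f, 1.0, 2.0, u, 10.0 ** -k) for k in range(1, 7)]
```

For this sample function the partial derivatives at $(1,2)$ are $f_x = 4$ and $f_y = 1$, so the estimates should settle near $9/\sqrt 5$ as $h$ shrinks.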

To see how we can do this let’s define a new function of a single variable,

$$g(z) = f(x_0 + az,\, y_0 + bz)$$

where $x_0$, $y_0$, $a$, and $b$ are some fixed numbers.  Note that this really is a function of a single variable now since $z$ is the only letter that is not representing a fixed number.

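Then by the definition of the derivative for functions of a single variable we have,

$$g'(z) = \lim_{h \to 0} \frac{g(z + h) - g(z)}{h}$$

and the derivative at $z = 0$ is given by,

$$g'(0) = \lim_{h \to 0} \frac{g(h) - g(0)}{h}$$

If we now substitute in for $g(z)$ we get,

$$g'(0) = \lim_{h \to 0} \frac{f(x_0 + ah,\, y_0 + bh) - f(x_0, y_0)}{h} = D_{\vec u} f(x_0, y_0)$$

So, it looks like we have the following relationship.

$$g'(0) = D_{\vec u} f(x_0, y_0) \tag{1}$$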

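Now, let’s look at this from another perspective.  Let’s rewrite $g(z)$ as follows,

$$g(z) = f(x, y) \qquad \text{where } x = x_0 + az \text{ and } y = y_0 + bz$$

We can now use the chain rule from the previous section to compute,

$$g'(z) = \frac{dg}{dz} = \frac{\partial f}{\partial x}\frac{dx}{dz} + \frac{\partial f}{\partial y}\frac{dy}{dz} = f_x(x,y)\,a + f_y(x,y)\,b$$

So, from the chain rule we get the following relationship.

$$g'(z) = f_x(x,y)\,a + f_y(x,y)\,b \tag{2}$$

If we now take $z = 0$ we get $x = x_0$ and $y = y_0$ (from how we defined $x$ and $y$ above), and plugging these into (2) gives,

$$g'(0) = f_x(x_0, y_0)\,a + f_y(x_0, y_0)\,b \tag{3}$$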

Now, simply equate (1) and (3) to get that,

$$D_{\vec u} f(x_0, y_0) = g'(0) = f_x(x_0, y_0)\,a + f_y(x_0, y_0)\,b$$

If we now go back to allowing $x$ and $y$ to be any number we get the following formula for computing directional derivatives.

$$D_{\vec u} f(x, y) = f_x(x, y)\,a + f_y(x, y)\,b$$

This is much simpler than the limit definition.  Also note that this definition assumed that we were working with functions of two variables.  There are similar formulas that can be derived by the same type of argument for functions with more than two variables.  For instance, the directional derivative of $f(x,y,z)$ in the direction of the unit vector $\vec u = \langle a, b, c \rangle$ is given by,

$$D_{\vec u} f(x, y, z) = f_x(x, y, z)\,a + f_y(x, y, z)\,b + f_z(x, y, z)\,c$$
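The derivation above can also be checked numerically.  The following sketch builds the single variable function $g(z) = f(x_0 + az, y_0 + bz)$, estimates $g'(0)$ with a central difference, and compares it with $f_x a + f_y b$.  The sample function $f(x,y) = y^2 \sin x$, the point and the unit vector $\langle 3/5, 4/5 \rangle$ are assumptions picked for illustration, and the helper names are hypothetical.

```python
import math

def f(x, y):
    # Hypothetical sample function for illustration.
    return math.sin(x) * y ** 2

def f_x(x, y):
    # Partial derivative of the sample function with respect to x.
    return math.cos(x) * y ** 2

def f_y(x, y):
    # Partial derivative of the sample function with respect to y.
    return 2 * math.sin(x) * y

def g_prime_at_zero(x0, y0, a, b, h=1e-6):
    """Central-difference estimate of g'(0), where g(z) = f(x0 + a z, y0 + b z)."""
    def g(z):
        return f(x0 + a * z, y0 + b * z)
    return (g(h) - g(-h)) / (2 * h)

def directional_derivative(x0, y0, a, b):
    """The formula f_x(x0, y0) a + f_y(x0, y0) b, assuming (a, b) is a unit vector."""
    return f_x(x0, y0) * a + f_y(x0, y0) * b

a, b = 3 / 5, 4 / 5  # a unit vector, since (3/5)^2 + (4/5)^2 = 1
numeric = g_prime_at_zero(0.5, 1.5, a, b)
from_formula = directional_derivative(0.5, 1.5, a, b)
```

The two values should agree to many decimal places, which is just relationships (1) and (3) seen from the numerical side.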

Let’s work a couple of examples.

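Example 1  Find each of the directional derivatives.

(a) $D_{\vec u} f(2,0)$ where $f(x,y) = x e^{xy} + y$ and $\vec u$ is the unit vector in the direction of $\theta = \frac{2\pi}{3}$.

(b) $D_{\vec u} f(x,y,z)$ where $f(x,y,z) = x^2 z + y^3 z^2 - xyz$ in the direction of $\vec v = \langle -1, 0, 3 \rangle$.

Solution

(a) We’ll first find $D_{\vec u} f(x,y)$ and then use this as a formula for finding $D_{\vec u} f(2,0)$.  The unit vector giving the direction is,

$$\vec u = \left\langle \cos\frac{2\pi}{3}, \sin\frac{2\pi}{3} \right\rangle = \left\langle -\frac{1}{2}, \frac{\sqrt 3}{2} \right\rangle$$

So, the directional derivative is,

$$D_{\vec u} f(x,y) = \left(-\frac{1}{2}\right)\left(e^{xy} + xy\,e^{xy}\right) + \left(\frac{\sqrt 3}{2}\right)\left(x^2 e^{xy} + 1\right)$$

Now, plugging in the point in question gives,

$$D_{\vec u} f(2,0) = \left(-\frac{1}{2}\right)(1) + \left(\frac{\sqrt 3}{2}\right)(5) = \frac{5\sqrt 3 - 1}{2}$$

(b) In this case let’s first check to see if the direction vector is a unit vector or not and if it isn’t convert it into one.  To do this all we need to do is compute its magnitude.

$$\|\vec v\| = \sqrt{1 + 0 + 9} = \sqrt{10} \ne 1$$

So, it’s not a unit vector.  Recall that we can convert any vector into a unit vector that points in the same direction by dividing the vector by its magnitude.  So, the unit vector that we need is,

$$\vec u = \frac{1}{\sqrt{10}} \langle -1, 0, 3 \rangle = \left\langle -\frac{1}{\sqrt{10}}, 0, \frac{3}{\sqrt{10}} \right\rangle$$

The directional derivative is then,

$$\begin{aligned} D_{\vec u} f(x,y,z) &= \left(-\frac{1}{\sqrt{10}}\right)\left(2xz - yz\right) + (0)\left(3y^2 z^2 - xz\right) + \left(\frac{3}{\sqrt{10}}\right)\left(x^2 + 2y^3 z - xy\right) \\ &= \frac{1}{\sqrt{10}}\left(3x^2 + 6y^3 z - 3xy - 2xz + yz\right) \end{aligned}$$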
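If you’d like to confirm the arithmetic in a computation like part (a), a few lines of Python will do it.  This sketch takes $f(x,y) = x e^{xy} + y$, the point $(2,0)$ and the angle $\theta = 2\pi/3$ as its inputs; treat those as assumptions of the sketch, and note that the partial derivatives are coded by hand rather than computed symbolically.

```python
import math

def f_x(x, y):
    # Partial derivative of x*e^(xy) + y with respect to x.
    return math.exp(x * y) + x * y * math.exp(x * y)

def f_y(x, y):
    # Partial derivative of x*e^(xy) + y with respect to y.
    return x * x * math.exp(x * y) + 1

theta = 2 * math.pi / 3
a, b = math.cos(theta), math.sin(theta)  # unit vector <cos(theta), sin(theta)>

# Directional derivative at (2, 0) from the formula f_x a + f_y b.
value = f_x(2, 0) * a + f_y(2, 0) * b

# Closed form obtained by hand for comparison.
closed_form = (5 * math.sqrt(3) - 1) / 2
```

Under these assumptions `value` and `closed_form` should match up to floating point rounding.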

There is another form of the formula that we used to get the directional derivative that is a little nicer and somewhat more compact.  It is also a much more general formula that will encompass both of the formulas above.

Let’s start with the second one and notice that we can write it as follows,

$$D_{\vec u} f(x,y,z) = f_x(x,y,z)\,a + f_y(x,y,z)\,b + f_z(x,y,z)\,c = \langle f_x, f_y, f_z \rangle \cdot \langle a, b, c \rangle$$

In other words we can write the directional derivative as a dot product and notice that the second vector is nothing more than the unit vector $\vec u = \langle a, b, c \rangle$ that gives the direction of change.  Also, if we had used the version for functions of two variables the third component wouldn’t be there, but other than that the formula would be the same.

Now let’s give a name and notation to the first vector in the dot product since this vector will show up fairly regularly throughout this course (and in other courses).  The gradient of $f$ or gradient vector of $f$ is defined to be,

$$\nabla f = \langle f_x, f_y, f_z \rangle \qquad \text{or} \qquad \nabla f = \langle f_x, f_y \rangle$$

Or, if we want to use the standard basis vectors the gradient is,

$$\nabla f = f_x\,\vec i + f_y\,\vec j + f_z\,\vec k \qquad \text{or} \qquad \nabla f = f_x\,\vec i + f_y\,\vec j$$

The definition is only shown for functions of two or three variables; however, there is a natural extension to functions of any number of variables that we’d like.

With the definition of the gradient we can now say that the directional derivative is given by,

$$D_{\vec u} f = \nabla f \cdot \vec u$$

where we will no longer show the variable and use this formula for any number of variables.  Note as well that we will sometimes use the following notation,

$$D_{\vec u} f(\vec x) = \nabla f(\vec x) \cdot \vec u$$

where $\vec x = \langle x, y, z \rangle$ or $\vec x = \langle x, y \rangle$ as needed.  This notation will be used when we want to note the variables in some way, but don’t really want to restrict ourselves to a particular number of variables.  In other words, $\vec x$ will be used to represent as many variables as we need in the formula and we will most often use this notation when we are already using vectors or vector notation in the problem/formula.

Let’s work a couple of examples using this formula of the directional derivative.

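Example 2  Find each of the directional derivatives.

(a) $D_{\vec u} f(\vec x)$ for $f(x,y) = x\cos(y)$ in the direction of $\vec v = \langle 2, 1 \rangle$.

(b) $D_{\vec u} f(\vec x)$ for $f(x,y,z) = \sin(yz) + \ln(x^2)$ at $(1, 1, \pi)$ in the direction of $\vec v = \langle 1, 1, -1 \rangle$.

Solution

(a) Let’s first compute the gradient for this function.

$$\nabla f = \langle \cos(y), -x\sin(y) \rangle$$

Also, as we saw earlier in this section the unit vector for this direction is,

$$\vec u = \left\langle \frac{2}{\sqrt 5}, \frac{1}{\sqrt 5} \right\rangle$$

The directional derivative is then,

$$D_{\vec u} f(\vec x) = \langle \cos(y), -x\sin(y) \rangle \cdot \left\langle \frac{2}{\sqrt 5}, \frac{1}{\sqrt 5} \right\rangle = \frac{1}{\sqrt 5}\left(2\cos(y) - x\sin(y)\right)$$

(b) In this case we are asking for the directional derivative at a particular point.  To do this we will first compute the gradient, evaluate it at the point in question and then do the dot product.  So, let’s get the gradient.

$$\nabla f(x,y,z) = \left\langle \frac{2}{x}, z\cos(yz), y\cos(yz) \right\rangle \qquad \nabla f(1,1,\pi) = \langle 2, \pi\cos(\pi), \cos(\pi) \rangle = \langle 2, -\pi, -1 \rangle$$

Next, we need the unit vector for the direction,

$$\|\vec v\| = \sqrt 3 \qquad \vec u = \left\langle \frac{1}{\sqrt 3}, \frac{1}{\sqrt 3}, -\frac{1}{\sqrt 3} \right\rangle$$

Finally, the directional derivative at the point in question is,

$$D_{\vec u} f(1,1,\pi) = \langle 2, -\pi, -1 \rangle \cdot \left\langle \frac{1}{\sqrt 3}, \frac{1}{\sqrt 3}, -\frac{1}{\sqrt 3} \right\rangle = \frac{1}{\sqrt 3}\left(2 - \pi + 1\right) = \frac{3 - \pi}{\sqrt 3}$$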
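The gradient form of the formula works for any number of variables, and that is easy to mirror in code.  Below is a rough sketch that estimates the gradient with central differences and then takes the dot product with the unit vector.  The helper names `gradient` and `directional_derivative` are hypothetical, and the test function $\sin(yz) + \ln(x^2)$ at $(1, 1, \pi)$ with direction $\langle 1, 1, -1 \rangle$ is used here as an assumed sample input.

```python
import math

def gradient(f, point, h=1e-6):
    """Central-difference estimate of the gradient of f at point (any number of variables)."""
    grad = []
    for i in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[i] += h
        backward[i] -= h
        grad.append((f(*forward) - f(*backward)) / (2 * h))
    return grad

def directional_derivative(f, point, v):
    """Dot product of the estimated gradient with the unit vector in the direction of v."""
    length = math.sqrt(sum(c * c for c in v))
    u = [c / length for c in v]  # normalize so the direction is a unit vector
    return sum(g * c for g, c in zip(gradient(f, point), u))

def f(x, y, z):
    # Sample function of three variables (an assumption of this sketch).
    return math.sin(y * z) + math.log(x ** 2)

value = directional_derivative(f, (1.0, 1.0, math.pi), (1, 1, -1))
```

Working the same input by hand gives $\frac{3 - \pi}{\sqrt 3}$, so `value` should land close to that number.  The finite difference step `h` is a design choice of this sketch, not part of the formula.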

Before proceeding let’s note that the first order partial derivatives that we were looking at in the majority of the section can be thought of as special cases of the directional derivatives.  For instance, $f_x$ can be thought of as the directional derivative of $f$ in the direction of $\vec u = \langle 1, 0 \rangle$ or $\vec u = \langle 1, 0, 0 \rangle$, depending on the number of variables that we’re working with.  The same can be done for $f_y$ and $f_z$.

We will close out this section with a couple of nice facts about the gradient vector.  The first tells us how to determine the maximum rate of change of a function at a point and the direction that we need to move in order to achieve that maximum rate of change.

Theorem

The maximum value of $D_{\vec u} f(\vec x)$ (and hence the maximum rate of change of the function $f(\vec x)$) is given by $\|\nabla f(\vec x)\|$ and will occur in the direction given by $\nabla f(\vec x)$.

Proof

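This is a really simple proof.  First, if we start with the dot product form $D_{\vec u} f(\vec x)$ and use a nice fact about dot products as well as the fact that $\vec u$ is a unit vector we get,

$$D_{\vec u} f = \nabla f \cdot \vec u = \|\nabla f\|\,\|\vec u\| \cos\theta = \|\nabla f\| \cos\theta$$

where $\theta$ is the angle between the gradient and $\vec u$.

Now the largest possible value of $\cos\theta$ is 1 which occurs at $\theta = 0$.  Therefore the maximum value of $D_{\vec u} f(\vec x)$ is $\|\nabla f(\vec x)\|$.  Also, the maximum value occurs when the angle between the gradient and $\vec u$ is zero, or in other words when $\vec u$ is pointing in the same direction as the gradient, $\nabla f(\vec x)$.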
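One way to get a feel for this theorem is to try a lot of directions by brute force.  The sketch below sweeps the unit vector $\langle \cos t, \sin t \rangle$ around the circle and records where $\nabla f \cdot \vec u$ is largest.  The function $f(x,y) = x^2 + 3xy$ and the point $(1, 2)$ are hypothetical choices made only for this illustration.

```python
import math

def grad_f(x, y):
    # Gradient of the hypothetical sample function f(x, y) = x^2 + 3xy.
    return (2 * x + 3 * y, 3 * x)

gx, gy = grad_f(1.0, 2.0)  # gradient at the point (1, 2)

# Try 3600 evenly spaced directions and keep the one with the largest directional derivative.
best_value, best_theta = max(
    (gx * math.cos(t) + gy * math.sin(t), t)
    for t in (2 * math.pi * k / 3600 for k in range(3600))
)

grad_norm = math.hypot(gx, gy)   # the maximum rate of change predicted by the theorem
grad_angle = math.atan2(gy, gx)  # the angle of the gradient vector itself
```

Up to the resolution of the sweep, `best_value` should agree with `grad_norm` and `best_theta` should agree with `grad_angle`.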

Let’s take a quick look at an example.

Example 3  Suppose that the height of a hill above sea level is given by $z = 1000 - 0.01x^2 - 0.02y^2$.  If you are at the point $(60, 100)$ in what direction is the elevation changing fastest?  What is the maximum rate of change of the elevation at this point?

Solution

First, you will hopefully recall from the Quadric Surfaces section that this is an elliptic paraboloid that opens downward.  So even though most hills aren’t this symmetrical it will at least be vaguely hill shaped and so the question makes at least a little sense.

Now on to the problem.  There are a couple of questions to answer here, but using the theorem makes answering them very simple.  We’ll first need the gradient vector.

$$\nabla f(x,y) = \langle -0.02x, -0.04y \rangle$$

The maximum rate of change of the elevation will then occur in the direction of

$$\nabla f(60, 100) = \langle -1.2, -4 \rangle$$

The maximum rate of change of the elevation at this point is,

$$\|\nabla f(60, 100)\| = \sqrt{(-1.2)^2 + (-4)^2} = \sqrt{17.44} \approx 4.176$$

Before leaving this example let’s note that we’re at the point $(60, 100)$ and the direction of greatest rate of change of the elevation at this point is given by the vector $\langle -1.2, -4 \rangle$.  Since both of the components are negative it looks like the direction of maximum rate of change points up the hill towards the center rather than away from the hill.
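The same computation takes only a few lines of Python.  This sketch assumes the hill is modeled by $z = 1000 - 0.01x^2 - 0.02y^2$ and that we are standing at $(60, 100)$; the function name `hill_gradient` is made up for the illustration.

```python
import math

def hill_gradient(x, y):
    """Gradient of the assumed height function z = 1000 - 0.01 x^2 - 0.02 y^2."""
    return (-0.02 * x, -0.04 * y)

g = hill_gradient(60, 100)   # direction in which the elevation changes fastest
steepest = math.hypot(*g)    # maximum rate of change of the elevation

# Unit vector pointing in the direction of fastest increase.
uphill = tuple(c / steepest for c in g)
```

Both components of `uphill` come out negative, which matches the observation that the steepest way up heads back toward the center of the hill.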

The second fact about the gradient vector that we need to give in this section will be very convenient in some later sections.

Fact

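The gradient vector $\nabla f(x_0, y_0)$ is orthogonal (or perpendicular) to the level curve $f(x,y) = k$ at the point $(x_0, y_0)$.  Likewise, the gradient vector $\nabla f(x_0, y_0, z_0)$ is orthogonal to the level surface $f(x,y,z) = k$ at the point $(x_0, y_0, z_0)$.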

Proof

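We’re going to do the proof for the $\mathbb{R}^3$ case.  The proof for the $\mathbb{R}^2$ case is identical.  We’ll also get some notation out of the way to make life easier for us.  Let $S$ be the level surface given by $f(x,y,z) = k$ and let $P = (x_0, y_0, z_0)$.  Note as well that $P$ will be on $S$.

Now, let $C$ be any curve on $S$ that contains $P$.  Let $\vec r(t) = \langle x(t), y(t), z(t) \rangle$ be the vector equation for $C$ and let $t_0$ be the value of $t$ such that $\vec r(t_0) = \langle x_0, y_0, z_0 \rangle$.  In other words, $t_0$ is the value of $t$ that gives $P$.

Because $C$ lies on $S$ we know that points on $C$ must satisfy the equation for $S$.  Or,

$$f(x(t), y(t), z(t)) = k$$

Next, let’s use the Chain Rule on this to get,

$$\frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} + \frac{\partial f}{\partial z}\frac{dz}{dt} = 0 \tag{4}$$

Notice that $\nabla f = \langle f_x, f_y, f_z \rangle$ and $\vec r\,'(t) = \langle x'(t), y'(t), z'(t) \rangle$ so (4) becomes,

$$\nabla f \cdot \vec r\,'(t) = 0$$

At $t = t_0$ this is,

$$\nabla f(x_0, y_0, z_0) \cdot \vec r\,'(t_0) = 0$$

This then tells us that the gradient vector at $P$, $\nabla f(x_0, y_0, z_0)$, is orthogonal to the tangent vector, $\vec r\,'(t_0)$, to any curve $C$ that passes through $P$ and lies on the surface $S$, and so must also be orthogonal to the surface $S$.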
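Here is a small numerical illustration of this fact.  The sketch takes the sample function $f(x,y,z) = x^2 + 2y^2 + 3z^2$, puts a wiggly curve on its level surface $f = 6$ (an ellipsoid), and checks that the gradient is orthogonal to the curve’s tangent vector.  The function, the curve and the value $t_0 = 0.7$ are all assumptions chosen for this example, and the tangent vector is only estimated with a central difference.

```python
import math

def f(x, y, z):
    # Sample function whose level surfaces are ellipsoids.
    return x ** 2 + 2 * y ** 2 + 3 * z ** 2

def grad_f(x, y, z):
    """Gradient of the sample function f(x, y, z) = x^2 + 2y^2 + 3z^2."""
    return (2 * x, 4 * y, 6 * z)

def r(t):
    """A curve lying on the level surface f = 6."""
    phi = 1.0 + 0.5 * math.sin(2 * t)
    return (
        math.sqrt(6) * math.cos(t) * math.sin(phi),
        math.sqrt(3) * math.sin(t) * math.sin(phi),
        math.sqrt(2) * math.cos(phi),
    )

def r_prime(t, h=1e-6):
    """Central-difference estimate of the tangent vector r'(t)."""
    ahead, behind = r(t + h), r(t - h)
    return tuple((p - q) / (2 * h) for p, q in zip(ahead, behind))

t0 = 0.7
point = r(t0)  # the point P on the surface
dot = sum(g * d for g, d in zip(grad_f(*point), r_prime(t0)))
```

The value of `dot` should be zero up to the error in the finite difference, and the same should hold for any other smooth curve on the surface through the same point.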

As we will be seeing in later sections we are often going to be needing vectors that are orthogonal to a surface or curve and using this fact we will know that all we need to do is compute a gradient vector and we will get the orthogonal vector that we need.  We will see the first application of this in the next chapter.


 © 2003 - 2018 Paul Dawkins