We’ve been using the standard chain rule for functions of
one variable throughout the last couple of sections. It’s now time to extend the chain rule out to
more complicated situations. Before we
actually do that let’s first review the notation for the chain rule for
functions of one variable.
The notation that’s probably familiar to most people is the following.
\[F(x) = f(g(x)) \qquad F'(x) = f'(g(x))\,g'(x)\]
There is an alternate notation, however, that while probably
not used much in Calculus I is more convenient at this point because it will
match up with the notation that we are going to be using in this section. Here it is.
\[\text{If } y = f(x) \text{ and } x = g(t) \text{ then } \frac{dy}{dt} = \frac{dy}{dx}\,\frac{dx}{dt}\]
Notice that the derivative really does make sense here since if we were
to plug in for x then y really would be a function of t.
One way to remember this form of the chain rule is to note that if we
think of the two derivatives on the right side as fractions the dx’s will cancel to get the same
derivative on both sides.
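As a quick sanity check on this form of the chain rule, here is a minimal numerical sketch. The functions y = sin(x) and x = t^2 are hypothetical choices made only for illustration, and the derivatives are approximated with central differences rather than computed exactly.

```python
import math

def derivative(func, t, step=1e-6):
    """Central-difference approximation of func'(t)."""
    return (func(t + step) - func(t - step)) / (2 * step)

# Hypothetical choices, not from the notes: y = f(x) = sin(x), x = g(t) = t**2.
def f(x):
    return math.sin(x)

def g(t):
    return t ** 2

t0 = 1.3
x0 = g(t0)

# Chain rule: dy/dt = (dy/dx) * (dx/dt), with dy/dx evaluated at x = g(t0).
chain = derivative(f, x0) * derivative(g, t0)

# Direct route: plug in for x first, then differentiate with respect to t.
direct = derivative(lambda t: f(g(t)), t0)

# Exact derivative for this particular choice: cos(t**2) * 2t.
exact = math.cos(t0 ** 2) * 2 * t0

print(chain, direct, exact)
```

All three numbers should agree to several decimal places, which is just the statement that the two routes to the derivative give the same result.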
Okay, now that we’ve got that out of the way let’s move into
the more complicated chain rules that we are liable to run across.
As with many topics in multivariable calculus, there are in
fact many different formulas depending upon the number of variables that we’re
dealing with. So, let’s start this
discussion off with a function of two variables, $z = f(x, y)$. From this point there are still many
different possibilities that we can look at.
We will be looking at two distinct cases before generalizing.
Case 1 : $z = f(x, y)$, $x = g(t)$, $y = h(t)$
and compute $\frac{dz}{dt}$.
This case is analogous to the standard chain rule from
Calculus I that we looked at above. In
this case we are going to compute an ordinary derivative since z really would be a function of t only if we were to substitute in for x and y.
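The chain rule for this case is,
\[\frac{dz}{dt} = \frac{\partial f}{\partial x}\,\frac{dx}{dt} + \frac{\partial f}{\partial y}\,\frac{dy}{dt}\]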
So, basically what we’re doing here is differentiating f with respect to each variable in it
and then multiplying each of these by the derivative of that variable with
respect to t. The final step is to then add all this up.
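The recipe just described (differentiate f with respect to each variable, multiply by that variable's derivative with respect to t, then add) can be sketched numerically as follows. The functions f, g, and h below are hypothetical choices for illustration only, and the derivatives are central-difference approximations.

```python
import math

def derivative(func, t, step=1e-6):
    """Central-difference approximation of an ordinary derivative."""
    return (func(t + step) - func(t - step)) / (2 * step)

def partial(func, point, index, step=1e-6):
    """Central-difference approximation of the partial derivative of func
    with respect to the variable in position `index`, evaluated at `point`."""
    up, down = list(point), list(point)
    up[index] += step
    down[index] -= step
    return (func(*up) - func(*down)) / (2 * step)

# Hypothetical functions, not taken from the notes.
def f(x, y):          # z = f(x, y)
    return x ** 2 * y

def g(t):             # x = g(t)
    return math.cos(t)

def h(t):             # y = h(t)
    return t ** 3

t0 = 0.7
point = (g(t0), h(t0))

# dz/dt = (df/dx)(dx/dt) + (df/dy)(dy/dt)
dz_dt = (partial(f, point, 0) * derivative(g, t0)
         + partial(f, point, 1) * derivative(h, t0))

# Cross-check: substitute in for x and y first, then differentiate.
dz_dt_direct = derivative(lambda t: f(g(t), h(t)), t0)

print(dz_dt, dz_dt_direct)
```

The two printed values should match closely, since substituting first and using the chain rule are two ways of computing the same ordinary derivative.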
Let’s take a look at a couple of examples.
Example 1 Compute $\frac{dz}{dt}$
for each of the following.
There really isn’t all that much to do here other than
using the formula.
So, technically we’ve computed the derivative. However, we should probably go ahead and
substitute in for x and y as well at this point since we’ve
already got t’s in the
derivative. Doing this gives,
Note that in this case it might actually have been easier
to just substitute in for x and y in the original function and just
compute the derivative as we normally would.
For comparison's sake let’s do that.
The same result for less work. Note however, that often it will actually
be more work to do the substitution first.
Okay, in this case it would almost definitely be more work
to do the substitution first so we’ll use the chain rule first and then substitute.
Note that sometimes, because of the significant mess of
the final answer, we will only simplify the first step a little and leave the
answer in terms of x, y, and t. This depends upon
the situation, class, and instructor, however, and for this class we will pretty
much always be substituting in for x and y.
Now, there is a special case that we should take a quick
look at before moving on to the next case.
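Let’s suppose that we have the following situation,
\[z = f(x, y) \qquad y = g(x)\]
In this case the chain rule for $\frac{dz}{dx}$ becomes,
\[\frac{dz}{dx} = \frac{\partial f}{\partial x}\,\frac{dx}{dx} + \frac{\partial f}{\partial y}\,\frac{dy}{dx} = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\,\frac{dy}{dx}\]
In the first term we are using the fact that,
\[\frac{dx}{dx} = \frac{d}{dx}(x) = 1\]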
Let’s take a quick look at an example.
Now let’s take a look at the second case.
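Case 2 : $z = f(x, y)$, $x = g(s, t)$, $y = h(s, t)$
and compute $\frac{\partial z}{\partial s}$ and $\frac{\partial z}{\partial t}$.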
In this case if we were to substitute in for x and y we would get that z is
a function of s and t and so it makes sense that we would be
computing partial derivatives here and that there would be two of them.
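Here is the chain rule for both of these derivatives.
\[\frac{\partial z}{\partial s} = \frac{\partial f}{\partial x}\,\frac{\partial x}{\partial s} + \frac{\partial f}{\partial y}\,\frac{\partial y}{\partial s} \qquad \frac{\partial z}{\partial t} = \frac{\partial f}{\partial x}\,\frac{\partial x}{\partial t} + \frac{\partial f}{\partial y}\,\frac{\partial y}{\partial t}\]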
So, not surprisingly, these are very similar to the first
case that we looked at. Here is a quick
example of this kind of chain rule.
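To see the two partial derivatives of this case computed side by side, here is a minimal numerical sketch. The functions f, g, and h are hypothetical choices for illustration only, and every derivative is a central-difference approximation.

```python
def partial(func, point, index, step=1e-6):
    """Central-difference approximation of the partial derivative of func
    with respect to the variable in position `index`, evaluated at `point`."""
    up, down = list(point), list(point)
    up[index] += step
    down[index] -= step
    return (func(*up) - func(*down)) / (2 * step)

# Hypothetical functions, not taken from the notes.
def f(x, y):          # z = f(x, y)
    return x * y + y ** 2

def g(s, t):          # x = g(s, t)
    return s * t

def h(s, t):          # y = h(s, t)
    return s + t ** 2

inputs = (0.5, 2.0)               # the point (s, t)
point = (g(*inputs), h(*inputs))  # the corresponding point (x, y)

f_x = partial(f, point, 0)
f_y = partial(f, point, 1)

# dz/ds = f_x * dx/ds + f_y * dy/ds
dz_ds = f_x * partial(g, inputs, 0) + f_y * partial(h, inputs, 0)

# dz/dt = f_x * dx/dt + f_y * dy/dt
dz_dt = f_x * partial(g, inputs, 1) + f_y * partial(h, inputs, 1)

print(dz_ds, dz_dt)
```

For these particular functions, working the derivatives by hand at (s, t) = (0.5, 2) gives 19 and 42.25, so the printed values should be very close to those numbers.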
Okay, now that we’ve seen a couple of cases for the chain
rule let’s see the general version of the chain rule.
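Suppose that z is
a function of n variables, $x_1, x_2, \ldots, x_n$,
and that each of these variables is in turn a function of m variables, $t_1, t_2, \ldots, t_m$. Then for any variable $t_i$, $i = 1, 2, \ldots, m$,
we have the following,
\[\frac{\partial z}{\partial t_i} = \frac{\partial z}{\partial x_1}\frac{\partial x_1}{\partial t_i} + \frac{\partial z}{\partial x_2}\frac{\partial x_2}{\partial t_i} + \cdots + \frac{\partial z}{\partial x_n}\frac{\partial x_n}{\partial t_i}\]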
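The general rule says that the derivative of z with respect to t_i is the sum, over every intermediate variable x_j, of (the partial of z with respect to x_j) times (the partial of x_j with respect to t_i). Here is a minimal sketch of that sum as a loop. The helper name `chain_rule` and the sample functions are hypothetical, and the partial derivatives are central-difference approximations.

```python
def partial(func, point, index, step=1e-6):
    """Central-difference approximation of the partial derivative of func
    with respect to the variable in position `index`, evaluated at `point`."""
    up, down = list(point), list(point)
    up[index] += step
    down[index] -= step
    return (func(*up) - func(*down)) / (2 * step)

def chain_rule(f, inner, t_point, i):
    """Approximate dz/dt_i = sum over j of (dz/dx_j) * (dx_j/dt_i).

    f       : function of n variables x_1, ..., x_n
    inner   : list of n functions, each taking the m variables t_1, ..., t_m
    t_point : the point (t_1, ..., t_m) at which to evaluate
    i       : zero-based position of the variable t_i
    """
    x_point = [x_j(*t_point) for x_j in inner]
    return sum(partial(f, x_point, j) * partial(inner[j], t_point, i)
               for j in range(len(inner)))

# Hypothetical example with n = 3 intermediate variables and m = 2 inputs.
def f(x1, x2, x3):
    return x1 * x2 + x3 ** 2

inner = [
    lambda t1, t2: t1 + t2,      # x1
    lambda t1, t2: t1 * t2,      # x2
    lambda t1, t2: t1 - 2 * t2,  # x3
]

dz_dt1 = chain_rule(f, inner, (1.0, 2.0), 0)
dz_dt2 = chain_rule(f, inner, (1.0, 2.0), 1)
print(dz_dt1, dz_dt2)
```

For this choice of functions the hand computation at (t_1, t_2) = (1, 2) gives 2 and 17, so the printed values should be close to those.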
Wow. That’s a lot to
remember. There is actually an easier
way to construct all the chain rules that we’ve discussed in the section or
will look at in later examples. We can
build up a tree diagram that will
give us the chain rule for any situation.
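To see how these work let’s go back and take a look at the chain rule
for $\frac{\partial z}{\partial s}$ given that $z = f(x, y)$,
$x = g(s, t)$, $y = h(s, t)$. We already know what this is, but it may help
to illustrate the tree diagram if we already know the answer. For reference here is the chain rule for this case.
\[\frac{\partial z}{\partial s} = \frac{\partial f}{\partial x}\,\frac{\partial x}{\partial s} + \frac{\partial f}{\partial y}\,\frac{\partial y}{\partial s}\]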
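Here is the tree diagram for this case, written as an indented outline with the derivative that each branch represents.
    z
        x    (branch from z to x: $\frac{\partial z}{\partial x}$)
            s    (branch from x to s: $\frac{\partial x}{\partial s}$)
            t    (branch from x to t: $\frac{\partial x}{\partial t}$)
        y    (branch from z to y: $\frac{\partial z}{\partial y}$)
            s    (branch from y to s: $\frac{\partial y}{\partial s}$)
            t    (branch from y to t: $\frac{\partial y}{\partial t}$)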
We start at the top with the function itself and then branch
out from that point. The first set of
branches is for the variables in the function.
From each of these endpoints we put down a further set of branches that
gives the variables that both x and y are a function of. We connect each letter with a line and each
line represents a partial derivative as shown.
Note that the letter in the numerator of the partial derivative is the
upper “node” of the tree and the letter in the denominator of the partial
derivative is the lower “node” of the tree.
To use this to get the chain rule we start at the bottom and
for each branch that ends with the variable we want to take the derivative with
respect to (s in this case) we move
up the tree until we hit the top multiplying the derivatives that we see along
that set of branches. Once we’ve done
this for each branch that ends at s,
we then add the results up to get the chain rule for that given situation.
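The procedure just described is mechanical enough to sketch in code. In this minimal sketch the tree is a dictionary mapping each variable to the variables it depends on; the function name `chain_rule_terms` and the plain `d` used for every derivative are assumptions made for illustration. The code walks from the top of the tree down rather than from the bottom up, which visits the same branches and so produces the same products.

```python
def chain_rule_terms(tree, root, target):
    """List one product of derivatives for every path from `root`
    down to a leaf named `target`.

    tree maps a variable name to the list of variables it depends on;
    leaves (such as s and t) do not appear as keys.
    """
    terms = []

    def walk(node, factors):
        for child in tree.get(node, []):
            # Each branch is a derivative: upper node over lower node.
            factor = f"d{node}/d{child}"
            if child == target:
                terms.append(" * ".join(factors + [factor]))
            else:
                walk(child, factors + [factor])

    walk(root, [])
    return terms

# Tree for z = f(x, y), x = g(s, t), y = h(s, t).
tree = {"z": ["x", "y"], "x": ["s", "t"], "y": ["s", "t"]}

terms_s = chain_rule_terms(tree, "z", "s")
formula_s = "dz/ds = " + " + ".join(terms_s)
print(formula_s)
```

Changing the last argument to "t" gives the chain rule for the other partial derivative, and adding more entries to the dictionary handles deeper or wider trees in the same way.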
Note that we don’t usually put the derivatives in the
tree. They are always an assumed part of the tree.
Let’s write down some chain rules.
Example 4 Use
a tree diagram to write down the chain rule for the given derivatives.
(a) for ,
(b) for ,
(a) for ,
So, we’ll first need the tree diagram so let’s get that.
From this it looks like the chain rule for this case is,
which is really just a natural extension to the two
variable case that we saw above.
(b) for ,
Here is the tree diagram for this situation.
From this it looks like the derivative will be,
So, provided we can write down the tree diagram, and these
aren’t usually too bad to write down, we can do the chain rule for any set up
that we might run across.
We’ve now seen how to take first derivatives of these more
complicated situations, but what about higher order derivatives? How do we do those? It’s probably easiest to see how to deal with
these with an example.
The final topic in this section is a revisiting of implicit
differentiation. With these forms of the
chain rule implicit differentiation actually becomes a fairly simple
process. Let’s start out with the implicit differentiation that we saw in a
Calculus I course.
We will start with a function in the form $F(x, y) = 0$ (if it’s not in this form simply move
everything to one side of the equal sign to get it into this form) where $y = y(x)$. In a Calculus I course we were then asked to
compute $\frac{dy}{dx}$ and this was often a fairly messy
process. Using the chain rule from this
section however we can get a nice simple formula for doing this. We’ll start by differentiating both sides
with respect to x. This will mean using the chain rule on the
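left side and the right side will, of course, differentiate to zero. Here are the results of that.
\[F_x + F_y\,\frac{dy}{dx} = 0 \qquad \Rightarrow \qquad \frac{dy}{dx} = -\frac{F_x}{F_y}\]
As shown, all we need to do next is solve for $\frac{dy}{dx}$ and we’ve now got a very nice formula to use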
for implicit differentiation. Note as
well that in order to simplify the formula we switched back to using the
subscript notation for the derivatives.
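Here is a minimal numerical sketch of the formula dy/dx = -F_x / F_y. The curve x^3 + y^3 - 6xy = 0 and the point (3, 3) on it are hypothetical choices for illustration only, and the partial derivatives are central-difference approximations.

```python
def partial(func, point, index, step=1e-6):
    """Central-difference approximation of the partial derivative of func
    with respect to the variable in position `index`, evaluated at `point`."""
    up, down = list(point), list(point)
    up[index] += step
    down[index] -= step
    return (func(*up) - func(*down)) / (2 * step)

# Hypothetical curve, not taken from the notes: F(x, y) = 0 with
def F(x, y):
    return x ** 3 + y ** 3 - 6 * x * y

point = (3.0, 3.0)   # F(3, 3) = 27 + 27 - 54 = 0, so the point is on the curve

F_x = partial(F, point, 0)
F_y = partial(F, point, 1)

# Implicit differentiation formula: dy/dx = -F_x / F_y (requires F_y != 0).
dy_dx = -F_x / F_y
print(dy_dx)
```

By hand, F_x = 3x^2 - 6y and F_y = 3y^2 - 6x are both 9 at (3, 3), so the slope there is -1 and the printed value should be very close to that. Note that the formula only makes sense at points where F_y is not zero.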
Let’s check out a quick example.
Example 6 Find $\frac{dy}{dx}$.
The first step is to get a zero on one side of the equal
sign and that’s easy enough to do.
Now, the function on the left is $F(x, y)$ in our formula so all we need to do is use
the formula to find the derivative.
There we go. It
would have taken much longer to do this using the old Calculus I way of doing this.
We can also do something similar to handle the types of
implicit differentiation problems involving partial derivatives like those we
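saw when we first introduced partial derivatives. In these cases we will start off with a
function in the form $F(x, y, z) = 0$ and assume that $z = f(x, y)$ and we want to find $\frac{\partial z}{\partial x}$ and/or $\frac{\partial z}{\partial y}$.
Let’s start by trying to find $\frac{\partial z}{\partial x}$. We will differentiate both sides with respect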
to x and we’ll need to remember that
we’re going to be treating y as a
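constant. Also, the left side will
require the chain rule. Here is this derivative.
\[\frac{\partial F}{\partial x}\,\frac{\partial x}{\partial x} + \frac{\partial F}{\partial y}\,\frac{\partial y}{\partial x} + \frac{\partial F}{\partial z}\,\frac{\partial z}{\partial x} = 0\]
Now, we have the following,
\[\frac{\partial x}{\partial x} = 1 \qquad \text{and} \qquad \frac{\partial y}{\partial x} = 0\]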
The first is because we are just differentiating x with respect to x and we know that is 1. The
second is because we are treating the y
as a constant and so it will differentiate to zero.
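Plugging these in and solving for $\frac{\partial z}{\partial x}$ gives,
\[\frac{\partial z}{\partial x} = -\frac{F_x}{F_z}\]
A similar argument can be used to show that,
\[\frac{\partial z}{\partial y} = -\frac{F_y}{F_z}\]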
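The two formulas, the partial of z with respect to x equal to -F_x / F_z and the partial of z with respect to y equal to -F_y / F_z, can be checked numerically. In this minimal sketch the surface x^2 + y^2 + z^2 - 9 = 0 and the point (1, 2, 2) are hypothetical choices for illustration only; on the upper half of that surface z can also be written explicitly, which gives us something to compare against.

```python
import math

def partial(func, point, index, step=1e-6):
    """Central-difference approximation of the partial derivative of func
    with respect to the variable in position `index`, evaluated at `point`."""
    up, down = list(point), list(point)
    up[index] += step
    down[index] -= step
    return (func(*up) - func(*down)) / (2 * step)

# Hypothetical surface, not taken from the notes: F(x, y, z) = 0 with
def F(x, y, z):
    return x ** 2 + y ** 2 + z ** 2 - 9

point = (1.0, 2.0, 2.0)   # 1 + 4 + 4 - 9 = 0, so the point is on the surface

F_x = partial(F, point, 0)
F_y = partial(F, point, 1)
F_z = partial(F, point, 2)

# Implicit formulas (both require F_z != 0).
dz_dx = -F_x / F_z
dz_dy = -F_y / F_z

# Cross-check against the explicit upper half of the surface.
def z_explicit(x, y):
    return math.sqrt(9 - x ** 2 - y ** 2)

dz_dx_explicit = partial(z_explicit, (1.0, 2.0), 0)
dz_dy_explicit = partial(z_explicit, (1.0, 2.0), 1)

print(dz_dx, dz_dx_explicit)
print(dz_dy, dz_dy_explicit)
```

By hand the formulas give -x/z = -0.5 and -y/z = -1 at this point, and the implicit and explicit values on each printed line should agree closely.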
As with the one variable case we switched to the
subscripting notation for derivatives to simplify the formulas. Let’s take a quick look at an example of this.
Example 7 Find $\frac{\partial z}{\partial x}$
and $\frac{\partial z}{\partial y}$.
This was one of the functions that we used the old
implicit differentiation on back in the Partial Derivatives
section. You might want to go back and
see the difference between the two.
First let’s get everything on one side.
Now, the function on the left is $F(x, y, z)$ and so all that we need to do is use the
formulas developed above to find the derivatives.
If you go back and compare these answers to those that we
found the first time around you will notice that they might appear to be
different. However, if you take into
account the minus sign that sits in the front of our answers here you will
see that they are in fact the same.