Paul's Online Math Notes
Online Notes / Calculus III (Notes) / Partial Derivatives / Chain Rule


 Chain Rule

We’ve been using the standard chain rule for functions of one variable throughout the last couple of sections.  It’s now time to extend the chain rule out to more complicated situations.  Before we actually do that let’s first review the notation for the chain rule for functions of one variable.


The notation that’s probably familiar to most people is the following.

\[F(x) = f\left(g(x)\right) \qquad F'(x) = f'\left(g(x)\right)\,g'(x)\]

There is an alternate notation, however, that, while probably not used much in Calculus I, is more convenient at this point because it matches the notation that we are going to be using in this section.  Here it is.

\[\text{If } y = f(x) \text{ and } x = g(t) \text{ then } \frac{dy}{dt} = \frac{dy}{dx}\,\frac{dx}{dt}\]

Notice that the derivative $\frac{dy}{dt}$ really does make sense here since if we were to plug in for x then y really would be a function of t.  One way to remember this form of the chain rule is to note that if we think of the two derivatives on the right side as fractions the dx’s will cancel to get the same derivative on both sides.
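As a quick illustration of this notation, here is a sketch with a hypothetical pair of functions, $y = \sin x$ and $x = t^2$ (chosen only for illustration; they do not come from these notes).

```latex
\[
\frac{dy}{dt} = \frac{dy}{dx}\,\frac{dx}{dt} = (\cos x)(2t) = 2t\cos\left(t^2\right)
\]
```

Substituting first gives $y = \sin\left(t^2\right)$, and differentiating that directly produces the same $2t\cos\left(t^2\right)$.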


Okay, now that we’ve got that out of the way let’s move into the more complicated chain rules that we are liable to run across in this course.


As with many topics in multivariable calculus, there are in fact many different formulas depending upon the number of variables that we’re dealing with.  So, let’s start this discussion off with a function of two variables, $z = f(x, y)$.  From this point there are still many different possibilities that we can look at.  We will be looking at two distinct cases prior to generalizing the whole idea out.


Case 1 : $z = f(x, y)$, $x = g(t)$, $y = h(t)$ and compute $\frac{dz}{dt}$.

This case is analogous to the standard chain rule from Calculus I that we looked at above.  In this case we are going to compute an ordinary derivative since z really would be a function of t only if we were to substitute in for x and y.


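The chain rule for this case is,

\[\frac{dz}{dt} = \frac{\partial f}{\partial x}\,\frac{dx}{dt} + \frac{\partial f}{\partial y}\,\frac{dy}{dt}\]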

So, basically what we’re doing here is differentiating f with respect to each variable in it and then multiplying each of these by the derivative of that variable with respect to t.  The final step is to then add all this up.
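The recipe above can be checked numerically. The following Python sketch uses a hypothetical function $f(x, y) = x^2 y$ with $x = \cos t$ and $y = e^t$ (all chosen for illustration, not taken from these notes) and compares the Case 1 formula with a finite-difference derivative of the substituted function.

```python
import math

def f(x, y):
    # Hypothetical example function z = f(x, y) = x^2 * y (an assumption).
    return x**2 * y

def x_of_t(t):
    return math.cos(t)

def y_of_t(t):
    return math.exp(t)

def dz_dt_chain(t):
    """dz/dt from the Case 1 formula: f_x * dx/dt + f_y * dy/dt."""
    x, y = x_of_t(t), y_of_t(t)
    f_x = 2 * x * y          # partial of x^2 * y with respect to x
    f_y = x**2               # partial of x^2 * y with respect to y
    dx_dt = -math.sin(t)
    dy_dt = math.exp(t)
    return f_x * dx_dt + f_y * dy_dt

def dz_dt_numeric(t, h=1e-6):
    """Central difference of z(t) = f(x(t), y(t)), i.e. substitute first."""
    z = lambda u: f(x_of_t(u), y_of_t(u))
    return (z(t + h) - z(t - h)) / (2 * h)

if __name__ == "__main__":
    t0 = 0.7
    # The two values should agree to several decimal places.
    print(dz_dt_chain(t0), dz_dt_numeric(t0))
```

The finite difference is only an approximation, so the two numbers agree up to a small numerical error rather than exactly.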


Let’s take a look at a couple of examples.


Example 1  Compute $\frac{dz}{dt}$ for each of the following.

(a) , ,

(b) , ,



(a) , ,  


There really isn’t all that much to do here other than using the formula.


So, technically we’ve computed the derivative.  However, we should probably go ahead and substitute in for x and y as well at this point since we’ve already got t’s in the derivative.  Doing this gives,




Note that in this case it might actually have been easier to just substitute in for x and y in the original function and just compute the derivative as we normally would.  For comparison's sake let’s do that.



The same result for less work.  Note however, that often it will actually be more work to do the substitution first.


(b) , ,  


Okay, in this case it would almost definitely be more work to do the substitution first so we’ll use the chain rule first and then substitute.



Note that sometimes, because of the significant mess of the final answer, we will only simplify the first step a little and leave the answer in terms of x, y, and t.  This is dependent upon the situation, class and instructor however and for this class we will pretty much always be substituting in for x and y.



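Now, there is a special case that we should take a quick look at before moving on to the next case.  Let’s suppose that we have the following situation,

\[z = f(x, y) \qquad y = g(x)\]

In this case the chain rule for $\frac{dz}{dx}$ becomes,

\[\frac{dz}{dx} = \frac{\partial f}{\partial x}\,\frac{dx}{dx} + \frac{\partial f}{\partial y}\,\frac{dy}{dx} = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\,\frac{dy}{dx}\]

In the first term we are using the fact that,

\[\frac{dx}{dx} = 1\]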

Let’s take a quick look at an example.
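As a warm-up for this special case, here is a small sketch with a hypothetical function, $z = x^2 y$ with $y = \sin x$ (chosen only for illustration; it is not the function from the example in these notes).

```latex
\begin{align*}
\frac{dz}{dx} &= \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\,\frac{dy}{dx} \\
              &= 2xy + x^2\cos x \\
              &= 2x\sin x + x^2\cos x
\end{align*}
```

Substituting first gives $z = x^2\sin x$, and the product rule yields the same result.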


Example 2  Compute  for ,  


We’ll just plug into the formula.



Now let’s take a look at the second case.


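Case 2 : $z = f(x, y)$, $x = g(s, t)$, $y = h(s, t)$ and compute $\frac{\partial z}{\partial s}$ and $\frac{\partial z}{\partial t}$.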


In this case if we were to substitute in for x and y we would get that z is a function of s and t and so it makes sense that we would be computing partial derivatives here and that there would be two of them.


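Here is the chain rule for both of these derivatives.

\[\frac{\partial z}{\partial s} = \frac{\partial f}{\partial x}\,\frac{\partial x}{\partial s} + \frac{\partial f}{\partial y}\,\frac{\partial y}{\partial s} \qquad \frac{\partial z}{\partial t} = \frac{\partial f}{\partial x}\,\frac{\partial x}{\partial t} + \frac{\partial f}{\partial y}\,\frac{\partial y}{\partial t}\]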

So, not surprisingly, these are very similar to the first case that we looked at.  Here is a quick example of this kind of chain rule.
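Before the worked example, the following Python sketch checks the two Case 2 formulas numerically. The functions $f(x, y) = x y^2$, $x = st$ and $y = s + t$ are hypothetical choices made for illustration only.

```python
def f(x, y):
    # Hypothetical example function z = f(x, y) = x * y^2 (an assumption).
    return x * y**2

def g(s, t):
    # Hypothetical x = g(s, t) = s * t
    return s * t

def h(s, t):
    # Hypothetical y = h(s, t) = s + t
    return s + t

def partials_chain(s, t):
    """(dz/ds, dz/dt) from the Case 2 chain rule formulas."""
    x, y = g(s, t), h(s, t)
    f_x, f_y = y**2, 2 * x * y       # partials of x * y^2
    x_s, x_t = t, s                  # partials of s * t
    y_s, y_t = 1.0, 1.0              # partials of s + t
    return f_x * x_s + f_y * y_s, f_x * x_t + f_y * y_t

def partials_numeric(s, t, eps=1e-6):
    """Central differences of z(s, t) = f(g(s, t), h(s, t))."""
    z = lambda a, b: f(g(a, b), h(a, b))
    dz_ds = (z(s + eps, t) - z(s - eps, t)) / (2 * eps)
    dz_dt = (z(s, t + eps) - z(s, t - eps)) / (2 * eps)
    return dz_ds, dz_dt

if __name__ == "__main__":
    print(partials_chain(1.5, -0.5))
    print(partials_numeric(1.5, -0.5))
```

Each partial derivative holds the other variable fixed, which is why the numerical check perturbs only one of s or t at a time.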


Example 3  Find  and  for , .


Here is the chain rule for .



Now the chain rule for .



Okay, now that we’ve seen a couple of cases for the chain rule let’s see the general version of the chain rule.


Chain Rule

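Suppose that z is a function of n variables, $x_1, x_2, \ldots, x_n$, and that each of these variables is in turn a function of m variables, $t_1, t_2, \ldots, t_m$.  Then for any variable $t_i$, $i = 1, 2, \ldots, m$, we have the following,

\[\frac{\partial z}{\partial t_i} = \frac{\partial z}{\partial x_1}\,\frac{\partial x_1}{\partial t_i} + \frac{\partial z}{\partial x_2}\,\frac{\partial x_2}{\partial t_i} + \cdots + \frac{\partial z}{\partial x_n}\,\frac{\partial x_n}{\partial t_i}\]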

Wow.  That’s a lot to remember.  There is actually an easier way to construct all the chain rules that we’ve discussed in this section or will look at in later examples.  We can build up a tree diagram that will give us the chain rule for any situation.  To see how these work let’s go back and take a look at the chain rule for $\frac{\partial z}{\partial s}$ given that $z = f(x, y)$, $x = g(s, t)$, $y = h(s, t)$.  We already know what this is, but it may help to illustrate the tree diagram if we already know the answer.  For reference here is the chain rule for this case,

\[\frac{\partial z}{\partial s} = \frac{\partial f}{\partial x}\,\frac{\partial x}{\partial s} + \frac{\partial f}{\partial y}\,\frac{\partial y}{\partial s}\]

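Here is the tree diagram for this case.

                z
              /   \
            x       y
           / \     / \
          s   t   s   t

The branches from z carry $\frac{\partial z}{\partial x}$ and $\frac{\partial z}{\partial y}$, the branches from x carry $\frac{\partial x}{\partial s}$ and $\frac{\partial x}{\partial t}$, and the branches from y carry $\frac{\partial y}{\partial s}$ and $\frac{\partial y}{\partial t}$.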

We start at the top with the function itself and then branch out from that point.  The first set of branches is for the variables in the function.  From each of these endpoints we put down a further set of branches that gives the variables that x and y are functions of.  We connect each letter with a line and each line represents a partial derivative as shown.  Note that the letter in the numerator of the partial derivative is the upper “node” of the tree and the letter in the denominator of the partial derivative is the lower “node” of the tree.


To use this to get the chain rule we start at the bottom and for each branch that ends with the variable we want to take the derivative with respect to (s in this case) we move up the tree until we hit the top multiplying the derivatives that we see along that set of branches.  Once we’ve done this for each branch that ends at s, we then add the results up to get the chain rule for that given situation.


Note that we don’t usually put the derivatives in the tree.  They are always an assumed part of the tree.
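The tree procedure is mechanical enough to sketch in code. The following Python sketch (the dictionary layout and the function names are assumptions made for this illustration) stores the tree as a mapping from each node to the variables directly below it, walks every path from the top down to the chosen variable, multiplies the derivatives along each path, and adds the paths up. It writes every derivative with a plain "d", so it does not distinguish partial from ordinary derivatives.

```python
def chain_rule_paths(tree, top, target):
    """Return one list of (upper, lower) branches per path from top to target."""
    paths = []

    def walk(node, branches):
        if node == target:
            paths.append(branches)
            return
        # Nodes with no entry in the tree have no branches below them.
        for child in tree.get(node, []):
            walk(child, branches + [(node, child)])

    walk(top, [])
    return paths

def format_chain_rule(tree, top, target):
    """Build the chain rule as text: a product per path, summed over paths."""
    terms = []
    for path in chain_rule_paths(tree, top, target):
        factors = [f"d{upper}/d{lower}" for upper, lower in path]
        terms.append(" * ".join(factors))
    return f"d{top}/d{target} = " + " + ".join(terms)

if __name__ == "__main__":
    # The tree for z = f(x, y) with x and y both functions of s and t.
    tree = {"z": ["x", "y"], "x": ["s", "t"], "y": ["s", "t"]}
    print(format_chain_rule(tree, "z", "s"))
    # prints: dz/ds = dz/dx * dx/ds + dz/dy * dy/ds
```

The output matches the formula for the derivative with respect to s given above: one product for each branch that ends at s.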


Let’s write down some chain rules.


Example 4  Use a tree diagram to write down the chain rule for the given derivatives.

(a)  for , , , and

(b)  for , , , and    




(a)  for , , , and  


So, we’ll first need the tree diagram so let’s get that.




From this it looks like the chain rule for this case should be,


which is really just a natural extension to the two variable case that we saw above.



(b)  for , , , and  


Here is the tree diagram for this situation.



From this it looks like the derivative will be,




So, provided we can write down the tree diagram, and these aren’t usually too bad to write down, we can do the chain rule for any set up that we might run across.


We’ve now seen how to take first derivatives of these more complicated situations, but what about higher order derivatives?  How do we do those?  It’s probably easiest to see how to deal with these with an example.


Example 5  Compute  for  if  and .


We will need the first derivative before we can even think about finding the second derivative so let’s get that.  This situation falls into the second case that we looked at above so we don’t need a new tree diagram.  Here is the first derivative.



Okay, now we know that the second derivative is,



The issue here is to correctly deal with this derivative.  Since the two first order derivatives, $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$, are both functions of x and y, which are in turn functions of r and $\theta$, both of these terms are products.  So, using the product rule gives the following,



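We now need to determine what $\frac{\partial}{\partial \theta}\left(\frac{\partial f}{\partial x}\right)$ and $\frac{\partial}{\partial \theta}\left(\frac{\partial f}{\partial y}\right)$ will be.  These are both chain rule problems again since both of the derivatives are functions of x and y and we want to take the derivative with respect to $\theta$.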


Before we do these let’s rewrite the first chain rule that we did above a little.




Note that all we’ve done is change the notation for the derivative a little.  With the first chain rule written in this way we can think of (1) as a formula for differentiating any function of x and y with respect to  provided we have  and


This however is exactly what we need to do the two new derivatives we need above.  Both of the first order partial derivatives,  and , are functions of x and y and  and  so we can use (1) to compute these derivatives. 


To do this we’ll simply replace all the f ’s in (1) with the first order partial derivative that we want to differentiate.  At that point all we need to do is a little notational work and we’ll get the formula that we’re after.


Here is the use of (1) to compute .



Here is the computation for .




The final step is to plug these back into the second derivative and do some simplifying.



It’s long and fairly messy but there it is.
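To make the steps of this kind of computation concrete, here is a sketch of the whole calculation under two assumptions made for this illustration: that the change of variables is the usual polar one, $x = r\cos\theta$ and $y = r\sin\theta$, and that the second order partial derivatives of f are continuous so that $f_{xy} = f_{yx}$. Subscripts denote partial derivatives.

```latex
\begin{align*}
\frac{\partial f}{\partial \theta}
  &= -r\sin\theta\, f_x + r\cos\theta\, f_y \\[4pt]
\frac{\partial^2 f}{\partial \theta^2}
  &= -r\cos\theta\, f_x - r\sin\theta\,\frac{\partial}{\partial \theta}\left(f_x\right)
     - r\sin\theta\, f_y + r\cos\theta\,\frac{\partial}{\partial \theta}\left(f_y\right) \\[4pt]
\frac{\partial}{\partial \theta}\left(f_x\right)
  &= -r\sin\theta\, f_{xx} + r\cos\theta\, f_{xy} \\[4pt]
\frac{\partial}{\partial \theta}\left(f_y\right)
  &= -r\sin\theta\, f_{yx} + r\cos\theta\, f_{yy} \\[4pt]
\frac{\partial^2 f}{\partial \theta^2}
  &= -r\cos\theta\, f_x - r\sin\theta\, f_y
     + r^2\sin^2\theta\, f_{xx} - 2r^2\sin\theta\cos\theta\, f_{xy} + r^2\cos^2\theta\, f_{yy}
\end{align*}
```

The third and fourth lines are the first line again with f replaced by $f_x$ and $f_y$ respectively, which is the replacement step described above.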


The final topic in this section is a revisiting of implicit differentiation.  With these forms of the chain rule implicit differentiation actually becomes a fairly simple process.  Let’s start out with the implicit differentiation that we saw in a Calculus I course.


We will start with a function in the form $F(x, y) = 0$ (if it’s not in this form simply move everything to one side of the equal sign to get it into this form) where $y = y(x)$.  In a Calculus I course we were then asked to compute $\frac{dy}{dx}$ and this was often a fairly messy process.  Using the chain rule from this section however we can get a nice simple formula for doing this.  We’ll start by differentiating both sides with respect to x.  This will mean using the chain rule on the left side and the right side will, of course, differentiate to zero.  Here are the results of that.

\[F_x + F_y\,\frac{dy}{dx} = 0 \qquad \Rightarrow \qquad \frac{dy}{dx} = -\frac{F_x}{F_y}\]

As shown, all we need to do next is solve for $\frac{dy}{dx}$ and we’ve now got a very nice formula to use for implicit differentiation.  Note as well that in order to simplify the formula we switched back to using the subscript notation for the derivatives.
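The formula $\frac{dy}{dx} = -\frac{F_x}{F_y}$ is easy to sanity check numerically. The Python sketch below uses the unit circle, $F(x, y) = x^2 + y^2 - 1$, as a hypothetical example (not one from these notes) because its upper half can also be written explicitly as $y = \sqrt{1 - x^2}$.

```python
import math

def F_x(x, y):
    # Partial of the hypothetical F(x, y) = x^2 + y^2 - 1 with respect to x.
    return 2 * x

def F_y(x, y):
    # Partial of the same F with respect to y.
    return 2 * y

def dy_dx_implicit(x, y):
    """dy/dx = -F_x / F_y, valid only where F_y is nonzero."""
    return -F_x(x, y) / F_y(x, y)

def dy_dx_numeric(x, h=1e-6):
    """Central difference of the explicit upper branch y = sqrt(1 - x^2)."""
    y = lambda u: math.sqrt(1.0 - u * u)
    return (y(x + h) - y(x - h)) / (2 * h)

if __name__ == "__main__":
    x0 = 0.6
    y0 = math.sqrt(1.0 - x0 * x0)   # about 0.8, on the upper half of the circle
    print(dy_dx_implicit(x0, y0), dy_dx_numeric(x0))
```

The formula breaks down where $F_y = 0$ (here at $y = 0$), which is where the circle has a vertical tangent.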


Let’s check out a quick example.


Example 6  Find  for .



The first step is to get a zero on one side of the equal sign and that’s easy enough to do.


Now, the function on the left is $F(x, y)$ in our formula so all we need to do is use the formula to find the derivative.



There we go.  It would have taken much longer to do this using the old Calculus I way of doing this.


We can also do something similar to handle the types of implicit differentiation problems involving partial derivatives like those we saw when we first introduced partial derivatives.  In these cases we will start off with a function in the form $F(x, y, z) = 0$ and assume that $z = f(x, y)$ and we want to find $\frac{\partial z}{\partial x}$ and/or $\frac{\partial z}{\partial y}$.


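Let’s start by trying to find $\frac{\partial z}{\partial x}$.  We will differentiate both sides with respect to x and we’ll need to remember that we’re going to be treating y as a constant.  Also, the left side will require the chain rule.  Here is this derivative.

\[\frac{\partial F}{\partial x}\,\frac{\partial x}{\partial x} + \frac{\partial F}{\partial y}\,\frac{\partial y}{\partial x} + \frac{\partial F}{\partial z}\,\frac{\partial z}{\partial x} = 0\]

Now, we have the following,

\[\frac{\partial x}{\partial x} = 1 \qquad \text{and} \qquad \frac{\partial y}{\partial x} = 0\]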

The first is because we are just differentiating x with respect to x and we know that is 1.  The second is because we are treating the y as a constant and so it will differentiate to zero.


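Plugging these in and solving for $\frac{\partial z}{\partial x}$ gives,

\[\frac{\partial z}{\partial x} = -\frac{F_x}{F_z}\]

A similar argument can be used to show that,

\[\frac{\partial z}{\partial y} = -\frac{F_y}{F_z}\]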

As with the one variable case we switched to the subscripting notation for derivatives to simplify the formulas.  Let’s take a quick look at an example of this.
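Before that example, here is a small sketch of the two formulas on a hypothetical surface, the sphere $x^2 + y^2 + z^2 = 9$ (chosen only for illustration; it is not the function used in these notes). Moving everything to one side gives $F(x, y, z) = x^2 + y^2 + z^2 - 9$.

```latex
\begin{align*}
\frac{\partial z}{\partial x} &= -\frac{F_x}{F_z} = -\frac{2x}{2z} = -\frac{x}{z} \\[4pt]
\frac{\partial z}{\partial y} &= -\frac{F_y}{F_z} = -\frac{2y}{2z} = -\frac{y}{z}
\end{align*}
```

On the upper half of the sphere we can solve explicitly, $z = \sqrt{9 - x^2 - y^2}$, and differentiating that directly gives $\frac{\partial z}{\partial x} = -\frac{x}{\sqrt{9 - x^2 - y^2}} = -\frac{x}{z}$, in agreement with the formula.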


Example 7  Find  and  for .


This was one of the functions that we used the old implicit differentiation on back in the Partial Derivatives section.  You might want to go back and see the difference between the two.


First let’s get everything on one side.


Now, the function on the left is $F(x, y, z)$ and so all that we need to do is use the formulas developed above to find the derivatives.




If you go back and compare these answers to those that we found the first time around you will notice that they might appear to be different.  However, if you take into account the minus sign that sits in the front of our answers here you will see that they are in fact the same.


© 2003 - 2014 Paul Dawkins