In the previous section we introduced the idea of inverse
matrices and elementary matrices. In
this section we need to devise a method for actually finding the inverse of a
matrix and as we’ll see this method will, in some way, involve elementary
matrices, or at least the row operations that they represent.
The first thing that we’ll need to do is take care of a
couple of theorems.
|
Theorem 1 If A is an $n \times n$ matrix then the following statements are
equivalent.
(a) A is invertible.
(b) The only solution to the
system $A\mathbf{x} = \mathbf{0}$ is the trivial solution.
(c) A is row equivalent to $I_n$.
(d) A is expressible as a product of elementary matrices.
|
Before we get into the proof let’s say a couple of words
about just what this theorem tells us and how we go about proving something
like this. First, when we have a set of
statements and when we say that they are equivalent then what we’re really
saying is that either they are all true or they are all false. In other words, if you know one of these
statements is true about a matrix A
then they are all true for that matrix.
Likewise, if one of these statements is false for a matrix A then they are all false for that
matrix.
To prove a set of equivalent statements we need to prove a
string of implications. This string has
to be able to get from any one statement to any other through a finite number
of steps. In this case we’ll prove the
following chain $(a) \Rightarrow (b) \Rightarrow (c) \Rightarrow (d) \Rightarrow (a)$.
By doing this, if we know one of them to be
true/false then we can follow this chain to get to any of the others.
The actual proof will involve four parts, one for each
implication. To prove a given
implication we’ll assume the statement on the left is true and show that this
must in some way also force the statement on the right to also be true. So, let’s get going.
Proof :

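$(a) \Rightarrow (b)$:
So we’ll assume that A is invertible
and we need to show that this assumption also implies that $A\mathbf{x} = \mathbf{0}$
will have only the trivial solution. That’s actually pretty easy to do. Since A
is invertible we know that $A^{-1}$ exists.
So, start by assuming that $\mathbf{x}_0$
is any solution to the system, plug this into
the system and then multiply (on the left) both sides by $A^{-1}$
to get,
$$A^{-1}A\mathbf{x}_0 = A^{-1}\mathbf{0} \quad\Rightarrow\quad I_n\mathbf{x}_0 = \mathbf{0} \quad\Rightarrow\quad \mathbf{x}_0 = \mathbf{0}$$
So, $A\mathbf{x} = \mathbf{0}$
has only the trivial solution and we’ve
managed to prove this implication.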

$(b) \Rightarrow (c)$:
Here we’re assuming that $A\mathbf{x} = \mathbf{0}$
will have only the trivial solution and we’ll
need to show that A is row equivalent
to $I_n$. Recall
that two matrices are row equivalent if we can get from one to the other by
applying a finite set of elementary row operations.
Let’s start off by writing down the augmented matrix for
this system.
$$\left[\begin{array}{cccc|c} a_{11} & a_{12} & \cdots & a_{1n} & 0 \\ a_{21} & a_{22} & \cdots & a_{2n} & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} & 0 \end{array}\right]$$
Now, if we were going to solve this we would use elementary
row operations to reduce this to reduced row-echelon form. We know that the solution to this system
must be,
$$x_1 = 0, \quad x_2 = 0, \quad \ldots, \quad x_n = 0$$
by assumption.
Therefore, we also know what the reduced row-echelon form of the
augmented matrix must be since that must give the above solution. The reduced row-echelon form of this
augmented matrix must be,
$$\left[\begin{array}{cccc|c} 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & 0 \end{array}\right]$$
Now, the entries in the last column do not affect the values
in the entries in the first n columns
and so if we take the same set of elementary row operations and apply them to A we will get $I_n$
and so A
is row equivalent to $I_n$
since we can get to $I_n$
by applying a finite set of row operations to A.
Therefore this implication has been proven.

$(c) \Rightarrow (d)$: In this case we’re going to assume that A is row
equivalent to $I_n$
and we’ll need to show that A can be written as a product of
elementary matrices.
So, since A is row
equivalent to $I_n$
we know there is a finite set of elementary
row operations that we can apply to A
that will give us $I_n$. Let’s suppose that these row operations are
represented by the elementary matrices $E_1, E_2, \ldots, E_k$. Then by Theorem 4 of the previous
section we know that applying each row operation to A is the same thing as multiplying A on the left by each of the corresponding
elementary matrices in the same order.
So, we then know that we will have the following.
$$E_k \cdots E_2 E_1 A = I_n$$
Now, by Theorem
5 from the previous section we know that each of these elementary matrices
is invertible and their inverses are also elementary matrices. So multiply the above equation (on the left)
by $E_k^{-1}, \ldots, E_2^{-1}, E_1^{-1}$
(in that order) to get,
$$A = E_1^{-1} E_2^{-1} \cdots E_k^{-1} I_n = E_1^{-1} E_2^{-1} \cdots E_k^{-1}$$
So, we see that A
is a product of elementary matrices and this implication is proven.
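The factorization in this part of the proof can be checked on a small case. The following is a minimal sketch, not a definitive implementation: it row reduces a matrix to the identity, records the inverse of each elementary matrix it uses, and then multiplies those inverses back together to recover the original matrix. The helper names (`identity`, `matmul`, `factor_into_elementary`, and so on) and the sample matrix are assumptions made for illustration, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def identity(n):
    """Return the n x n identity matrix with exact Fraction entries."""
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]


def elementary_swap(n, i, j):
    """Elementary matrix that swaps rows i and j."""
    E = identity(n)
    E[i], E[j] = E[j], E[i]
    return E


def elementary_scale(n, i, c):
    """Elementary matrix that multiplies row i by the nonzero constant c."""
    E = identity(n)
    E[i][i] = Fraction(c)
    return E


def elementary_add(n, i, j, c):
    """Elementary matrix that adds c times row j to row i."""
    E = identity(n)
    E[i][j] = Fraction(c)
    return E


def factor_into_elementary(A):
    """Return [E1^-1, ..., Ek^-1] with A = E1^-1 ... Ek^-1, or None if
    A cannot be row reduced to the identity."""
    n = len(A)
    M = [[Fraction(v) for v in row] for row in A]
    inverses = []

    def apply(E, E_inv):
        # Apply the row operation (left-multiply by E) and remember E^-1.
        nonlocal M
        M = matmul(E, M)
        inverses.append(E_inv)

    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None  # no pivot available: not row equivalent to I_n
        if pivot != col:
            E = elementary_swap(n, col, pivot)
            apply(E, E)  # a row swap is its own inverse
        p = M[col][col]
        if p != 1:
            apply(elementary_scale(n, col, 1 / p), elementary_scale(n, col, p))
        for r in range(n):
            if r != col and M[r][col] != 0:
                c = -M[r][col]
                apply(elementary_add(n, r, col, c), elementary_add(n, r, col, -c))
    return inverses


# Illustrative matrix chosen for this sketch (not taken from the text).
A = [[2, 1], [5, 3]]
factors = factor_into_elementary(A)
product = identity(2)
for F in factors:
    product = matmul(product, F)
print(product == A)  # True
```

Returning `None` when no pivot can be found corresponds to the case where the matrix is not row equivalent to the identity, so no such factorization exists.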

$(d) \Rightarrow (a)$: Here
we’ll be assuming that A is a product
of elementary matrices and we need to show that A is invertible. This is
probably the easiest implication to prove.
First, A is a
product of elementary matrices. Now, by Theorem 5 from the previous
section we know each of these elementary matrices is invertible and by Theorem 2(a) also from the
previous section we know that a product of invertible matrices is also
invertible. Therefore, A is invertible since it can be written
as a product of invertible matrices and we’ve proven this implication.

This theorem can actually be extended to include a couple
more equivalent statements, but to do that we need another theorem.
Theorem 2 Suppose that A is a square matrix. Then,
(a) If B is a square matrix such that $BA = I$ then A is invertible and $A^{-1} = B$.
(b) If B is a square matrix such that $AB = I$ then A is invertible and $A^{-1} = B$.
Proof :
(a) This proof
will need part (b) of Theorem 1. If we
can show that $A\mathbf{x} = \mathbf{0}$
has only the trivial solution then by Theorem
1 we will know that A is
invertible. So, let $\mathbf{x}_0$
be any solution to $A\mathbf{x} = \mathbf{0}$. Plug this into the equation and then multiply
both sides (on the left) by B.
$$BA\mathbf{x}_0 = B\mathbf{0} \quad\Rightarrow\quad I\mathbf{x}_0 = \mathbf{0} \quad\Rightarrow\quad \mathbf{x}_0 = \mathbf{0}$$
So, this shows that any solution to $A\mathbf{x} = \mathbf{0}$
must be the trivial solution and so by Theorem
1 if one statement is true they all are and so A is invertible. We know
from the previous section that inverses are unique and because $BA = I$
we must then also have $A^{-1} = B$
(multiply both sides of $BA = I$ on the right by $A^{-1}$ to see this).
(b) In this case
let’s let $\mathbf{x}_0$
be any solution to $B\mathbf{x} = \mathbf{0}$. Then multiplying both sides (on the left) of
this by A we can use a similar
argument to that used in (a) to show
that $\mathbf{x}_0$
must be the trivial solution and so B is an invertible matrix and that in
fact $B^{-1} = A$. Now, this isn’t quite what we were asked to
prove, but it does in fact give us the proof.
Because B is invertible and
its inverse is A (by the above work)
we know that,
$$AB = BA = I$$
but this is exactly what it means for A to be invertible and that $A^{-1} = B$. So, we are done.

So, what’s the big deal with this theorem? Recall from the last section that in
order to show that a matrix, B, was
the inverse of A we needed to show
that $AB = BA = I$. In other words, we needed to show that both
of these products were the identity matrix.
Theorem 2 tells us that all we really need to do is show one of them and
we get the other one for free.
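As a quick numerical illustration of this point, the sketch below checks only one product against the identity and then confirms that the other product comes out as the identity as well. The matrices `A` and `B` and the helper `matmul` are assumptions chosen for this example, not something taken from the text.

```python
def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]


# Illustrative matrices chosen for this sketch.
A = [[2, 1],
     [5, 3]]
B = [[3, -1],
     [-5, 2]]
I2 = [[1, 0],
      [0, 1]]

one_sided = matmul(B, A) == I2   # the single product we check
other_side = matmul(A, B) == I2  # the product we should get for free
print(one_sided, other_side)     # True True
```

A single example like this is of course not a proof; it only shows the statement in action for one pair of matrices.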
This theorem gives us the ability to add two equivalent
statements to Theorem 1. Here is the
improved Theorem 1.
Theorem 3 If A is an $n \times n$ matrix then the following statements are
equivalent.
(a) A is invertible.
(b) The only solution to the system $A\mathbf{x} = \mathbf{0}$ is the trivial solution.
(c) A is row equivalent to $I_n$.
(d) A is expressible as a product of elementary matrices.
(e) $A\mathbf{x} = \mathbf{b}$ has exactly one solution for every $n \times 1$ matrix b.
(f) $A\mathbf{x} = \mathbf{b}$ is consistent for every $n \times 1$ matrix b.
Note that (e) and
(f) appear to be the same on the
surface, but recall that consistent only says that there is at least one
solution. If a system is consistent
there may be infinitely many solutions.
What this part is telling us is that if the system is consistent for every
choice of b that we choose to put
into the system then we will in fact only get a single solution for each b. If even one b gives infinitely many solutions then (e) is false, and because the statements are equivalent (f) and all the other statements are false as well.
Okay so how do we go about proving this? We’ve already proved that the first four
statements are equivalent above so there’s no reason to redo that work. This means that all we need to do is prove
that one of the original statements implies the two new statements and
these in turn imply one of the four original statements. We’ll do this by proving the following
implications $(a) \Rightarrow (e) \Rightarrow (f) \Rightarrow (a)$.
Proof :

$(a) \Rightarrow (e)$: Okay with this implication we’ll assume that
A is invertible and we’ll need to
show that $A\mathbf{x} = \mathbf{b}$
has exactly one solution for every $n \times 1$
matrix b. This is actually very simple to do. Since A
is invertible we know that $A^{-1}$ exists
so we’ll do the following.
$$A^{-1}A\mathbf{x} = A^{-1}\mathbf{b} \quad\Rightarrow\quad I\mathbf{x} = A^{-1}\mathbf{b} \quad\Rightarrow\quad \mathbf{x} = A^{-1}\mathbf{b}$$
So, if A is
invertible we’ve shown that the solution to the system will be $\mathbf{x} = A^{-1}\mathbf{b}$
and since matrix multiplication is unique (i.e. we aren’t going to get two
different answers from the multiplication) the solution must also be unique and
so there is exactly one solution to the system.

$(e) \Rightarrow (f)$: This implication is trivial. We’ll start off by assuming that the system $A\mathbf{x} = \mathbf{b}$
has exactly one solution for every $n \times 1$
matrix b
but that also means that the system is consistent for every $n \times 1$
matrix b
and so we’re done with the proof of this implication.

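$(f) \Rightarrow (a)$: Here we’ll start off by assuming that $A\mathbf{x} = \mathbf{b}$
is consistent for every $n \times 1$
matrix b
and we’ll need to show that this implies A
is invertible. So, if $A\mathbf{x} = \mathbf{b}$
is consistent for every $n \times 1$
matrix b
it is consistent for the following n
systems.
$$A\mathbf{x} = \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \qquad A\mathbf{x} = \begin{bmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix}, \qquad \ldots, \qquad A\mathbf{x} = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix}$$
Since we know each of these systems has a solution let $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n$
be those solutions and form a new matrix, B, with these solutions as its
columns. In other words,
$$B = \begin{bmatrix} \mathbf{x}_1 & \mathbf{x}_2 & \cdots & \mathbf{x}_n \end{bmatrix}$$
Now let’s take a look at the product AB. We know from the matrix arithmetic section that the ith column of AB will be given by $A\mathbf{x}_i$
and we know what each of these products will
be since $\mathbf{x}_i$
is a solution to one of the systems
above. So, let’s use all this knowledge
to see what the product AB is.
$$AB = \begin{bmatrix} A\mathbf{x}_1 & A\mathbf{x}_2 & \cdots & A\mathbf{x}_n \end{bmatrix} = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix} = I_n$$
So, we’ve shown that $AB = I_n$,
but by Theorem 2 this means that A
must be invertible and so we’re done with the proof.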
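The construction used in this last implication can be sketched in code: solve one system for each column of the identity matrix, then collect the solutions as the columns of a new matrix. This is only an illustration under stated assumptions. The names `solve` and `inverse_by_columns` and the sample matrix are made up for this sketch, and the solver assumes the input really is invertible (it raises `StopIteration` if it cannot find a pivot).

```python
from fractions import Fraction


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination on [A | b].

    Assumes A is square and invertible.
    """
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)]
         for row, rhs in zip(A, b)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry.
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]       # make the pivot a 1
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [row[n] for row in M]


def inverse_by_columns(A):
    """Build B whose j-th column solves A x = (j-th column of I_n)."""
    n = len(A)
    cols = [solve(A, [int(i == j) for i in range(n)]) for j in range(n)]
    # cols[j] is a column of B, so transpose to get B as a list of rows.
    return [[cols[j][i] for j in range(n)] for i in range(n)]


# Illustrative matrix chosen for this sketch (not taken from the text).
A = [[1, 2],
     [3, 4]]
B = inverse_by_columns(A)
AB = [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
      for i in range(2)]
print(AB == [[1, 0], [0, 1]])  # True
```

Solving n separate systems repeats the same row operations n times, which is one reason to do all of the reductions at once on a single augmented matrix instead.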

Before proceeding let’s notice that part (c) of this theorem
is also telling us that if we reduced A
down to reduced row-echelon form then we’d have $I_n$. This can also be seen in the proof in Theorem
1 of the implication $(b) \Rightarrow (c)$.
So, just how does this theorem help us to determine the
inverse of a matrix? Well, first let’s
assume that A is in fact invertible
and so all the statements in Theorem 3 are true. Now, go back to the proof of the implication $(c) \Rightarrow (d)$. In this proof we saw that there were
elementary matrices, $E_1, E_2, \ldots, E_k$,
so that we’d get the following,
$$E_k \cdots E_2 E_1 A = I_n$$
Since we know A is
invertible we know that $A^{-1}$
exists and so multiply (on the right) each
side of this by $A^{-1}$ to get,
$$E_k \cdots E_2 E_1 A A^{-1} = I_n A^{-1} \quad\Rightarrow\quad E_k \cdots E_2 E_1 I_n = A^{-1}$$
What this tells us is that we need to find a series of row
operations that will reduce A to $I_n$
and then apply the same set of operations to $I_n$
and the result will be the inverse, $A^{-1}$.
Okay, all this is fine.
We can write down a bunch of symbols to tell us how to find the inverse,
but that doesn’t always help to actually find the inverse. The work above tells us that we need to
identify a series of elementary row operations that will reduce A to $I_n$
and then apply those operations to $I_n$. Well it turns out that we can do both of
these steps simultaneously and we don’t need to mess around with the elementary
matrices.
Let’s start off by supposing that A is an invertible $n \times n$
matrix and then form the following new matrix.
$$\left[\, A \mid I_n \,\right]$$
Note that all we did here was tack on $I_n$
to the original matrix A. Now, if we apply a row
operation to this it will be equivalent to applying it simultaneously to both A and to $I_n$. So, all we need to do is find a series of row
operations that will reduce the “A”
portion of this to $I_n$,
making sure to apply the operations to the whole matrix. Once we’ve done this we will have,
$$\left[\, I_n \mid A^{-1} \,\right]$$
provided A is in
fact invertible of course. We’ll deal
with singular matrices in a bit.
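The procedure just described, tacking the identity onto the matrix and row reducing the whole thing, can be sketched as follows. This is a minimal illustration rather than a production routine: the function name `inverse`, the choice to raise `ValueError` for a singular input, and the sample matrix are all assumptions made for this example. Exact fractions are used so the arithmetic matches what would be done by hand.

```python
from fractions import Fraction


def inverse(A):
    """Row reduce [A | I] to [I | A^-1] and return the right-hand block."""
    n = len(A)
    # Form the new matrix [A | I_n], one row at a time.
    M = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]        # row interchange
        p = M[col][col]
        M[col] = [v / p for v in M[col]]           # scale to get a leading 1
        for r in range(n):
            if r != col:
                f = M[r][col]
                # Subtract f times the pivot row, across the WHOLE row.
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [row[n:] for row in M]


# Illustrative matrix chosen for this sketch (not taken from the text).
print(inverse([[2, 1], [5, 3]]) == [[3, -1], [-5, 2]])  # True
```

Note that every row operation is applied to the full row of the new matrix, which is exactly what makes the right-hand block end up as the inverse.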
Let’s take a look at a couple of examples.
|
Example 2 Determine
the inverse of the following matrix given that it is invertible.

Solution
Okay we’ll first form the new matrix,

and we’ll use elementary row operations to reduce the
first three columns to $I_3$ and then the last three columns will be the
inverse of C. Here is that work.




So, we’ve gotten the first three columns reduced to $I_3$ and that means the last three must be the
inverse.

We’ll leave it to you to verify that $CC^{-1} = C^{-1}C = I_3$.
|
Okay, so far we’ve seen how to use the method above to
determine an inverse, but what happens if a matrix doesn’t have an
inverse? Well it turns out that we can
also use this method to determine that as well and it generally doesn’t take
quite as much work as it does to actually find the inverse (if it exists, of
course).
Let’s take a look at an example of that.
|
Example 3 Show
that the following matrix does not have an inverse, i.e. show the matrix is singular.

Solution
Okay, the problem statement says that the matrix is
singular, but let’s pretend that we didn’t know that and work the problem as
we did in the previous two examples.
That means we’ll need the new matrix,

Now, let’s get started on getting the first three columns
reduced to $I_3$.


At this point let’s stop and examine the third row in a
little more detail. In order for the
first three columns to be $I_3$ the first three entries of the last row MUST
be $0 \;\; 0 \;\; 1$, which we clearly don’t have. We could use a multiple of row 1 or row 2
to get a 1 in the third spot, but that would in turn change at least one of
the first two entries away from 0.
That’s a problem since they must remain zeroes.
In other words, there is no way to make the third entry in
the third row a 1 without also changing one or both of the first two entries
into something other than zero and so we will never be able to make the first
three columns into $I_3$.
So, there is no set of row operations that will reduce B to $I_3$ and hence B is NOT row equivalent to $I_3$. Now, go back to Theorem 3. This was a set of equivalent statements and
if one is false they are all false.
We’ve just managed to show that part (c) is false and that means that
part (a) must also be false.
Therefore, B must be a
singular matrix.
|
The idea used in this last example to show that B was singular can be used in
general. If, in the course of reducing
the new matrix, we end up with a row in which all the entries to the left
of the dashed line are zeroes we will know that the matrix must be singular.
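This test for a singular matrix can be sketched in code. Since row operations on the left-hand block never depend on the entries to the right of the dashed line, the sketch below reduces only the matrix itself to row-echelon form and then looks for a row of all zeroes. The function name `reduces_to_zero_row` and the sample matrices are assumptions made for illustration.

```python
from fractions import Fraction


def reduces_to_zero_row(A):
    """Return True if row reducing the square matrix A produces a zero row,
    which signals that A is singular."""
    n = len(A)
    M = [[Fraction(v) for v in row] for row in A]
    row = 0  # index of the row where the next pivot should go
    for col in range(n):
        pivot = next((r for r in range(row, n) if M[r][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column, move to the next one
        M[row], M[pivot] = M[pivot], M[row]
        for r in range(row + 1, n):
            f = M[r][col] / M[row][col]
            # Zero out the entry below the pivot.
            M[r] = [vr - f * vp for vr, vp in zip(M[r], M[row])]
        row += 1
    return any(all(v == 0 for v in r) for r in M)


# Illustrative matrices chosen for this sketch (not taken from the text).
print(reduces_to_zero_row([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))  # True
print(reduces_to_zero_row([[2, 1], [5, 3]]))                   # False
```

In hand computations one would normally stop as soon as the zero row appears; the sketch finishes the forward elimination first only to keep the code short.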
We’ll leave this section off with a quick formula that can
always be used to find the inverse of an invertible $2 \times 2$
matrix as well as a way to quickly determine
if the matrix is invertible. The above
method is nice in that it always works, but it can be cumbersome to use so the
following formula can help to make things go quicker for $2 \times 2$
matrices.
Fact The matrix
$$A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$$
will be invertible if $ad - bc \ne 0$ and singular if $ad - bc = 0$. If the matrix is invertible its inverse will be,
$$A^{-1} = \frac{1}{ad - bc}\begin{bmatrix} d & -b \\ -c & a \end{bmatrix}$$
Let’s do a quick example or two of this fact.
|
Example 4 Use
the fact to show that

is an invertible matrix and find its inverse.
Solution
We’ve already looked at this one above, but let’s do it
here so we can contrast the work between the two methods. First, we need,

So, the matrix is in fact invertible by the fact and here
is the inverse,

|
|
Example 5 Determine
if the following matrix is singular.

Solution
Not much to do with this one.

So, by the fact the matrix is singular.
|
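The two uses of this fact, testing for invertibility and writing down the inverse, can be sketched in a few lines. The sketch assumes the standard statement of the fact: a $2 \times 2$ matrix with rows $a, b$ and $c, d$ is invertible exactly when $ad - bc \ne 0$, and its inverse is then $\frac{1}{ad - bc}$ times the matrix with rows $d, -b$ and $-c, a$. The function name `inverse_2x2`, the choice to return `None` for a singular matrix, and the sample matrices are assumptions made for illustration.

```python
from fractions import Fraction


def inverse_2x2(A):
    """Return the inverse of a 2 x 2 matrix, or None if it is singular."""
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        return None  # ad - bc = 0, so the matrix is singular
    det = Fraction(det)  # exact arithmetic for the division below
    return [[d / det, -b / det],
            [-c / det, a / det]]


# Illustrative matrices chosen for this sketch (not taken from the text).
print(inverse_2x2([[2, 1], [5, 3]]) == [[3, -1], [-5, 2]])  # True
print(inverse_2x2([[1, 2], [2, 4]]))                        # None
```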
If you’d like to see a couple more examples of finding
inverses check out the section on Special
Matrices.