In this section we’re going to take a look at a special kind
of matrix. We’ll start out with the
following definition.
Definition 1 Suppose that A is a square matrix and further suppose that there exists an invertible matrix P (of the same size as A of course) such that $P^{-1}AP$ is a diagonal matrix. In such a case we call A diagonalizable and say that P diagonalizes A.
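As a small illustration of the definition, here is a worked case. The matrix is our own choice for illustration and does not come from the examples in this section.
\[ A = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix} \qquad P = \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix} \qquad P^{-1} = \frac{1}{2}\begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix} \]
Multiplying out gives,
\[ P^{-1}AP = \frac{1}{2}\begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix}\begin{bmatrix} 1 & 3 \\ -1 & 3 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 3 \end{bmatrix} \]
which is a diagonal matrix, so this A is diagonalizable and this P diagonalizes it.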
The following theorem will not only tell us when a matrix is
diagonalizable, but the proof will tell us how to construct P when A is diagonalizable.
Theorem 1 Suppose that A is an $n \times n$ matrix. Then the following are equivalent.
(a) A is diagonalizable.
(b) A has n linearly independent eigenvectors.
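Proof : We’ll start by proving that $(a) \Rightarrow (b)$. So, assume that A is diagonalizable and so we know that an invertible matrix P exists so that $P^{-1}AP$ is a diagonal matrix. Now, let $\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_n$ be the columns of P and suppose that D is the diagonal matrix we get from $P^{-1}AP$, i.e. $D = P^{-1}AP$. So, both P and D have the following forms,
\[ P = \begin{bmatrix} \mathbf{p}_1 & \mathbf{p}_2 & \cdots & \mathbf{p}_n \end{bmatrix} \qquad D = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix} \]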
Also note that because P is an invertible matrix Theorem 8 from the Fundamental Subspaces section tells us that the columns of P will form a basis for $\mathbb{R}^n$ and hence must be linearly independent. Therefore, $\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_n$ are a set of linearly independent column vectors.
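Now, if we rewrite $D = P^{-1}AP$ we arrive at $AP = PD$ or,
\[ A\begin{bmatrix} \mathbf{p}_1 & \mathbf{p}_2 & \cdots & \mathbf{p}_n \end{bmatrix} = \begin{bmatrix} \mathbf{p}_1 & \mathbf{p}_2 & \cdots & \mathbf{p}_n \end{bmatrix}\begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix} \]
Theorem 1 from the Matrix Arithmetic section tells us that the jth column of PD is P[jth column of D] and so the jth column of PD is nothing more than $\lambda_j \mathbf{p}_j$. The same theorem tells us that the jth column of AP is A[jth column of P] or $A\mathbf{p}_j$.
Now, since we have $AP = PD$ the columns of both sides must be equal and so we must have,
\[ A\mathbf{p}_1 = \lambda_1 \mathbf{p}_1 \qquad A\mathbf{p}_2 = \lambda_2 \mathbf{p}_2 \qquad \cdots \qquad A\mathbf{p}_n = \lambda_n \mathbf{p}_n \]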
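So, the diagonal entries from D, $\lambda_1, \lambda_2, \ldots, \lambda_n$ are the eigenvalues of A and their corresponding eigenvectors are the columns of P, $\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_n$. Note that none of the columns of P can be the zero vector, since a set containing the zero vector is never linearly independent, and so they really are eigenvectors. Also as we noted above these are a set of linearly independent vectors which is what we were asked to prove.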
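We now need to prove $(b) \Rightarrow (a)$ and we’ve done most of the work for this in the previous part. Let’s start by assuming that the eigenvalues of A are $\lambda_1, \lambda_2, \ldots, \lambda_n$ and that their associated eigenvectors $\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_n$ are linearly independent.
Now, form a matrix P whose columns are $\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_n$. So, P has the form,
\[ P = \begin{bmatrix} \mathbf{p}_1 & \mathbf{p}_2 & \cdots & \mathbf{p}_n \end{bmatrix} \]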
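Now, as we noted above the columns of AP are given by
\[ A\mathbf{p}_1, \quad A\mathbf{p}_2, \quad \ldots, \quad A\mathbf{p}_n \]
However, using the fact that $\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_n$ are the eigenvectors of A each of these columns can be written as,
\[ A\mathbf{p}_1 = \lambda_1 \mathbf{p}_1 \qquad A\mathbf{p}_2 = \lambda_2 \mathbf{p}_2 \qquad \cdots \qquad A\mathbf{p}_n = \lambda_n \mathbf{p}_n \]
Therefore, AP can be written as,
\[ AP = \begin{bmatrix} \lambda_1 \mathbf{p}_1 & \lambda_2 \mathbf{p}_2 & \cdots & \lambda_n \mathbf{p}_n \end{bmatrix} \]
However, as we saw above, the matrix on the right can be written as PD where D is the following diagonal matrix,
\[ D = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix} \]
So, we’ve managed to show that by defining P as above we have $AP = PD$. Finally, since the columns of P are n linearly independent vectors in $\mathbb{R}^n$ we know that they will form a basis for $\mathbb{R}^n$ and so by Theorem 8 from the Fundamental Subspaces section we know that P must be invertible and hence we have,
\[ P^{-1}AP = D \]
where P is an invertible matrix. Therefore A is diagonalizable.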
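
The construction in the proof can be checked numerically. The following is only a minimal sketch, assuming a small illustrative matrix of our own (it is not one of the matrices from this section) whose eigenvalues and eigenvectors were worked out by hand. The helper names `matmul` and `column` are ours, and exact fractions are used so that the comparisons are not affected by rounding.

```python
from fractions import Fraction

def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

def column(M, j):
    """Return column j of M as a list."""
    return [row[j] for row in M]

# Illustrative matrix (an assumption, not taken from the text).
A = [[Fraction(2), Fraction(1)],
     [Fraction(1), Fraction(2)]]

# Eigenpairs worked out by hand: A(1,-1) = 1*(1,-1) and A(1,1) = 3*(1,1).
eigenvalues = [Fraction(1), Fraction(3)]
eigenvectors = [[Fraction(1), Fraction(-1)],
                [Fraction(1), Fraction(1)]]
n = len(A)

# P has the eigenvectors as its columns, D has the eigenvalues on its diagonal.
P = [[eigenvectors[j][i] for j in range(n)] for i in range(n)]
D = [[eigenvalues[i] if i == j else Fraction(0) for j in range(n)]
     for i in range(n)]

AP = matmul(A, P)
PD = matmul(P, D)

# Column j of PD is lambda_j * p_j, which is the key step in the proof.
for j in range(n):
    assert column(PD, j) == [eigenvalues[j] * x for x in column(P, j)]

# P is invertible because its determinant is non-zero (2x2 formula).
det_P = P[0][0] * P[1][1] - P[0][1] * P[1][0]
P_inv = [[ P[1][1] / det_P, -P[0][1] / det_P],
         [-P[1][0] / det_P,  P[0][0] / det_P]]

print(AP == PD)                # AP and PD agree column by column
print(matmul(P_inv, AP) == D)  # hence P^{-1} A P = D
```

The 2x2 inverse formula is used only to keep the sketch short. For larger matrices a general routine for the inverse would be needed.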

Let’s take a look at a couple of examples.
Example 1 Find
a matrix P that will diagonalize
each of the following matrices.
(a) 
(b) 
Solution
Okay, provided we can find 3 linearly independent eigenvectors for each of these we’ll have a pretty easy time of this since we know that the columns of P will then be these three eigenvectors.
Nicely enough for us, we did exactly this in Example 6 of the previous section. At the time it probably seemed like there was no reason for writing down specific eigenvectors for each eigenvalue, but we did it for the problems in this section. So, in each case we’ll just go back to Example 6, pull the eigenvectors from that example and form up P.
(a) This was
part (a) from Example 6 and so P
is,

We’ll leave it
to you to verify that we get,

(b) This was
part (c) from Example 6 so P is,

Again, we’ll leave it to you to verify that,

Example 2 Neither of the following matrices is diagonalizable.
(a) 
(b) 
Solution
To see that neither of these is diagonalizable simply go back to Example 6 in the previous section to see that neither matrix has 3 linearly independent eigenvectors. In both cases we have only two linearly independent eigenvectors and so neither matrix is diagonalizable.
For reference purposes, part (a) of this example matches part (b) of Example 6 and part (b) of this example matches part (d) of Example 6.
We didn’t actually do any of the work here for these problems so let’s summarize how we need to go about finding P, provided it exists of course. We first find the eigenvalues for the matrix A and then for each eigenvalue find a basis for the eigenspace corresponding to that eigenvalue. The set of basis vectors will then serve as a set of linearly independent eigenvectors for the eigenvalue. If, after we’ve done this work for all the eigenvalues, we have a set of n eigenvectors then A is diagonalizable and we use the eigenvectors to form P. If we don’t have a set of n eigenvectors then A is not diagonalizable.
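To see how this procedure can come up short, here is a brief worked case. The matrix is our own illustrative choice and is not one of the matrices from the examples above.
\[ A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} \qquad \det(\lambda I - A) = (\lambda - 1)^2 \]
So, the only eigenvalue is $\lambda = 1$. The eigenspace is the solution set of,
\[ (I - A)\mathbf{x} = \begin{bmatrix} 0 & -1 \\ 0 & 0 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \]
which forces $x_2 = 0$, and so a basis for the eigenspace is the single vector $\begin{bmatrix} 1 \\ 0 \end{bmatrix}$. We end up with only one eigenvector while $n = 2$ and so this A is not diagonalizable.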
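Actually, we should be careful here. In the summary above we assumed that if we had n eigenvectors then they would be linearly independent. We should always verify this of course. There is also one case where we can guarantee that we’ll have n linearly independent eigenvectors.
Theorem 2 If $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k$ are eigenvectors of A corresponding to the k distinct eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_k$ then they form a linearly independent set of vectors.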
Proof : We’ll prove this by assuming that $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k$ are in fact linearly dependent and from this we’ll get a contradiction and we’ll see that $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k$ must be linearly independent.
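So, assume that $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k$ form a linearly dependent set. Now, since these are eigenvectors we know that they are all non-zero vectors. This means that the set $\{\mathbf{v}_1\}$ must be a linearly independent set. So, we know that there must be a linearly independent subset of $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k$. So, let p be the largest integer such that $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_p\}$ is a linearly independent set. Note that we must have $1 \le p < k$ because we are assuming that $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k$ are linearly dependent. Therefore we know that if we take the next vector $\mathbf{v}_{p+1}$ and add it to our linearly independent vectors, $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_p\}$, the set $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_p, \mathbf{v}_{p+1}\}$ will be a linearly dependent set.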
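So, if we know that $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_p, \mathbf{v}_{p+1}\}$ is a linearly dependent set we know that there are scalars $c_1, c_2, \ldots, c_p, c_{p+1}$, not all zero so that,
\[ c_1 \mathbf{v}_1 + c_2 \mathbf{v}_2 + \cdots + c_p \mathbf{v}_p + c_{p+1} \mathbf{v}_{p+1} = \mathbf{0} \tag{1} \]
Now, multiply this by A to get,
\[ c_1 A\mathbf{v}_1 + c_2 A\mathbf{v}_2 + \cdots + c_p A\mathbf{v}_p + c_{p+1} A\mathbf{v}_{p+1} = \mathbf{0} \]
We know that the $\mathbf{v}_i$ are eigenvectors of A corresponding to the eigenvalues $\lambda_i$ and so we know that $A\mathbf{v}_i = \lambda_i \mathbf{v}_i$. Using this gives us,
\[ c_1 \lambda_1 \mathbf{v}_1 + c_2 \lambda_2 \mathbf{v}_2 + \cdots + c_p \lambda_p \mathbf{v}_p + c_{p+1} \lambda_{p+1} \mathbf{v}_{p+1} = \mathbf{0} \tag{2} \]
Next, multiply both sides of (1) by $\lambda_{p+1}$ to get,
\[ c_1 \lambda_{p+1} \mathbf{v}_1 + c_2 \lambda_{p+1} \mathbf{v}_2 + \cdots + c_p \lambda_{p+1} \mathbf{v}_p + c_{p+1} \lambda_{p+1} \mathbf{v}_{p+1} = \mathbf{0} \]
and subtract this from (2). The $\mathbf{v}_{p+1}$ terms cancel and doing this gives,
\[ c_1 (\lambda_1 - \lambda_{p+1}) \mathbf{v}_1 + c_2 (\lambda_2 - \lambda_{p+1}) \mathbf{v}_2 + \cdots + c_p (\lambda_p - \lambda_{p+1}) \mathbf{v}_p = \mathbf{0} \]
Now, recall that we assumed that $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_p\}$ were a linearly independent set and so the coefficients here must all be zero. Or,
\[ c_1 (\lambda_1 - \lambda_{p+1}) = 0 \qquad c_2 (\lambda_2 - \lambda_{p+1}) = 0 \qquad \cdots \qquad c_p (\lambda_p - \lambda_{p+1}) = 0 \]
However the eigenvalues are distinct and so the only way all these can be zero is if,
\[ c_1 = 0 \qquad c_2 = 0 \qquad \cdots \qquad c_p = 0 \]
Plugging these values into (1) gives us
\[ c_{p+1} \mathbf{v}_{p+1} = \mathbf{0} \]
but, $\mathbf{v}_{p+1}$ is an eigenvector and hence is not the zero vector and so we must have
\[ c_{p+1} = 0 \]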
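So, what have we shown to this point? Well we’ve just seen that the only possible solution to
\[ c_1 \mathbf{v}_1 + c_2 \mathbf{v}_2 + \cdots + c_p \mathbf{v}_p + c_{p+1} \mathbf{v}_{p+1} = \mathbf{0} \]
is
\[ c_1 = 0 \qquad c_2 = 0 \qquad \cdots \qquad c_p = 0 \qquad c_{p+1} = 0 \]
This however would mean that the set $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_p, \mathbf{v}_{p+1}\}$ is linearly independent and we assumed that at least some of the scalars were not zero. Therefore, this contradicts the fact that we assumed that this set was linearly dependent. Therefore our original assumption that $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k$ form a linearly dependent set must be wrong. We can then see that $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k$ form a linearly independent set.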

We can use this theorem to quickly identify some
diagonalizable matrices.
Theorem 3 Suppose that A is an $n \times n$ matrix and that A has n distinct eigenvalues, then A is diagonalizable.
Proof : Pick one eigenvector for each of the n distinct eigenvalues. By Theorem 2 we know that these n eigenvectors are a linearly independent set and then by Theorem 1 above we know that A will be diagonalizable.
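Here is a quick illustration of how this theorem gets used, again with a matrix of our own choosing rather than one from the text. For the triangular matrix,
\[ A = \begin{bmatrix} 1 & 2 \\ 0 & 3 \end{bmatrix} \]
the eigenvalues are the diagonal entries, $\lambda_1 = 1$ and $\lambda_2 = 3$. These are 2 distinct eigenvalues for a $2 \times 2$ matrix and so A is diagonalizable without computing a single eigenvector. Note that the theorem only goes one way. The $2 \times 2$ identity matrix has the single repeated eigenvalue $\lambda = 1$ and yet it is already diagonal, so it is diagonalizable (take $P = I$).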

We’ll close this section out with a nice theorem about
powers of diagonalizable matrices and the inverse of an invertible diagonalizable
matrix.
Theorem 4 Suppose that A is a diagonalizable matrix and that $P^{-1}AP = D$ then,
(a) If k is any positive integer we have,
\[ A^k = P D^k P^{-1} \]
(b) If all the diagonal entries of D are non-zero then A is invertible and,
\[ A^{-1} = P D^{-1} P^{-1} \]
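Proof :
(a) We’ll give the proof for $k = 2$ and leave it to you to generalize the proof for larger values of k. Let’s start with the following.
\[ D^2 = (P^{-1}AP)^2 = (P^{-1}AP)(P^{-1}AP) = P^{-1}A(PP^{-1})AP = P^{-1}A(I)AP = P^{-1}A^2P \]
So, we can see that,
\[ D^2 = P^{-1}A^2P \]
We can finish this off by multiplying the left of this equation by P and the right by $P^{-1}$ to arrive at,
\[ A^2 = P D^2 P^{-1} \]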
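(b) First, we know that if the main diagonal entries of a diagonal matrix are non-zero then the diagonal matrix is invertible. Now, all that we need to show is that,
\[ A(P D^{-1} P^{-1}) = I \]
This is easy enough to do. All we need to do is plug in the fact that from part (a), using $k = 1$, we have,
\[ A = P D P^{-1} \]
So, let’s do the following.
\[ A(P D^{-1} P^{-1}) = P D P^{-1}(P D^{-1} P^{-1}) = P D (P^{-1}P) D^{-1} P^{-1} = P (D D^{-1}) P^{-1} = P P^{-1} = I \]
So, we’re done.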
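
Both parts of the theorem can be tried out on a small case. This is only a sketch under stated assumptions: the matrices A, P and the diagonal entries below are our own illustrative choice (with $P^{-1}AP = D$ checked by hand), and the helper names `matmul`, `diag`, `power_via_diagonalization` and `power_by_repeated_multiplication` are hypothetical names introduced here. Exact fractions are used so the equality tests are exact.

```python
from fractions import Fraction

def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

def diag(entries):
    """Build a diagonal matrix from a list of diagonal entries."""
    n = len(entries)
    return [[entries[i] if i == j else Fraction(0) for j in range(n)]
            for i in range(n)]

# Illustrative data (an assumption, not from the text): P^{-1} A P = D.
A = [[Fraction(2), Fraction(1)], [Fraction(1), Fraction(2)]]
P = [[Fraction(1), Fraction(1)], [Fraction(-1), Fraction(1)]]
P_inv = [[Fraction(1, 2), Fraction(-1, 2)], [Fraction(1, 2), Fraction(1, 2)]]
d = [Fraction(1), Fraction(3)]  # diagonal entries of D

def power_via_diagonalization(k):
    """Part (a): A^k = P D^k P^{-1}; D^k only needs powers of scalars."""
    return matmul(matmul(P, diag([x ** k for x in d])), P_inv)

def power_by_repeated_multiplication(k):
    """Reference computation: multiply A by itself k - 1 times."""
    result = A
    for _ in range(k - 1):
        result = matmul(result, A)
    return result

A5 = power_via_diagonalization(5)

# Part (b): every entry of d is non-zero, so A^{-1} = P D^{-1} P^{-1}.
A_inv = matmul(matmul(P, diag([1 / x for x in d])), P_inv)
identity = diag([Fraction(1), Fraction(1)])

print(A5 == power_by_repeated_multiplication(5))
print(matmul(A, A_inv) == identity)
```

The point of part (a) shows up in `power_via_diagonalization`: raising D to a power only requires raising each diagonal entry to that power, so the cost does not grow with k the way repeated matrix multiplication does.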
