In the previous section we used augmented matrices to represent a system of linear equations. In this section we’re going to start looking at matrices in more generality. A matrix is nothing more than a rectangular array of numbers, and each of the numbers in the matrix is called an entry.
Here are some examples of matrices.
The size of a matrix with n rows and m columns is denoted by $n \times m$. In denoting the size of a matrix we always list the number of rows first and the number of columns second.
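The rows-first convention can be sketched in a few lines of Python. This is only an illustration: the list-of-rows representation, the helper name `size`, and the sample matrix are assumptions made for the sketch and are not part of these notes.

```python
# A matrix stored as a list of rows; each row is a list of entries.
# The entries below are made up purely for illustration.
A = [
    [1, 2, 3],
    [4, 5, 6],
]

def size(matrix):
    """Return (rows, columns), listing the number of rows first."""
    n_rows = len(matrix)
    n_cols = len(matrix[0]) if matrix else 0
    return (n_rows, n_cols)

print(size(A))  # (2, 3): 2 rows, 3 columns
```

With this representation a column matrix is a list of one-entry rows and a row matrix is a list containing a single row.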
Example 1 Give
the size of each of the matrices above.
Solution


In this matrix the number of rows is equal to the number
of columns. Matrices that have the
same number of rows as columns are called square matrices.

This matrix has a single column and is often called a column matrix.

This matrix has a single row and is often called a row matrix.

Often when dealing with $1 \times 1$ matrices we will drop the surrounding
brackets and just write -2.
Note that sometimes column matrices and row matrices are called column vectors and row vectors respectively. We do need to be careful with the word vector, however, as in later chapters the word vector will be used to denote something much more general than a column or row matrix. Because of this we will, for the most part, use the terms column matrix and row matrix when needed instead of column vector and row vector.
There are a lot of notational issues that we’re going to
have to get used to in this class.
First, upper case letters are generally used to refer to matrices while
lower case letters generally are used to refer to numbers. These are general rules, but as you’ll see
shortly there are exceptions to them, although it will usually be easy to
identify those exceptions when they happen.
We will often need to refer to specific entries in a matrix and so we’ll need a notation to take care of that. The entry in the ith row and jth column of the matrix A is denoted by,

$$a_{ij} \qquad \text{or} \qquad (A)_{ij}$$
In the first notation the lower case letter we use to denote the entries of a matrix will always match with the upper case letter we use to denote the matrix. So the entries of the matrix B will be denoted by $b_{ij}$.
In both of these notations the first (left most) subscript will always give the row the entry is in and the second (right most) subscript will always give the column the entry is in. So, $c_{4\,9}$ will be the entry in the 4th row and 9th column of C (which is assumed to be a matrix since it’s an upper case letter…).
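One practical wrinkle is that the subscripts here count from 1, while most programming languages count from 0. A minimal sketch of the row-first, column-second lookup, assuming a list-of-rows representation (the helper name `entry` and the sample matrix are hypothetical):

```python
def entry(matrix, i, j):
    """Return the entry in row i, column j, counting from 1 as in the text."""
    # Python lists count from 0, so shift both subscripts down by one.
    return matrix[i - 1][j - 1]

# An illustrative 2 x 3 matrix; the values are made up.
C = [
    [5, -1, 0],
    [2, 7, 4],
]

print(entry(C, 2, 3))  # 4, the entry in the 2nd row and 3rd column
```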
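Using the lower case notation we can denote a general $n \times m$ matrix, A, as follows,

$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1m} \\ a_{21} & a_{22} & \cdots & a_{2m} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nm} \end{bmatrix} \qquad \text{or} \qquad A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1m} \\ a_{21} & a_{22} & \cdots & a_{2m} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nm} \end{bmatrix}_{n \times m}$$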
We don’t generally subscript the size of the matrix as we did
in the second case, but on occasion it may be useful to make the size clear and
in those cases we tend to subscript it as shown in the second case.
The notation above for a general matrix is fairly cumbersome so we’ve also got some much more compact notation that we’ll use when we can. When possible we’ll use the following to denote a general matrix.

$$[a_{ij}]_{n \times m} \qquad\qquad [a_{ij}] \qquad\qquad A_{n \times m}$$
The first two we tend to use when we need to talk about the general entry of a matrix (such as in certain formulas) but don’t really care what that entry is. We’ll denote the size if it’s important or needed for whatever we’re doing, but otherwise we won’t bother with the size. The third notation is really nothing more than the standard notation with the size denoted. We’ll use this only when we need to talk about a matrix and the size is important but the entries aren’t. We won’t run into this one too often, but we will on occasion.
We will be dealing extensively with column and row matrices in later chapters/sections so we need to take care of some notation for those. These are the main exception to the upper case/lower case convention we adopted earlier for matrices and their entries. Column and row matrices tend to be denoted with a lower case letter that has either been bolded or has an arrow over it as follows,

$$\mathbf{a} = \vec{a} = \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{bmatrix} \qquad\qquad \mathbf{b} = \vec{b} = \begin{bmatrix} b_1 & b_2 & \cdots & b_m \end{bmatrix}$$
In written documents, such as this, column and row matrices
tend to be in bold face while on the chalkboard of a classroom they tend to get
arrows written over them since it’s often difficult on a chalkboard to
differentiate a letter that’s in bold from one that isn’t.
Also, notice that with column and row matrices the entries are still denoted with lower case letters that match the letter that represents the matrix. In this case, since there is either a single column or a single row, there is no reason to double subscript the entries.
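Next we need to get a quick definition out of the way for square matrices. Recall that a square matrix is a matrix whose size is $n \times n$ (i.e. it has the same number of rows as columns). In a square matrix the entries $a_{11}, a_{22}, \ldots, a_{nn}$ (the entries running from the upper left corner to the lower right corner of the matrix below) are called the main diagonal.

$$\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}$$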
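As a quick check on the definition, here is a small sketch that pulls out the main diagonal of a square matrix. The helper names `is_square` and `main_diagonal`, and the sample matrix, are assumptions made for this illustration only.

```python
def is_square(matrix):
    """True when the matrix has the same number of rows as columns."""
    return all(len(row) == len(matrix) for row in matrix)

def main_diagonal(matrix):
    """Return the entries a_11, a_22, ..., a_nn of a square matrix."""
    if not is_square(matrix):
        raise ValueError("the main diagonal is only defined here for square matrices")
    return [matrix[k][k] for k in range(len(matrix))]

# An illustrative 3 x 3 matrix; the values are made up.
A = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]

print(main_diagonal(A))  # [1, 5, 9]
```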

The next topic that we need to discuss in this section is
that of partitioned matrices and submatrices.
Any matrix can be partitioned into smaller submatrices simply by adding in horizontal and/or vertical lines between selected rows and/or columns.

The previous example showed three of the many possible ways to partition the matrix. We won’t be partitioning too many matrices here, but we will be doing it on occasion, so it’s a useful idea to remember. Also note that when we do partition a matrix into its column/row matrices we will generally put in the bars separating the columns/rows as we’ve done here to indicate that we’ve got a partitioned matrix.
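Partitioning is easy to mimic in code, since drawing a line between rows or columns amounts to slicing. The sketch below assumes a list-of-rows representation; the helper names `partition` and `column_matrices` and the sample matrix are hypothetical and chosen only for illustration.

```python
def partition(matrix, row_split, col_split):
    """Split a matrix into four submatrices.

    One horizontal line is drawn after row `row_split` and one vertical
    line after column `col_split` (both counted from 1).
    Returns [[top_left, top_right], [bottom_left, bottom_right]].
    """
    top, bottom = matrix[:row_split], matrix[row_split:]
    return [
        [[row[:col_split] for row in top], [row[col_split:] for row in top]],
        [[row[:col_split] for row in bottom], [row[col_split:] for row in bottom]],
    ]

def column_matrices(matrix):
    """Partition a matrix into its columns, each stored as an n x 1 matrix."""
    n_cols = len(matrix[0])
    return [[[row[j]] for row in matrix] for j in range(n_cols)]

# An illustrative 3 x 3 matrix; the values are made up.
A = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]

blocks = partition(A, 1, 2)
print(blocks[0][0])           # [[1, 2]]
print(blocks[1][1])           # [[6], [9]]
print(column_matrices(A)[0])  # [[1], [4], [7]]
```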
To close out this section we’re going to introduce a couple
of special matrices that we’ll see show up on occasion.
The first matrix is the zero matrix. The zero matrix is pretty much what the name implies. It is an $n \times m$ matrix whose entries are all zeroes. The notation we’ll use for the zero matrix is $0_{n \times m}$ for a general zero matrix or $\mathbf{0}$ for a zero column or row matrix. Here are a couple of zero matrices just so we can say we have some in the notes.
If the size of a column or row zero matrix is important we will sometimes subscript the size on those as well just to make it clear what the size is. Also, if the size of a full zero matrix is not important or is implied from the problem we will drop the size from $0_{n \times m}$ and just denote it by 0.
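A zero matrix of any size is simple to build programmatically. This is a minimal sketch assuming a list-of-rows representation; the helper name `zero_matrix` is hypothetical.

```python
def zero_matrix(n, m):
    """Return the n x m zero matrix as a list of n rows with m zeroes each."""
    # Build each row separately so that changing one row cannot affect another.
    return [[0] * m for _ in range(n)]

print(zero_matrix(2, 3))  # [[0, 0, 0], [0, 0, 0]]
print(zero_matrix(3, 1))  # [[0], [0], [0]], a zero column matrix
```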
The second special matrix we’ll look at in this section is the identity matrix. The identity matrix is a square $n \times n$ matrix usually denoted by $I_n$ or just I if the size is unimportant or clear from the context of the problem. The entries on the main diagonal of the identity matrix are all ones and all the other entries in the identity matrix are zeroes. Here are a couple of identity matrices.
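The description of the identity matrix translates directly into code: put a one wherever the row and column numbers agree and a zero everywhere else. A minimal sketch, with the helper name `identity` being an assumption made for this illustration:

```python
def identity(n):
    """Return the n x n identity matrix as a list of rows."""
    # Ones on the main diagonal (row index equals column index), zeroes elsewhere.
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

print(identity(2))  # [[1, 0], [0, 1]]
print(identity(3))  # [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
```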
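As we’ll see identity matrices will arise fairly regularly. Here is a nice theorem about the reduced row-echelon form of a square matrix and how it relates to the identity matrix.

Theorem If A is an $n \times n$ matrix then the reduced row-echelon form of A will either contain at least one row of all zeroes or it will be $I_n$, the $n \times n$ identity matrix.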
Proof: This is a simple enough theorem to prove that we may as well. Let’s suppose that B is the reduced row-echelon form of the matrix A. If B has at least one row of all zeroes we are done, so let’s suppose that B does not have a row of all zeroes. This means that every row has a leading 1 in it.
Now, we know that the leading 1 of a row must be to the right
of the leading 1 of the row immediately above it. Because we are assuming that B is square and doesn’t have any rows of
all zeroes we can actually locate each of the leading 1’s in B.
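First, let’s suppose that the leading 1 in the first row is NOT $b_{11}$ (i.e. $b_{11} = 0$). The next possible location of the leading 1 in the first row would then be $b_{12}$. So, let’s suppose that this is where the leading 1 is. Upon assuming this we can say that B must have the following form.

$$B = \begin{bmatrix} 0 & 1 & b_{13} & \cdots & b_{1n} \\ 0 & 0 & b_{23} & \cdots & b_{2n} \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & b_{n3} & \cdots & b_{nn} \end{bmatrix}$$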
Now, let’s assume the best possible scenario happens. That is, the leading 1 of each of the lower rows is exactly one column to the right of the leading 1 above it. This, however, leads us to instant problems. Because our first leading 1 is in the second column, by the time we reach the (n-1)st row our leading 1 will be in the nth column and this will in turn force the nth row to be a row of all zeroes, which contradicts our initial assumption. If you’re not sure you believe this consider the $4 \times 4$ case.

$$\begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}$$

Sure enough, there is a row of all zeroes in the 4th row.
Now, we assumed the best possible scenario for the leading 1’s in the lower rows and ran into problems. If the leading 1 jumps to the right say 2 columns (or 3 or 4, etc.) we will run into the same kind of problem, only we’ll end up with more than one row of all zeroes.
Likewise if the leading 1 in the first row is in any of $b_{13}, b_{14}, \ldots, b_{1n}$ we will have the same problem. So, in order to meet the assumption that we don’t have any rows of all zeroes we know that the leading 1 in the first row must be at $b_{11}$.
Using a similar argument to that above we can see that if
the leading 1 on any of the lower rows jumps to the right more than one column
we will have a leading 1 in the nth
column prior to hitting the nth
row. This will in turn force at least
the nth row to be a row of
all zeroes which will again contradict our initial assumption.
Therefore we know that the leading 1 in the first row is at $b_{11}$ and the only hope of not having a row of all zeroes at the bottom is to have the leading 1 of each row be exactly one column to the right of the leading 1 of the row above it. This means that the leading 1 in the second row must be at $b_{22}$, the leading 1 in the third row must be at $b_{33}$, etc. Eventually we’ll hit the nth row and in this row the leading 1 must be at $b_{nn}$.
Therefore the leading 1’s of B must be on the diagonal and because B is the reduced row-echelon form of A we also know that all the entries above and below the leading 1’s must be zeroes. This, however, is exactly $I_n$. Therefore, if B does not have a row of all zeroes in it then we must have that $B = I_n$.
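The dichotomy in this argument can also be observed numerically. The sketch below is not the method of these notes; it is one possible Gauss-Jordan style row reduction written with exact fractions, and the helper names `rref`, `has_zero_row` and `is_identity` as well as the sample matrices are assumptions made for illustration. For each square matrix tried, the reduced row-echelon form either has a row of all zeroes or is the identity matrix.

```python
from fractions import Fraction

def rref(matrix):
    """Return the reduced row-echelon form, using exact rational arithmetic."""
    M = [[Fraction(x) for x in row] for row in matrix]
    n_rows = len(M)
    n_cols = len(M[0]) if M else 0
    pivot_row = 0
    for col in range(n_cols):
        if pivot_row == n_rows:
            break
        # Look for a row at or below pivot_row with a nonzero entry in this column.
        pivot = next((r for r in range(pivot_row, n_rows) if M[r][col] != 0), None)
        if pivot is None:
            continue  # no leading 1 in this column
        M[pivot_row], M[pivot] = M[pivot], M[pivot_row]
        # Scale the row so that its leading entry becomes 1.
        lead = M[pivot_row][col]
        M[pivot_row] = [x / lead for x in M[pivot_row]]
        # Clear every other entry in the column, above and below the leading 1.
        for r in range(n_rows):
            if r != pivot_row and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[pivot_row])]
        pivot_row += 1
    return M

def has_zero_row(M):
    """True when at least one row consists entirely of zeroes."""
    return any(all(x == 0 for x in row) for row in M)

def is_identity(M):
    """True when M is square with ones on the main diagonal and zeroes elsewhere."""
    n = len(M)
    return all(len(row) == n for row in M) and all(
        M[i][j] == (1 if i == j else 0) for i in range(n) for j in range(n)
    )

A1 = [[2, 1], [1, 3]]  # illustrative matrix whose reduced form is the identity
A2 = [[1, 2], [2, 4]]  # second row is twice the first, so a zero row appears

for A in (A1, A2):
    B = rref(A)
    print(is_identity(B), has_zero_row(B))
```

Using `Fraction` avoids the rounding that floating point arithmetic would introduce, so the comparisons against 0 and 1 are exact.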
