A matrix is a rectangular array of numbers or variables, organised into rows and columns; the number of rows and the number of columns form its dimensions.
An \((n \times m)\) matrix has \(n\) rows and \(m\) columns. Note that, while it is possible that \(n=m\), it is also possible that \(n \neq m\). When \(n=m\), we say that the matrix is a square matrix.
Suppose that \(A\) is an \((n \times m)\) matrix and that \(c\) is a constant (that is, a scalar). The scalar pre-product of the constant \(c\) with the matrix \(A\) is given by
\[\begin{split}
\begin{aligned}
c A & =c\left(\begin{array}{ccccc}
a_{11} & a_{12} & a_{13} & \cdots & a_{1 m} \\
a_{21} & a_{22} & a_{23} & \cdots & a_{2 m} \\
a_{31} & a_{32} & a_{33} & \cdots & a_{3 m} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
a_{n 1} & a_{n 2} & a_{n 3} & \cdots & a_{n m}
\end{array}\right) \\[20pt]
& =\left(\begin{array}{ccccc}
c a_{11} & c a_{12} & c a_{13} & \cdots & c a_{1 m} \\
c a_{21} & c a_{22} & c a_{23} & \cdots & c a_{2 m} \\
c a_{31} & c a_{32} & c a_{33} & \cdots & c a_{3 m} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
c a_{n 1} & c a_{n 2} & c a_{n 3} & \cdots & c a_{n m}
\end{array}\right)
\end{aligned}
\end{split}\]
The scalar post-product of the matrix \(A\) with the constant \(c\) is given by
\[\begin{split}
\begin{aligned}
A c & =\left(\begin{array}{ccccc}
a_{11} & a_{12} & a_{13} & \cdots & a_{1 m} \\
a_{21} & a_{22} & a_{23} & \cdots & a_{2 m} \\
a_{31} & a_{32} & a_{33} & \cdots & a_{3 m} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
a_{n 1} & a_{n 2} & a_{n 3} & \cdots & a_{n m}
\end{array}\right) c \\[20pt]
& =\left(\begin{array}{ccccc}
c a_{11} & c a_{12} & c a_{13} & \cdots & c a_{1 m} \\
c a_{21} & c a_{22} & c a_{23} & \cdots & c a_{2 m} \\
c a_{31} & c a_{32} & c a_{33} & \cdots & c a_{3 m} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
c a_{n 1} & c a_{n 2} & c a_{n 3} & \cdots & c a_{n m}
\end{array}\right)
\end{aligned}
\end{split}\]
Note that \(c A=A c\). As such, we can just talk about the scalar product of a constant with a matrix, without specifying the order in which the multiplication takes place.
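As a quick numerical sketch (using NumPy, which these notes do not otherwise assume), we can confirm that the pre-product and post-product of a scalar with a matrix coincide entry by entry:

```python
import numpy as np

# An arbitrary (2 x 3) matrix and an arbitrary scalar constant
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
c = 2.5

pre = c * A   # the scalar pre-product  c A
post = A * c  # the scalar post-product A c

# The two products are identical, so the order does not matter
assert np.array_equal(pre, post)
```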
The following examples come from [Asano, 2013] (pp. 222-224).
Note that \(A+B=B+A\). (Exercise: Convince yourself of the validity of this claim.)
Example
Suppose that \(A\) is an \((m \times n)\) matrix, \(B\) is an \((n \times m)\) matrix and \(C\) is an \((n \times p)\) matrix, where \(m \neq n, m \neq p\) and \(n \neq p\).
Neither the matrix sum \(A+B\) nor the matrix sum \(B+A\) is defined.
Neither the matrix sum \(A+C\) nor the matrix sum \(C+A\) is defined.
Neither the matrix sum \(B+C\) nor the matrix sum \(C+B\) is defined.
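A small NumPy sketch (the library choice is ours, not the notes') illustrating both points: matrix addition is commutative when the dimensions match, and undefined when they do not:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

# Same dimensions: the sum is defined and A + B = B + A
assert np.array_equal(A + B, B + A)

# Different dimensions: the sum of a (2 x 2) and a (2 x 3) matrix
# is not defined, and NumPy rejects it
C = np.ones((2, 3))
mismatch_rejected = False
try:
    A + C
except ValueError:
    mismatch_rejected = True
assert mismatch_rejected
```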
The following examples come from [Asano, 2013] (pp. 222-224).
The standard matrix product is the dot, or inner, product of two matrices.
The dot product of two matrices is only defined for cases in which the number of columns of the first listed matrix is identical to the number of rows of the second listed matrix.
If the dot product is defined, the product matrix will have the same number of rows as the first listed matrix and the same number of columns as the second listed matrix.
Suppose that \(X\) is an \((m \times n)\) matrix, \(Y\) is an \((n \times m)\) matrix and \(Z\) is an \((n \times p)\) matrix, where \(m \neq n, m \neq p\) and \(n \neq p\).
The matrix product \(X Y\) is defined and will be an \((m \times m)\) matrix.
The matrix product \(Y X\) is defined and will be an \((n \times n)\) matrix.
The matrix product \(X Z\) is defined and will be an \((m \times p)\) matrix.
The matrix products \(Z X, Y Z\) and \(Z Y\) are not defined.
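The dimension rules above can be checked directly; here is an illustrative NumPy sketch with concrete (assumed) values \(m = 2\), \(n = 3\), \(p = 4\):

```python
import numpy as np

m, n, p = 2, 3, 4  # three distinct dimensions, as in the text
X = np.ones((m, n))  # an (m x n) matrix
Y = np.ones((n, m))  # an (n x m) matrix
Z = np.ones((n, p))  # an (n x p) matrix

assert (X @ Y).shape == (m, m)  # X Y is defined and (m x m)
assert (Y @ X).shape == (n, n)  # Y X is defined and (n x n)
assert (X @ Z).shape == (m, p)  # X Z is defined and (m x p)

# Z X is not defined: Z has p columns but X has m rows, and p != m
zx_undefined = False
try:
    Z @ X
except ValueError:
    zx_undefined = True
assert zx_undefined
```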
Suppose that \(A\) is an \((n \times m)\) matrix that takes the following form:
The matrix product \(B A\) is undefined because the number of columns in \(B\) (which is three) does not equal the number of rows in \(A\) (which is two).
Suppose that a consumer whose preferences are defined over bundles of \(L\) commodities faces a price vector given by the row vector \(p=\left(p_{1}, p_{2}, \cdots, p_{L}\right)\) and chooses to purchase the quantities of each commodity that are given by the column vector
Suppose that \(A\) is an \((n \times m)\) matrix. The transpose of the matrix \(A\), which is denoted by \(A^{T}\), is the \((m \times n)\) matrix that is formed by taking the rows of \(A\) and turning them into columns, without changing their order. In other words, the \(i\)th column of \(A^{T}\) is the \(i\)th row of \(A\). This also means that the \(j\)th row of \(A^{T}\) is the \(j\)th column of \(A\).
Suppose that \(A\) is the \((n \times m)\) matrix that takes the following form:
In general, \(A^{T} \neq A\). There are two reasons for this:
First, unless \(A\) is a square matrix (that is, unless it has the same number of rows and columns), the dimensions of the matrix \(A^{T}\) will be different to the dimensions of the matrix \(A\).
Second, even if \(A\) is a square matrix, in general the \(i\)th row of \(A\) will not be identical to the \(i\)th column of \(A\). As such, in general we will have \(A^{T} \neq A\) even for square matrices.
If it is the case that \(A^{T}=A\), then we say that \(A\) is a symmetric matrix.
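Both reasons, and the definition of symmetry, can be seen in a short NumPy sketch (the particular matrices are illustrative choices of ours):

```python
import numpy as np

# Reason 1: a non-square matrix changes dimensions under transposition
A = np.array([[1, 2, 3],
              [4, 5, 6]])        # (2 x 3)
assert A.T.shape == (3, 2)       # A^T is (3 x 2), so A^T != A

# Reason 2: even a square matrix is generally not symmetric
B = np.array([[1, 2],
              [3, 4]])
assert not np.array_equal(B.T, B)

# A symmetric matrix satisfies S^T = S
S = np.array([[1, 7],
              [7, 4]])
assert np.array_equal(S.T, S)
```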
The \((n \times m)\) null matrix is the ADDITIVE identity matrix for the space of all \((n \times m)\) matrices. This means that if \(A\) is an \((n \times m)\) matrix, then \(A+0=0+A=A\).
Suppose that \(A\) is an \((n \times m)\) matrix and 0 is the \((n \times m)\) null matrix.
The \((n \times m)\) matrix \(B\) is the additive inverse of \(A\) if and only if \(A+B=B+A=0\).
Suppose that
An identity matrix is a square matrix that has ones on the main (north-west to south-east) diagonal and zeros everywhere else. For example, the \((2 \times 2)\) identity matrix is
Only square matrices have any chance of having a multiplicative inverse. Some, but not all, square matrices will have a multiplicative inverse. Suppose that \(A\) is an \((n \times n)\) matrix and \(I\) is the \((n \times n)\) identity matrix.
The \((n \times n)\) matrix \(B\) is the multiplicative inverse (usually just referred to as the inverse) of \(A\) if and only if \(A B=B A=I\).
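A NumPy sketch of the definition (the matrices are illustrative choices; `np.linalg.inv` computes the inverse numerically, so the checks hold up to floating-point rounding):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 1.0]])   # a non-singular (2 x 2) matrix
I = np.eye(2)                # the (2 x 2) identity matrix

B = np.linalg.inv(A)         # the multiplicative inverse of A

# A B = B A = I, up to floating-point rounding
assert np.allclose(A @ B, I)
assert np.allclose(B @ A, I)

# A singular matrix has no inverse; NumPy raises an error
singular = np.array([[1.0, 2.0],
                     [2.0, 4.0]])  # second row is twice the first
no_inverse = False
try:
    np.linalg.inv(singular)
except np.linalg.LinAlgError:
    no_inverse = True
assert no_inverse
```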
A square matrix that has an inverse is said to be non-singular.
A square matrix that does not have an inverse is said to be singular.
We will talk about methods for determining whether or not a matrix is non-singular later in this unit.
We will talk about methods for finding an inverse matrix, if it exists, later in this unit.
Useful fact: “The transpose of the inverse is equal to the inverse of the transpose”.
If \(A\) is a non-singular square matrix whose multiplicative inverse is \(A^{-1}\), then we have \(\left(A^{-1}\right)^{T}=\left(A^{T}\right)^{-1}\).
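This useful fact is easy to verify numerically for an illustrative non-singular matrix of our choosing:

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [2.0, 5.0]])   # non-singular and non-symmetric

lhs = np.linalg.inv(A).T     # (A^{-1})^T : transpose of the inverse
rhs = np.linalg.inv(A.T)     # (A^T)^{-1} : inverse of the transpose

# The two agree up to floating-point rounding
assert np.allclose(lhs, rhs)
```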
A matrix \(A\) is said to be idempotent if and only if \(A A=A\).
Clearly a NECESSARY condition for a matrix \(A\) to be idempotent is that \(A\) be a square matrix.
Exercise: Explain why
However, this is NOT a SUFFICIENT condition for a matrix to be idempotent. In general, \(A A \neq A\), even for square matrices.
Two examples of idempotent matrices that you have already encountered are square null matrices and identity matrices.
We will shortly encounter two more examples. These are the Hat matrix \((P)\) and the residual-making matrix \((M=I-P)\) from statistics and econometrics.
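The two examples already encountered can be checked directly in NumPy (an illustrative choice of tool), along with a generic square matrix that fails the condition:

```python
import numpy as np

n = 3
Z = np.zeros((n, n))  # a square null matrix
I = np.eye(n)         # an identity matrix

# Both satisfy the idempotency condition A A = A
assert np.array_equal(Z @ Z, Z)
assert np.array_equal(I @ I, I)

# But a generic square matrix is not idempotent: A A != A in general
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
assert not np.array_equal(A @ A, A)
```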
Econometric application: the classical linear regression model
One of the simplest models that you will encounter in statistics and econometrics is the classical linear regression model (CLRM). This model takes the form
\[
Y=X \beta+\epsilon
\]
where \(Y\) is an \((n \times 1)\) vector of \(n\) observations on a single dependent variable, \(X\) is an \((n \times k)\) matrix of \(n\) observations on \(k\) independent variables, \(\beta\) is a \((k \times 1)\) vector of unknown parameters and \(\epsilon\) is an \((n \times 1)\) vector of random disturbances.
In the CLRM, the joint distribution of the random disturbances, conditional on \(X\), is given by
\[
\epsilon \mid X \sim N\left(0, \sigma^{2} I\right)
\]
where 0 is an \((n \times 1)\) null vector, \(I\) is an \((n \times n)\) identity matrix and \(\sigma^{2}\) is an unknown parameter.
The ordinary least squares estimator (and, in the case of the CLRM, maximum likelihood estimator) of the parameter vector \(\beta\) in the CLRM is given by
\[
b=\left(X^{T} X\right)^{-1} X^{T} Y
\]
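As a sketch (the data-generating values below are assumptions of ours, not part of the notes), the OLS formula can be applied to simulated data and recovers the true parameter vector up to sampling error:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a CLRM: n observations, k regressors (including a constant)
n, k = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
beta = np.array([1.0, 2.0, -0.5])           # assumed true parameters
eps = rng.normal(scale=0.1, size=n)          # random disturbances
Y = X @ beta + eps

# OLS estimator b = (X^T X)^{-1} X^T Y
b = np.linalg.inv(X.T @ X) @ X.T @ Y

assert b.shape == (k,)
assert np.allclose(b, beta, atol=0.1)  # close to the true beta
```

In practice one would solve the normal equations with `np.linalg.solve(X.T @ X, X.T @ Y)` rather than forming the inverse explicitly, but the formula above mirrors the text.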
The hat matrix for the CLRM is given by
\[
P=X\left(X^{T} X\right)^{-1} X^{T}
\]
The residual-making matrix for the CLRM is given by
\[
M = I - P = I - X\left(X^{T} X\right)^{-1} X^{T}
\]
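A numerical sketch (with an assumed simulated design matrix) confirming that both \(P\) and \(M = I - P\) are idempotent, as claimed earlier, and that \(M\) annihilates the columns of \(X\):

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 50, 2
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # (n x k)

P = X @ np.linalg.inv(X.T @ X) @ X.T  # the hat matrix
M = np.eye(n) - P                     # the residual-making matrix

# Both matrices are symmetric and idempotent (up to rounding)
assert np.allclose(P @ P, P)
assert np.allclose(M @ M, M)
assert np.allclose(P.T, P)

# M X = 0, so residuals e = M Y are orthogonal to the regressors
assert np.allclose(M @ X, np.zeros((n, k)))
```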
We build the definition of the determinant of larger matrices from the \(2 \times 2\) case. Think of the next definitions as an 'induction step'.
Definition
Consider an \(n \times n\) matrix \(A\).
Let \(A_{ij}\) denote the \((n-1) \times (n-1)\) submatrix of \(A\) obtained by deleting the \(i\)-th row and \(j\)-th column of \(A\). Then
the \((i,j)\)-th minor of \(A\), denoted \(M_{ij}\), is
\[
M_{ij} = \det(A_{ij})
\]
the \((i,j)\)-th cofactor of \(A\), denoted \(C_{ij}\), is
\[
C_{ij} = (-1)^{i+j} M_{ij}
\]
If some row or column of \(A\) is added to another one after being multiplied by a scalar \(\alpha \ne 0\),
then the determinant of the resulting matrix is the same as the determinant of \(A\).
In other words, the determinant is invariant under elementary row or column operations of type 3 (see next lecture).
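This invariance is easy to verify numerically for an illustrative matrix (our choice) and an arbitrary scalar \(\alpha \ne 0\):

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])

# Type-3 operation: add alpha times row 0 to row 1
alpha = 4.0
B = A.copy()
B[1, :] += alpha * B[0, :]

# The determinant is unchanged (up to floating-point rounding)
assert np.isclose(np.linalg.det(A), np.linalg.det(B))
```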