Announcements & Reminders
Online Test on Monday April 22
Next two lectures will be given by my postdoc Esben Scriver Andersen
📖 Systems of linear equations#
⏱ | words
Sources and reading guide
[Sydsæter and Hammond, 2006] Chapter 15, in particular 15.6 [Sydsæter and Hammond, 2006] Chapter 16, in particular 16.5, 16.6
[Anton, 1987]: Chapters 1-3 (pp. 1-142).
[Asano, 2013]: Appendix A (pp. 218-242).
[Basilevsky, 1983]: Chapters 1-4 (pp. 1-187).
[Bradley, 2008]: Chapter 9 (pp. 475-536).
[Chiang and Wainwright, 2005]: Chapters 4 and 5.
[Haeussler Jr and Paul, 1987]: Chapter 8 (pp. 238-309).
[Shannon, 1995]: Chapters 4 and 5 (pp. 121-230).
[Simon and Blume, 1994]: Chapters 6-11 and 26-28 (pp. 107-250 and 719-799).
What is a system of linear equations#
Definition
A general system of linear equations can be written as
More often we write the system in matrix form
A solution to the system of equations is a vector \(x\) that satisfies \(Ax=b\).
To solve the system of equation, similar to solving an equation, is to find all elements of the set of the solutions, or establish that this set is empty.
Gaussian elimination#
Gaussian elimination is a simple manual algorithm for solving linear systems
Think long division: you can do solve it using computer or calculator, but handy to be able to do it by hand too
Definition
Augmented matrix notation for the system of linear equations has an extra column made by the RHS of the system:
Gaussian elimination is a method for solving systems of linear equations which consists of performing operations on the rows of the augmented matrix that represents the system.
The goal is to transform this matrix into an upper triangular or row-echelon form
From there, solutions can be easily obtained through back substitution
Definition: Elementary row operations
Let \(R_i\) denote a row of the augmented matrix of the system
Swap two rows \(R_i \leftrightarrow R_j\)
Multiply a row by a nonzero scalar \(C \cdot R_i \rightarrow R_i\), \(C>0\)
Add a multiple of one row to another row \(kR_i + R_j \rightarrow R_j\)
Example
Let’s solve the following system of linear equations:
The augmented matrix corresponding to the system is:
Perform elementary row operations. Goal: Form an upper triangular matrix.
Step 1: Eliminate \(x\) from the second row \(R_2 - 3R_1 \rightarrow R_2\)
Step 2: Eliminate \(x\) from the third rows \(R_3 - 2R_1 \rightarrow R_3\)
Step 3: Make the pivot in the second row \(-\frac{1}{3}R_2 \rightarrow R_2\)
Step 4: Make the pivot in the third row \(\frac{1}{2}R_3 \rightarrow R_3\)
(\(\star\)) Now we have a triangular matrix, and we can perform back substitution to find the values of \(z\), \(y\), and \(x\) in return.
The solution is \((x,y,z) = (2,1,1)\)
We could continue elementary row operations after (\(\star\)) to get the identity matrix in the first block. These steps are referred to as Gauss-Jordan elimination steps
Step 5: Eliminate \(z\) in the second row \(R_2 + R_3 \rightarrow R_2\)
Step 6: Eliminate \(z\) in the first row \(R_1 - R_3 \rightarrow R_1\)
Step 6: Eliminate \(y\) in the first row \(R_1 - 2 R_2 \rightarrow R_1\)
At this stage the system is trivial, each equation contains a single variable with 1 as a coefficient. We can just read out the solution \((x,y,z) = (2,1,1)\)
Definition
A matrix is said to be in row echelon form if it satisfies the following conditions:
all rows having only zero entries are at the bottom of the matrix
leading entry (first non-zero entry) of each row is to the right of the leading entry of the row above it
A matrix is said to be in reduced row echelon form if it satisfies one additional condition:
the leading entry in each row is 1
each column containing a leading 1 has all other entries equal to zero
The first nonzero entry of each row is called a pivot
Gaussian elimination leads to the row echelon form of the matrix
Gaussian-Jordan elimination leads to the reduced row echelon form of the matrix
Fact
Every matrix has a unique reduced row echelon form.
Fact
Every matrix can be transformed to echelon and reduced echelon form using Gaussian elimination.
The difference between row echelon and upper-triangular:
upper-triangular: all entries below the diagonal are zero, square matrix (right)
row echelon: defined for any shape, zero rows must be on the bottom (left)
System with no solutions#
Example
The last row corresponds to the equation \(0 = 1/6\), which is a contradiction. Therefore, the system has no solutions.
Fact
If the last row of the row-echelon form of the augmented matrix corresponds to the equation \(0 = c\), where \(c \neq 0\), then the system of linear equations has no solutions.
System with non-unique solutions#
Example: no unique solution
Note that the last row with all zero places no restriction on \(z\) which we can let equal to parameter \(t\), \(z = t\).
from \(R_2\): \(y = t\)
from \(R_1\): \(x = 5 - 3 t\)
Thus, the solution in parametric form:
where \(t\) is a parameter.
Fact
The variables corresponding to the columns without a pivot in the row-echelon form of the augmented matrix referred to as free variables play the role of parameters in the non-unique solution of the system of linear equations.
Inverse by Gaussian row operations#
In order to find the inverse of the \(P\) matrix, we make the following argument. By definition, \(PP^{-1} = I\), and therefore the columns of the unknown matrix \(P^{-1}\) (denoted below \(p'_i\), \(i=1,2,3\)) are solutions of the following three systems of equations:
We can find the solutions of all three systems by Gaussian elimination performing elementary row operations on an ``extra’’ augmented matrix with three columns in place of the right-hand side.
Example
Let’s find the inverse of the matrix
Therefore, the inverse of the \(P\) matrix is given by
Solving a linear system using matrix inverse#
Assuming that the matrix \(A\) is invertible, the solution to the system of linear equations \(Ax=b\) can be found by multiplying both sides by the inverse of \(A\):
Example
Determinants under Gaussian elimination#
Fact
Elementary operations on the rows of a square matrix have the following effect on its determinant:
Swap two rows \(R_i \leftrightarrow R_j\) \(\Longrightarrow\) Determinant changes sign
Multiply a row by a nonzero scalar \(C \cdot R_i \rightarrow R_i\), \(C>0\) \(\Longrightarrow\) Determinant is multiplied by the same scalar \(C\)
Add a multiple of one row to another row \(kR_i + R_j \rightarrow R_j\) \(\Longrightarrow\) Determinant does not change!
Example
Inverse by determinants#
Recall from the previous lecture
Definition
Consider an \(n \times n\) matrix \(A\). Denote \(A_{ij}\) a \((n-1) \times (n-1)\) submatrix of \(A\), obtained by deleting the \(i\)-th row and \(j\)-th column of \(A\). Then
the \((i,j)\)-th minor of \(A\) denoted \(M_{ij}\) is
the \((i,j)\)-th cofactor of \(A\) denoted \(C_{ij}\)
Definition
The adjoint of an \((n \times m)\) matrix \(A\) (adjugate of \(A\)), denoted \(\mathrm{adj}(A)\) is the matrix of cofactors of \(A\), transposed:
Fact
Any square matrix \(A\) with a non-zero determinant \(\det A\), has a unique inverse \(A^{-1}\) which satisfies \(AA^{-1}=A^{-1}A=I\). It is given by
if \(\det A = 0\) matrix \(A\) does not have an inverse
Example
Therefore the matrix \(A\) does not have an inverse.
Example
Solving linear systems by determinants#
Fact: Cramer’s Rule
A system of linear equations \(Ax=b\) where \(A\) is \((n \times n)\) matrix has a unique solution if and only \(A\) is nonsingular (\(\det A \neq 0\)).
The solution is given by
where \(D_i\) is the determinant of the matrix obtained by replacing the \(i\)-th column of \(A\) by the column vector \(b\)
Proof
Idea of the proof: use the determinant formula for the inverse of \(A\), solve the system by the inverse, and simplify
Example
Therefore, the solution is \((x,y,z) = (\frac{D_1}{19},\frac{D_2}{19},\frac{D_3}{19}) = (1,-2,2)\)
Notes from the lecture#
Further reading and self-learning
Excellent visualizations of concepts partially covered in this lecture, strongly recommended for further study 3Blue1Brown: Essence of linear algebra
Wiki resources: Gaussian elimination performing elementary row operations