Theory¶
Linear Transformations and Basis Vectors¶
Every matrix represents a linear transformation on a vector space. Usually we represent vector as a linear combination of the unit vectors $\hat{i}, \ \hat{j} \ \text{and} \ \hat{k}$. Hence this basis is written as
$\begin{bmatrix}
1 \ \ 0 \ \ 0 \\
0 \ \ 1 \ \ 0 \\
0 \ \ 0 \ \ 1 \\
\end{bmatrix}
$ (the identity matrix). Let's understand this with an example.
Suppose we have a basis $A$ given by the vectors $(2\hat{i}+3\hat{j})$ and $(4\hat{i}-\hat{j})$ and we want to represent the vector $\vec{v} = (3\hat{i}+2\hat{j})$ in basis $A$.
Let $\vec{v}$ be represented as $<X,Y>$ in basis $A$.
$$
\therefore X(2\hat{i}+3\hat{j})+Y(4\hat{i}-\hat{j}) = (3\hat{i}+2\hat{j})\\
\implies (2X+4Y)\hat{i}+(3X-Y)\hat{j} = (3\hat{i}+2\hat{j})
$$
This can also be written as
$$ \begin{bmatrix} 2 & 4 \\ 3 & -1 \\ \end{bmatrix} \begin{bmatrix} X \\ Y \end{bmatrix} = \begin{bmatrix} 3 \\ 2 \end{bmatrix} \\ \implies \begin{bmatrix} X \\ Y \end{bmatrix} = \begin{bmatrix} 2 & 4 \\ 3 & -1 \\ \end{bmatrix}^{-1} \begin{bmatrix} 3 \\ 2 \end{bmatrix} = \begin{bmatrix} 0.79 \\ 0.36 \end{bmatrix} $$ So the vector $\vec{v}$ represented in basis $A$ as $<0.79, 0.36>$.
Eigen values & vectors¶
A general vector changes both direction and magnitude when transformed by a matrix. But some special vectors only get stretched or compressed, without changing direction. Such vectors are eigen-vectors. And the scalar by which they get compressed or extended is called eien-value. Formally: $$ Av=\lambda v $$
- $v\neq 0$ is an eigenvector.
- $\lambda$ is its eigenvalue.
This equation shows:
The transformation acts like scaling along that vector.
Eigenvectors provide the natural coordinate axes in which the transformation becomes easiest to describe.
How to calculate eigen values & vectors?¶
Observe equation 3. $$Av=\lambda v\implies Av-\lambda v=0\\\implies (A-\lambda I)v=0\implies \text{det}(A-\lambda I)=0\\\therefore \begin{bmatrix} (x_{11}-\lambda) & x_{12} & x_{13} & \dots & x_{1n} \\ x_{21} & (x_{22}-\lambda) & x_{23} & \dots & x_{2n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ x_{m1} & x_{m2} & x_{m3} & \dots & (x_{mm}-\lambda) \end{bmatrix}=0$$
On expanding this, we get the characteristic polynomial of the matrix. On solving this characteristic polynomial, we get the eigen-values of the matrix. Each eigen-value corresponds to an unique eigen vector.
How to find the characteristic polynomial?¶
Let $A$ be an $n \times n$ matrix. The characteristic polynomial is: $$ p(\lambda) = \det(\lambda I - A) = \lambda^{n} + c_{1}\lambda^{n-1} + c_{2}\lambda^{n-2} + \cdots + c_{n} $$ We compute the coefficients $c_k$ using trace and principal minors.
Step-by-Step Algorithm¶
Step-1 — Initialize the polynomial¶
Start with: $c_0 = 1$ because the leading term of the characteristic polynomial is always $\lambda^n$.
Step-2 — Compute the trace for the first coefficient¶
The coefficient of $\lambda^{n-1}$ is: $c_1 = -\mathrm{trace}(A)$
This is the negative sum of the diagonal elements of $A$, which is equal to the sum of all of its eigen-vectors.
Step-3 — Compute principal minors of higher orders¶
For every $k = 2, 3, \ldots, n-1$:
- Take all combinations of $k$ distinct row/column indices $(i_1, i_2, \ldots, i_k)$
- For each such combination, form the principal minor.
- Compute the determinant of each such $k\times k$ minor.
- Add all of them together. This is coefficient of $\lambda^k$.
Step-4 — Compute the determinant of the entire matrix¶
The last coefficient (constant term) is the product of all the eigen-vectors $=$ determinant of the entire matrix.
Function to compute determinant¶
from typing import List
def determinant(matrix: List[List[float]]) -> float:
if len(matrix) != len(matrix[0]):
raise ArithmeticError("Parameter must be a square matrix")
if len(matrix) == 1:
return matrix[0]
if len(matrix) == 2:
return matrix[0][0]*matrix[1][1]-matrix[0][1]*matrix[1][0]
factor = 1.0
# copying the contents of matrix in a seperate array
x = [row[:] for row in matrix]
while len(x)>2:
# in case the first element of the matrix is 0
# we swap the row with another row whose first element is not 0
if x[0][0] == 0.0:
for i in range(len(x)):
# if we find such a row, we swap it with the first row and the factor becomes -1
if x[i][0] != 0:
for j in range(len(x)):
temp = x[i][j]
x[i][j] = x[0][j]
x[0][j] = temp
factor *= -1.0
break
# if we reach the end of the matrix and still don't find a suitable row, we return 0
if x[i][0] == 0 and i==len(x)-1:
return 0
factor *= x[0][0]
arr = []
for i in range(1,len(x)):
tempArr = []
for j in range(1,len(x)):
tempArr.append(x[i][j]-x[0][j]*x[i][0]/x[0][0])
arr.append(tempArr[:])
x = [row[:] for row in arr]
# at the end of the loop the size of x is always 2, so we can directly compute the result
return factor*(x[0][0]*x[1][1]-x[0][1]*x[1][0])
Function to calculate the characteristic polynomial¶
def minor(matrix, i, j):
"""Return the minor of the matrix after removing row i and column j."""
return [row[:j] + row[j+1:] for idx, row in enumerate(matrix) if idx != i]
def characteristic_polynomial(matrix):
"""
Generates the characteristic polynomial coefficients of a matrix.
Returns coefficients for:
det(λI - A) = λ^n + c1 λ^(n-1) + ... + cn
Where:
c1 = -trace(A)
cn = (-1)^n det(A)
"""
n = len(matrix)
# λ^n coefficient is always 1
coeffs = [1.0]
# c1 = -trace
trace = float(sum(matrix[i][i] for i in range(n)))
coeffs.append(-trace)
# Remaining coefficients require computing principal minors of each order
# a_k = (–1)^k * (sum of all principal minors of order k)
for k in range(2, n+1):
from itertools import combinations
total = 0.0
for rows in combinations(range(n), k):
sub = []
for r in rows:
sub.append([matrix[r][c] for c in rows])
total += determinant(sub)
coeffs.append(((-1)**k) * total)
return coeffs
Let's take a simple example. Find the characteristic polynomial of the matrix $A=\begin{bmatrix} 5 & 4 & 2\\ 1 & 3 & 1\\ 2 & 1 & 5\\ \end{bmatrix}$.
A = [
[5.0, 4.0, 2.0],
[1.0, 3.0, 1.0],
[2.0, 1.0, 5.0]
]
characteristic_polynomial(A)
[1.0, -13.0, 46.0, -48.00000000000001]
$\therefore$ The characteristic polynomial is $\lambda^3-13\lambda^2+46\lambda-48=0$.
On solving this equation, we get the eigen values as $\lambda_1=2, \lambda_2=3, \lambda_3=8$.
How to find eigen-vectors from eigen-values?¶
- For any given eigen-value $\lambda$, compute $A-\lambda I$.
- Multiply the matrices $A-\lambda I$ and $[x, y, z]^T$.
Example¶
Let's consider the eigen value $\lambda_2=3$.
$
\\\therefore \begin{bmatrix}
5-3 & 4 & 2\\
1 & 3-3 & 1\\
2 & 1 & 5-3\\
\end{bmatrix} = \begin{bmatrix}
2 & 4 & 2\\
1 & 0 & 1\\
2 & 1 & 2\\
\end{bmatrix}
\\\therefore \begin{bmatrix}
2 & 4 & 2\\
1 & 0 & 1\\
2 & 1 & 2\\
\end{bmatrix}\begin{bmatrix}
x\\
y\\
z\\
\end{bmatrix} = \begin{bmatrix}
0\\
0\\
0\\
\end{bmatrix}
$
From here we get 2 equations.
- $x+2y+z=0$
- $x+z=0$
Any vector satisfying any of these equations is the eigen-vector that corresponds to the eigen-value 3 for the basis $A$.