Graduate Quantitative Economics and Datascience
This section uses the following packages:
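One possible set of imports consistent with the code in this section (a reconstruction: the complex eigenvalues printed by eig below suggest the linear-algebra routines come from scipy.linalg, while cond and matrix_rank are from numpy.linalg):
import numpy as np
from numpy.linalg import cond, matrix_rank
from scipy.linalg import norm, det, inv, solve, eig, eigvals, eigvalsh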
Big-O Notation
For functions \(f(N)\) and \(g(N)\), we say \(f(N)\) is \(O(g(N))\) if there exist positive constants \(C\) and \(N_0\) such that:
\[ 0 \leq f(N) \leq C \cdot g(N) \quad \text{for all } N \geq N_0 \]
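As an illustrative check (this particular \(f\), \(C\), and \(N_0\) are assumed, not from the original), \(f(N) = 3N^2 + 10N\) is \(O(N^2)\) with \(C = 4\) and \(N_0 = 10\):
f = lambda N: 3 * N**2 + 10 * N            # assumed example function
C, N_0 = 4, 10                             # candidate constants
N_grid = np.arange(N_0, 1000)
print(np.all(f(N_grid) <= C * N_grid**2))  # True: bound holds for all N >= N_0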
Machine Epsilon
For a given floating-point datatype, machine epsilon is the smallest \(\delta\) for which \(1 + \delta\) is distinguishable from \(1\) in that arithmetic: \(\epsilon = \min_{\delta > 0} \left\{ \delta : 1 + \delta > 1 \right\}\)
print(f"machine epsilon for float64 = {np.finfo(float).eps}")
print(f"1 + eps/2 == 1? {1.0 + 1.1e-16 == 1.0}")
print(f"machine epsilon for float32 = {np.finfo(np.float32).eps}")
machine epsilon for float64 = 2.220446049250313e-16
1 + eps/2 == 1? True
machine epsilon for float32 = 1.1920928955078125e-07
x = np.array([1, 2, 3]) # Calculating different ways (in order of preference)
print(np.sqrt(sum(xval**2 for xval in x))) # manual with comprehensions
print(np.sqrt(np.sum(np.square(x)))) # broadcasts
print(norm(x)) # built-in; norm(x, ord=2) gives the same 2-norm
print(f"||x||_2^2 = {norm(x)**2} = {x.T @ x} = {np.dot(x, x)}")
3.7416573867739413
3.7416573867739413
3.7416573867739413
||x||_2^2 = 14.0 = 14 = 14
Solve systems with `solve` rather than with a calculation of an inverse. We can think of solving a system as finding the linear combination of columns of \(A\) that equal \(b\)
If the columns of \(A\) span \(\mathbb{R}^2\), we can solve \(A x = b\) for any \(b \in \mathbb{R}^2\) since the column space is the entire space \(\mathbb{R}^2\), as in the sketch below
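A minimal sketch of the full-column-space case (this nonsingular matrix and \(b\) are assumed for illustration):
A_full = np.array([[1, 2],
                   [3, 4]])   # assumed example with linearly independent columns
b = np.array([5, 6])          # any b in R^2 works
x = solve(A_full, b)          # solve A x = b directly, without forming inv(A)
print(A_full @ x)             # recovers b = [5, 6]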
On the other hand, note that the columns of \(A = \begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix}\) below are proportional
So we can only solve \(A x = b\) for \(b \propto \begin{bmatrix} 1 \\ 2 \end{bmatrix} \propto \begin{bmatrix} 2 \\ 4 \end{bmatrix}\)
A = np.array([[1, 2],
[2, 4]])
# An (expensive) way to check if A is singular is if det(A) = 0
print(det(A) == 0.0) # exact zero here; in general compare abs(det(A)) to a tolerance
print(matrix_rank(A) != A.shape[0]) # or check rank
# Check before inverting or use exceptions
try:
inv(A)
print("Matrix is not singular (invertible).")
except np.linalg.LinAlgError:
print("Matrix is singular (non-invertible).")
True
True
Matrix is singular (non-invertible).
eps, K = 1e-8, 100000
A = np.array([[1, 2],
              [1 + eps, 2 + eps]]) # NOT (1+eps)*[1, 2], so A is nonsingular but nearly so
print(f"det(A)={det(A):.5g}, det(K*A)={det(K*A):.5g}")
print(f"cond(A)={cond(A):.5g}, cond(K*A)={cond(K*A):.5g},")
print(f"det(inv(A))={det(inv(A)):.5g}, cond(inv(A))={cond(inv(A)):.5g}")
det(A)=-1e-08, det(K*A)=-100
cond(A)=1e+09, cond(K*A)=1e+09,
det(inv(A))=-1e+08, cond(inv(A))=1e+09
`solve` can take a matrix right-hand side, solving \(A X = B\) as well
A = np.array([[0, 2],
[3, 4]])
B = np.array([[2, 3],
[1, 2]]) # [2,1] and [3,2] as columns
# or: B = np.column_stack([np.array([2, 1]),np.array([3,2])])
X = solve(A, B) # Solve AX = B for X
print(X)
print(f"Checking: A*{X[:,0]} = {A@X[:, 0]} = {B[:,0]}, column of B")
[[-1. -1.33333333]
[ 1. 1.5 ]]
Checking: A*[-1. 1.] = [2. 1.] = [2 1], column of B
In the LU factorization \(A = P L U\), the \(P\) matrix is a permutation matrix of “pivots”; the other factors are triangular
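A small sketch with SciPy's lu (assuming scipy.linalg is available; the matrix reuses the solve(A, B) example above):
from scipy.linalg import lu
A = np.array([[0.0, 2.0],
              [3.0, 4.0]])          # same matrix as the solve(A, B) example above
P, L, U = lu(A)                     # A = P @ L @ U, with P a permutation of pivots
print(f"P =\n{P}\nL =\n{L}\nU =\n{U}")
print(np.allclose(P @ L @ U, A))    # True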
Without more structure in \(A\), the LU factorization is the algorithm behind both `solve` and `inv`
To see why triangular factors help, consider “back substitution” for
\[ \begin{aligned} U x &= b\\ U &\equiv \begin{bmatrix} 3 & 1 \\ 0 & 2 \\ \end{bmatrix}, \quad b = \begin{bmatrix} 7 \\ 2 \\ \end{bmatrix} \end{aligned} \]
Solving bottom row for \(x_2\)
\[ 2 x_2 = 2,\quad x_2 = 1 \]
Move up a row, solving for \(x_1\), substituting for \(x_2\)
\[ 3 x_1 + 1 x_2 = 7,\quad 3 x_1 + 1 \times 1 = 7,\quad x_1 = 2 \]
This generalizes to any number of rows. For a lower-triangular \(L\) the analogous procedure is “forward substitution”
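The same triangular system via SciPy's solve_triangular (not used elsewhere in this section, shown as a sketch):
from scipy.linalg import solve_triangular
U = np.array([[3.0, 1.0],
              [0.0, 2.0]])
b = np.array([7.0, 2.0])
print(solve_triangular(U, b, lower=False)) # back substitution: [2. 1.]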
Another common matrix type is the symmetric matrix, \(A = A^{T}\)
A = np.array([[1, 2],
[2, 5]])
x = np.array([0, 1]) # can't really check for all x
print(f"x^T A x = {x.T @ A @ x}")
x^T A x = 5
x^T A x = 0
The vector norm \(||x||_2\) appears in many applications, for example the least-squares problem
\[ \min_{\beta} ||y - X \beta||_2 \]
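A least-squares sketch with np.linalg.lstsq (the design matrix and data here are assumed for illustration):
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])       # assumed design matrix (intercept and one regressor)
y = np.array([1.0, 2.9, 5.1])    # assumed observations
beta, *_ = np.linalg.lstsq(X, y, rcond=None)  # minimizes ||y - X beta||_2
print(beta)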
Matrix structure or decompositions of \(A\) help us better understand the mapping \(f(x) = A x\)
For a square \(A\), an eigenvector \(x\) and eigenvalue \(\lambda\) satisfy \[ A x = \lambda x \]
An \(A\in\mathbb{R}^{N\times N}\) has \(N\) eigenvalue/eigenvector pairs (counting possible multiplicity of \(\lambda\))
Intuition: \(x\) is a direction \(A x \propto x\) and \(\lambda\) says how much it “stretches”
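A quick check of the definition (the matrix is an assumed example):
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])          # assumed example matrix
Lambda, Q = eig(A)                  # eigenvalues, and eigenvectors as columns of Q
lam, q = Lambda[0], Q[:, 0]
print(np.allclose(A @ q, lam * q))  # True: A x = lambda x for each pair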
You cannot check \(x^T A x > 0\) for all \(x\); instead check whether the “stretching” (each eigenvalue) is positive
A = np.array([[3, 2],
[2, 1]])
# A_eigs = np.real(eigvals(A)) # symmetric matrices have real eigenvalues
A_eigs = eigvalsh(A) # specialized for symmetric/hermitian matrices
print(A_eigs)
is_positive_definite = np.all(A_eigs > 0)
is_positive_semi_definite = np.all(A_eigs >= 0) # or eigvals(A) >= -eps
print(f"pos-def? {is_positive_definite}")
print(f"pos-semi-def? {is_positive_semi_definite}")
[-0.23606798 4.23606798]
pos-def? False
pos-semi-def? False
The simplest positive-semi-definite (but not posdef) matrix is
A_eigs = eigvalsh(np.array([[1, 0],
[0, 0]]))
print(A_eigs)
is_positive_definite = np.all(A_eigs > 0)
is_positive_semi_definite = np.all(A_eigs >= 0) # or eigvals(A) >= -eps
print(f"pos-def? {is_positive_definite}")
print(f"pos-semi-def? {is_positive_semi_definite}")
[0. 1.]
pos-def? False
pos-semi-def? True
A = np.array([[2, 1],
[1, 3]])
Lambda, Q = eig(A)
print(f"eigenvectors are column-by-column in Q =\n{Q}")
print(f"Q^T Q =\n{Q.T @ Q}") # should be I
print(f"Q^-1 == Q^T? {np.allclose(np.linalg.inv(Q), Q.T)}") # should be True
print(f"eigenvalues are in Lambda = {Lambda}")
print(f"Q Lambda Q^T =\n{Q @ np.diag(np.real(Lambda)) @ Q.T}")
eigenvectors are column-by-column in Q =
[[-0.85065081 -0.52573111]
[ 0.52573111 -0.85065081]]
Q^T Q =
[[1.0000000e+00 1.2127222e-17]
[1.2127222e-17 1.0000000e+00]]
Q^-1 == Q^T? True
eigenvalues are in Lambda = [1.38196601+0.j 3.61803399+0.j]
Q Lambda Q^T =
[[2. 1.]
[1. 3.]]
A = np.array([[5, 1],
[2, 3]])
Lambda, Q = eig(A)
print(f"eigenvectors are column-by-column in Q =\n{Q}")
print(f"Q^-1 == Q^T? {np.allclose(np.linalg.inv(Q), Q.T)}") # should be False
print(f"eigenvalues are in Lambda = {Lambda}")
print(f"Q Lambda inv(Q) =\n{Q @ np.diag(np.real(Lambda)) @ np.linalg.inv(Q)}")
eigenvectors are column-by-column in Q =
[[ 0.80689822 -0.34372377]
[ 0.59069049 0.9390708 ]]
Q^-1 == Q^T? False
eigenvalues are in Lambda = [5.73205081+0.j 2.26794919+0.j]
Q Lambda inv(Q) =
[[5. 1.]
[2. 3.]]
The spectral radius of \(A\), where \(\Lambda\) is its set of eigenvalues, is
\[ \rho(A) = \max_{\lambda \in \Lambda}|\lambda| \]
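A one-line sketch using the non-symmetric example above:
A = np.array([[5, 1],
              [2, 3]])
print(np.max(np.abs(eigvals(A)))) # spectral radius, approx. 5.732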
Using `inv` is a bad idea due to poor conditioning
For example, consider interpolating data \((x_n, y_n)\), \(n = 0, \ldots, N\), with a polynomial: find coefficients \(c_n\) such that
\[ \begin{aligned} c_0 + c_1 x_0 + c_2 x_0^2 + \ldots + c_N x_0^N &= y_0\\ \vdots & \\ c_0 + c_1 x_N + c_2 x_N^2 + \ldots + c_N x_N^N &= y_N\\ \end{aligned} \]
\[ A \equiv \begin{bmatrix} 1 & x_0 & x_0^2 & \ldots & x_0^N\\ \vdots & \vdots &\vdots & \vdots & \vdots\\ 1 & x_N & x_N^2 & \ldots & x_N^N \end{bmatrix} \]
\[ A c = y \]
Compare `solve` and `inv` for the interpolation system \(A c = y\) as \(N\) grows and the system becomes ill-conditioned
N = 5
x = np.linspace(0.0, 10.0, N + 1)
y = np.exp(x) # example function to interpolate
A = np.array([[x_i**n for n in range(N + 1)] for x_i in x]) # or np.vander
c = solve(A, y)
c_inv = inv(A) @ y
print(f"error = {norm(A @ c - y, np.inf)}, \
error using inv(A) = {norm(A @ c_inv - y, np.inf)}")
print(f"cond(A) = {cond(A)}")
error = 2.2737367544323206e-11, error using inv(A) = 1.367880031466484e-09
cond(A) = 564652.3214053963
N = 10
x = np.linspace(0.0, 10.0, N + 1)
y = np.exp(x) # example function to interpolate
A = np.array([[x_i**n for n in range(N + 1)] for x_i in x]) # or np.vander
c = solve(A, y)
c_inv = inv(A) @ y # Solving with inv(A) instead of solve(A, y)
print(f"error = {norm(A @ c - y, np.inf)}, \
error using inv(A) = {norm(A @ c_inv - y, np.inf)}")
print(f"cond(A) = {cond(A)}")
error = 5.802576197311282e-10, error using inv(A) = 4.1591702029109e-06
cond(A) = 4462833495403.007
N = 20
x = np.linspace(0.0, 10.0, N + 1)
y = np.exp(x) # example function to interpolate
A = np.array([[x_i**n for n in range(N + 1)] for x_i in x]) # or np.vander
c = solve(A, y)
c_inv = inv(A) @ y # Solving with inv(A) instead of solve(A, y)
print(f"error = {norm(A @ c - y, np.inf)}, \
error using inv(A) = {norm(A @ c_inv - y, np.inf)}")
print(f"cond(A) = {cond(A):.4g}")
error = 1.1368683772161603e-10, error using inv(A) = 2511.4640634151983
cond(A) = 2.381e+24
Use `solve`, which is faster and can often solve ill-conditioned problems. Rarely use `inv`, and only when you know the problem is well-conditioned
Changing the polynomial basis, e.g. to Chebyshev polynomials, can also dramatically improve conditioning
N = 40
x = np.linspace(-1, 1, N+1) # Or any other range of x values
A = np.array([[np.polynomial.Chebyshev.basis(n)(x_i) for n in range(N+1)] for x_i in x])
A_monomial = np.array([[x_i**n for n in range(N + 1)] for x_i in x]) # or np.vander
print(f"cond(A) = {cond(A):.4g}, cond(A_monomial) = {cond(A_monomial):.4g}")
cond(A) = 3.64e+09, cond(A_monomial) = 2.149e+18