Direct Methods and Matrix Factorizations (Julia)

Machine Learning Fundamentals for Economists

Jesse Perla

University of British Columbia

Overview

Motivation

  • In preparation for the ML lectures we cover some core numerical linear algebra concepts
  • Many of these are directly useful
    • e.g. solving large systems of equations or as building blocks in bigger algorithms

Packages and Materials

using LinearAlgebra, Statistics, BenchmarkTools, SparseArrays, Random
using Plots
Random.seed!(42);  # seed random numbers for reproducibility

Complexity

Basic Computational Complexity

Big-O Notation

For functions \(f(N)\) and \(g(N)\), we say \(f(N)\) is \(\mathrm{O}(g(N))\) if there exist positive constants \(C\) and \(N_0\) such that:

\[ 0 \leq f(N) \leq C \cdot g(N) \quad \text{for all } N \geq N_0 \]

  • Often crucial to know how problems scale asymptotically (as \(N\to\infty\))
  • Caution! This is only an asymptotic limit, and can be misleading for small \(N\)
    • \(f_1(N) = N^3 + N\) is \(\mathrm{O}(N^3)\)
    • \(f_2(N) = 1000 N^2 + 3 N\) is \(\mathrm{O}(N^2)\)
    • For roughly \(N>1000\) use \(f_2\) algorithm, otherwise \(f_1\)

Examples of Computational Complexity

  • Simple examples:
    • \(x \cdot y = \sum_{n=1}^N x_n y_n\) is \(\mathrm{O}(N)\) since it requires \(N\) multiplications and additions
    • \(A x\) for \(A\in\mathbb{R}^{N\times N},x\in\mathbb{R}^N\) is \(\mathrm{O}(N^2)\) since it requires \(N\) dot products, each \(\mathrm{O}(N)\) (see the timing sketch below)
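
As a rough check of these scalings (a sketch, not from the original notes), we can time both operations: the dot product grows linearly in \(N\), while the matrix-vector product grows quadratically.

using LinearAlgebra, BenchmarkTools
N = 1000
x, y = rand(N), rand(N)
A = rand(N, N)
@btime dot($x, $y)  # O(N) work
@btime $A * $x;     # O(N^2) work: doubling N roughly quadruples the time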

Computational Complexity

Ask yourself whether the following are computationally expensive operations as the matrix size increases

  • Multiplying two matrices?
    • Answer: It depends. Multiplying two diagonal matrices is trivial (see the sketch after this list).
  • Solving a linear system of equations?
    • Answer: It depends. If the matrix is the identity, the solution is the vector itself.
  • Finding the eigenvalues of a matrix?
    • Answer: It depends. The eigenvalues of a triangular matrix are the diagonal elements.
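
For instance, here is a small sketch (not from the original notes) of why structure matters: with a Diagonal matrix, both multiplication and solves reduce to elementwise operations, i.e. \(\mathrm{O}(N)\) rather than \(\mathrm{O}(N^2)\) or \(\mathrm{O}(N^3)\).

using LinearAlgebra
d = rand(5)
D = Diagonal(d)  # stores only the diagonal
x = rand(5)
D * x  # equivalent to d .* x, O(N)
D \ x  # equivalent to x ./ d, O(N)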

Numerical Precision

Machine Epsilon

For a given floating-point datatype, machine epsilon is defined as \(\epsilon = \min_{\delta > 0} \left\{ \delta : 1 + \delta > 1 \right\}\), where the addition is carried out in that datatype’s arithmetic

  • Computers have finite precision: 64-bit floats are typical, but 32-bit is common on GPUs (see the sketch below)
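A sketch (assumed, since the original cell is not shown) of code that would produce output like the following:

println("machine epsilon for float64 = ", eps(Float64))
println("1 + eps/2 == 1? ", 1 + eps(Float64) / 2 == 1)
println("machine epsilon for float32 = ", eps(Float32))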
machine epsilon for float64 = 2.220446049250313e-16
1 + eps/2 == 1? true
machine epsilon for float32 = 1.1920929e-7

Matrix Structure

Matrix Structure

  • A key principle is to ensure you don’t lose “structure”
    • e.g. if sparse, operations should keep it sparse if possible
    • If triangular, then use appropriate algorithms instead of converting back to a dense matrix
  • Key structure is:
    • Symmetry, diagonal, tridiagonal, banded, sparse, positive-definite
  • The worst operations for losing structure are matrix multiplication and inversion

Example Losing Sparsity

  • Here the density increases substantially
A = sprand(10, 10, 0.45)  # random sparse 10x10, 45 percent filled with non-zeros

@show nnz(A)  # counts the number of non-zeros
invA = sparse(inv(Array(A)))  # Julia won't invert sparse, so convert to dense with Array.
@show nnz(invA);
nnz(A) = 46
nnz(invA) = 100

Losing Tridiagonal Structure

  • An even more extreme example: a tridiagonal matrix has roughly \(3N\) nonzeros, but its inverse is dense with \(N^2\) entries
N = 5
A = Tridiagonal([fill(0.1, N - 2); 0.2], fill(0.8, N), [0.2; fill(0.1, N - 2)])
inv(A)
5×5 Matrix{Float64}:
  1.29099      -0.327957     0.0416667  -0.00537634   0.000672043
 -0.163978      1.31183     -0.166667    0.0215054   -0.00268817
  0.0208333    -0.166667     1.29167    -0.166667     0.0208333
 -0.00268817    0.0215054   -0.166667    1.31183     -0.163978
  0.000672043  -0.00537634   0.0416667  -0.327957     1.29099

Forming the Covariance and/or Gram Matrix

  • Another common example is \(A' A\)
A = sprand(20, 21, 0.3)
@show nnz(A) / 20^2
@show nnz(A' * A) / 21^2;
nnz(A) / 20 ^ 2 = 0.34
nnz(A' * A) / 21 ^ 2 = 0.9229024943310657

Specialized Algorithms

Besides sparsity and storage, the real loss is missing out on specialized algorithms. For example, let’s set up the benchmarking code

using BenchmarkTools
function benchmark_solve(A, b)
    println("A\\b for typeof(A) = $(string(typeof(A)))")
    @btime $A \ $b
end
benchmark_solve (generic function with 1 method)

Compare Dense vs. Sparse vs. Tridiagonal

N = 1000
b = rand(N)
A = Tridiagonal([fill(0.1, N - 2); 0.2], fill(0.8, N), [0.2; fill(0.1, N - 2)])
A_sparse = sparse(A)  # sparse but losing tridiagonal structure
A_dense = Array(A)    # dropping the sparsity structure, dense 1000x1000

# benchmark solution to system A x = b
benchmark_solve(A, b)
benchmark_solve(A_sparse, b)
benchmark_solve(A_dense, b);
A\b for typeof(A) = LinearAlgebra.Tridiagonal{Float64, Vector{Float64}}
  23.173 μs (20 allocations: 47.39 KiB)
A\b for typeof(A) = SparseArrays.SparseMatrixCSC{Float64, Int64}
  524.016 μs (95 allocations: 1.03 MiB)
A\b for typeof(A) = Matrix{Float64}
  18.144 ms (9 allocations: 7.64 MiB)

Triangular Matrices and Back/Forward Substitution

  • A key example of a better algorithm is for triangular matrices
  • Systems with upper or lower triangular matrices can be solved in \(\mathrm{O}(N^2)\) instead of \(\mathrm{O}(N^3)\)
b = [1.0, 2.0, 3.0]
U = UpperTriangular([1.0 2.0 3.0;
                     0.0 5.0 6.0;
                     0.0 0.0 9.0])
U \ b
3-element Vector{Float64}:
 0.0
 0.0
 0.3333333333333333

Backwards Substitution Example

\[ \begin{aligned} U x &= b\\ U &\equiv \begin{bmatrix} 3 & 1 \\ 0 & 2 \\ \end{bmatrix}, \quad b = \begin{bmatrix} 7 \\ 2 \\ \end{bmatrix} \end{aligned} \]

Solving bottom row for \(x_2\)

\[ 2 x_2 = 2,\quad x_2 = 1 \]

Move up a row, solving for \(x_1\), substituting for \(x_2\)

\[ 3 x_1 + 1 x_2 = 7,\quad 3 x_1 + 1 \times 1 = 7,\quad x_1 = 2 \]

Generalizes to many rows. For \(L\) it is “forward substitution”
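
A minimal sketch of the algorithm (a hypothetical helper, not the library routine behind \):

using LinearAlgebra
# backward substitution for U x = b, assuming U is upper triangular with a nonzero diagonal
function backward_substitution(U, b)
    N = length(b)
    x = zeros(N)
    for i in N:-1:1
        x[i] = (b[i] - dot(U[i, (i + 1):N], x[(i + 1):N])) / U[i, i]
    end
    return x
end
backward_substitution([3.0 1.0; 0.0 2.0], [7.0, 2.0])  # returns [2.0, 1.0], matching the example above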

Factorizations

Factorizing matrices

  • Just like you can factor \(6 = 2 \cdot 3\), you can factor matrices
  • Unlike integers, you have more choice over the properties of the factors
  • Many operations (e.g., solving systems of equations, finding eigenvalues, inverting, finding determinants) have a factorization done internally
    • Instead you can often just find the factorization and reuse it
  • Key factorizations: LU, QR, Cholesky, SVD, Schur, Eigenvalue

LU(P) Decompositions

  • Any square \(A\) can be “factored” into \(P A = L U\) for a permutation matrix \(P\) and triangular \(L\) and \(U\); this is the LU decomposition, where the “P” is for partial pivoting. Some matrices can be factored without pivoting as \(A = L U\)
  • Singular matrices may not have full-rank \(L\) or \(U\) matrices
N = 4
A = rand(N, N)
b = rand(N)
# chooses the right factorization based on matrix structure
# LU here
Af = factorize(A)
Af.P * A ≈ Af.L * Af.U
true

Using a Factorization

  • In Julia the factorization objects typically overload the \ operator and functions such as inv
@show Af \ b
@show inv(Af) * b
Af \ b = [1.567631093835083, -1.8670423474177864, -0.7020922312927874, 1.0653095651070625]
inv(Af) * b = [1.5676310938350828, -1.8670423474177873, -0.7020922312927873, 1.0653095651070625]
4-element Vector{Float64}:
  1.5676310938350828
 -1.8670423474177873
 -0.7020922312927873
  1.0653095651070625

LU Decompositions and Systems of Equations

  • Pivoting is typically implied when talking about “LU”
  • Used in the default solve algorithm (without more structure)
  • Solving systems of equations with triangular matrices: for \(A x = L U x = b\)
    1. Define \(y = U x\)
    2. Solve \(L y = b\) for \(y\) and \(U x = y\) for \(x\)
  • Since both solves involve triangular matrices, the process is \(\mathrm{O}(N^2)\) (though computing the LU factorization itself is \(\mathrm{O}(N^3)\)); see the sketch after this list
  • Could be used to find inv
    • \(A = L U\), then \(A A^{-1} = L U A^{-1} = I\)
    • Solve for \(Y\) in \(L Y = I\), then solve \(U A^{-1} = Y\)
  • Tight connection to textbook Gaussian elimination (including pivoting)
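
A sketch of the two-step solve (assuming only LinearAlgebra; for dense square matrices the \ operator does this internally):

using LinearAlgebra
A = rand(4, 4)
b = rand(4)
F = lu(A)            # P A = L U with partial pivoting, O(N^3)
y = F.L \ (F.P * b)  # forward substitution, O(N^2)
x = F.U \ y          # backward substitution, O(N^2)
norm(A * x - b)      # residual near machine precision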

Cholesky

  • LU is for general invertible matrices, but it doesn’t use positive-definiteness or symmetry
  • Cholesky is the right factorization for symmetric positive-definite matrices. For general symmetric (possibly indefinite) matrices you can use Bunch-Kaufman or others
  • \(A = L L'\) for lower triangular \(L\) or equivalent for upper triangular
N = 500
B = rand(N, N)
A_dense = B' * B  # an easy way to generate a symmetric positive semi-definite matrix
A = Symmetric(A_dense)  # flags the matrix as symmetric
println("A is symmetric? $(issymmetric(A))")
A is symmetric? true
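
As a quick check (a sketch), the factor from cholesky reproduces the matrix as \(A = L L'\):

C = cholesky(A)
norm(C.L * C.L' - A)  # near machine precision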

Comparing Cholesky

  • By default Julia doesn’t know the matrix is positive-definite, so factorize can only exploit the symmetric structure; calling cholesky directly is faster
b = rand(N)
factorize(A) |> typeof
cholesky(A) \ b  # use the factorization to solve

benchmark_solve(A, b)
benchmark_solve(A_dense, b)
@btime cholesky($A, check = false) \ $b;
A\b for typeof(A) = LinearAlgebra.Symmetric{Float64, Matrix{Float64}}
  1.965 ms (13 allocations: 2.16 MiB)
A\b for typeof(A) = Matrix{Float64}
  2.760 ms (9 allocations: 1.92 MiB)
  1.491 ms (6 allocations: 1.91 MiB)

Eigen Decomposition

  • For a square, symmetric matrix \(A\), factor into

\[ A = Q \Lambda Q^{-1} \]

  • \(Q\) is a matrix of eigenvectors, \(\Lambda\) is a diagonal matrix of the corresponding eigenvalues
  • For symmetric matrices, the eigenvectors are orthogonal and form an orthonormal basis, so \(Q^{-1} = Q^T\) and \(Q^T Q = Q^{-1} Q = I\)
  • Orthogonal matrices can be thought of as rotations without stretching
  • More general matrices all have a Singular Value Decomposition (SVD)
  • With symmetric \(A\), an interpretation of \(A x\) is that we can first rotate \(x\) into the \(Q\) basis, then stretch by \(\Lambda\), then rotate back

Calculating the Eigen Decomposition

A = Symmetric(rand(5, 5))  # symmetric matrices have real eigenvectors/eigenvalues
A_eig = eigen(A)
Λ = Diagonal(A_eig.values)
Q = A_eig.vectors
@show norm(Q * Λ * inv(Q) - A)
@show norm(Q * Λ * Q' - A);
norm(Q * Λ * inv(Q) - A) = 5.243912693681636e-15
norm(Q * Λ * Q' - A) = 6.347597591257379e-15

Eigendecompositions and Matrix Powers

  • Can be used to find \(A^t\) for large \(t\) (e.g. for Markov chains); see the sketch after this list
    • \(P^t\), i.e. \(P \cdot P \cdot \ldots \cdot P\) for \(t\) times
    • \(P = Q \Lambda Q^{-1}\) then \(P^t = Q \Lambda^t Q^{-1}\) where \(\Lambda^t\) is just the pointwise power
  • Related can find matrix exponential \(e^A\) for square matrices
    • \(e^A = Q e^\Lambda Q^{-1}\) where \(e^\Lambda\) is just the pointwise exponential
    • Useful for solving differential equations, e.g. \(y' = A y\) for \(y(0) = y_0\) is \(y(t) = e^{A t} y_0\)
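
A small sketch (with a hypothetical 2-state transition matrix) of computing \(P^t\) through the eigendecomposition:

using LinearAlgebra
P = [0.9 0.1; 0.2 0.8]  # hypothetical transition matrix
t = 50
E = eigen(P)
P_t = E.vectors * Diagonal(E.values .^ t) * inv(E.vectors)
norm(P_t - P^t)  # agrees with repeated multiplication up to roundoff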

More on Factorizations

  • Plenty more used in different circumstances. Start by looking at structure
  • Usually have some connection to textbook algorithms, for example LU corresponds to Gaussian elimination with pivoting and QR to the Gram-Schmidt process
  • Just as shortcuts can be done with sparse matrices in textbook examples, direct sparse methods can be faster given enough sparsity
    • But don’t assume sparsity will be faster. Often slower unless matrices are big and especially sparse
    • Dense algorithms on GPUs can be very fast because of parallelism
  • Keep in mind that barring numerical roundoff issues, these are “exact” methods. They don’t become more accurate with more iterations

Large Scale Systems of Equations

  • Packages that solve BIG problems with “direct methods” include MUMPS, Pardiso, UMFPACK, and many others
  • Sparse solvers are bread-and-butter scientific computing, so they can crush huge problems, parallelize on a cluster, etc.
  • But for smaller problems they may not be ideal. Profile and test, and only use them if you need to.
  • In Julia, the SciML package LinearSolve.jl is your best bet, as it lets you swap out backends to profile (see the sketch below)
  • In Python it is harder to switch between them, but SciPy has many built in and many wrappers exist; the same goes for MATLAB
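
A minimal sketch of the LinearSolve.jl workflow (assuming its LinearProblem/solve interface; the example matrix is made up, and available backends should be checked against the package docs):

using LinearAlgebra, SparseArrays, LinearSolve
A = sprand(1000, 1000, 0.01) + 10I  # assumed example: sparse and diagonally dominant
b = rand(1000)
prob = LinearProblem(A, b)
sol = solve(prob)  # default algorithm chosen from the problem structure
# sol = solve(prob, KLUFactorization())  # or swap in a specific sparse backend
sol.u  # the solution vector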

Preview of Conditioning

  • It will turn out that for iterative methods, a different style of algorithm, it is often necessary to multiply by a matrix to transform (precondition) the problem
  • The ideal transformation would be the matrix’s inverse, which requires a full factorization.
  • Instead, you can go only part of the way towards the factorization, e.g. partway through Gaussian elimination
  • Called “Incomplete Cholesky”, “Incomplete LU”, etc.

Continuous Time Markov Chains

Markov Chain Transitions in Continuous Time

  • With a discrete number of states in continuous time, we cannot have instantaneous transitions between states, or the process ceases to be well-defined (measurable)
  • Instead, define the intensity of switching from state \(i\) to \(j\) as \(q_{ij}\), where

\[ \mathbb P \{ X(t + \Delta) = j \,|\, X(t) = i \} = \begin{cases} q_{ij} \Delta + o(\Delta) & i \neq j\\ 1 + q_{ii} \Delta + o(\Delta) & i = j \end{cases} \]

  • Here \(o(\Delta)\) is little-o notation, i.e. \(\lim_{\Delta\to 0} o(\Delta)/\Delta = 0\).

Intensity Matrix

  • \(Q_{ij} = q_{ij}\) for \(i \neq j\) and \(Q_{ii} = -\sum_{j \neq i} q_{ij}\)
  • Rows sum to 0
  • For example, consider a counting process

\[ Q = \begin{bmatrix} -0.1 & 0.1 & 0 & 0 & 0 & 0\\ 0.1 &-0.2 & 0.1 & 0 & 0 & 0\\ 0 & 0.1 & -0.2 & 0.1 & 0 & 0\\ 0 & 0 & 0.1 & -0.2 & 0.1 & 0\\ 0 & 0 & 0 & 0.1 & -0.2 & 0.1\\ 0 & 0 & 0 & 0 & 0.1 & -0.1\\ \end{bmatrix} \]

Probability Dynamics

  • The \(Q\) is the infinitesimal generator of the stochastic process.

  • Let \(\pi(t) \in \mathbb{R}^N\) with \(\pi_i(t) \equiv \mathbb{P}[X_t = i\,|\,X_0]\)

  • Then the evolution of the probability distribution (the Kolmogorov forward equation, also called Fokker-Planck) is \[ \frac{d}{dt} \pi(t) = \pi(t) Q,\quad \text{ given }\pi(0) \]

  • Or, often written as \(\frac{d}{dt} \pi(t) = Q^{\top} \cdot \pi(t)\), i.e. in terms of the “adjoint” of the linear operator \(Q\)

  • A steady state is then a solution to \(Q^{\top} \cdot \bar{\pi} = 0\)

    • i.e., \(\bar{\pi}\) is the left eigenvector of \(Q\) associated with eigenvalue 0 (i.e. \(\bar{\pi} Q = 0 \cdot \bar{\pi}\)); a sketch of the distribution dynamics follows
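
A small sketch of these dynamics (with a hypothetical 3-state \(Q\), not the counting process below): the KFE solution is \(\pi(t) = e^{Q^{\top} t}\, \pi(0)\), computed with the matrix exponential.

using LinearAlgebra
Q = [-0.2 0.2 0.0;
     0.1 -0.3 0.2;
     0.0 0.1 -0.1]  # rows sum to zero
pi_0 = [1.0, 0.0, 0.0]  # start in the first state
pi_t = exp(Matrix(Q') * 50.0) * pi_0  # distribution at t = 50, near the stationary distribution
sum(pi_t)  # probabilities still sum to 1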

Setting up a Counting Process

alpha = 0.1
N = 6
Q = Tridiagonal(fill(alpha, N - 1),
                [-alpha; fill(-2*alpha, N - 2); -alpha],
                fill(alpha, N - 1))
Q
6×6 Tridiagonal{Float64, Vector{Float64}}:
 -0.1   0.1    ⋅     ⋅     ⋅     ⋅ 
  0.1  -0.2   0.1    ⋅     ⋅     ⋅ 
   ⋅    0.1  -0.2   0.1    ⋅     ⋅ 
   ⋅     ⋅    0.1  -0.2   0.1    ⋅ 
   ⋅     ⋅     ⋅    0.1  -0.2   0.1
   ⋅     ⋅     ⋅     ⋅    0.1  -0.1

Finding the Stationary Distribution

  • There will always be at least one eigenvalue of 0, and the corresponding eigenvector (normalized to sum to one) is the stationary distribution
  • Teaser: Do we really need all of the eigenvectors/eigenvalues?
Lambda, vecs = eigen(Array(Q'))
@show Lambda
vecs[:, N] ./ sum(vecs[:, N])
Lambda = [-0.3732050807568874, -0.29999999999999993, -0.19999999999999998, -0.09999999999999995, -0.026794919243112274, 0.0]
6-element Vector{Float64}:
 0.16666666666666657
 0.16666666666666657
 0.1666666666666667
 0.16666666666666682
 0.16666666666666685
 0.16666666666666663

Using the Generator in a Bellman Equation

  • Let \(r \in \mathbb{R}^N\) be a vector of payoffs in each state, and \(\rho > 0\) a discount rate,
  • Then we can use the generator \(Q\) in a simple Bellman equation (via the Kolmogorov backward equation) to find the value \(v\) of being in each state,

\[ \rho v = r + Q v \]

  • Rearranging, \((\rho I - Q) v = r\)
  • Teaser: can we just implement \((\rho I - Q)\cdot v\) and avoid factorizing the matrix?

Implementing the Bellman Equation

rho = 0.05
r = range(0.0, 10.0, length=N)
@show typeof(rho*I - Q)

# solve (rho * I - Q) v = r
v = (rho * I - Q) \ r
typeof(rho * I - Q) = LinearAlgebra.Tridiagonal{Float64, Vector{Float64}}
6-element Vector{Float64}:
  38.15384615384615
  57.23076923076923
  84.92307692307693
 115.07692307692311
 142.76923076923077
 161.84615384615384