Conjugate Gradient
In this tutorial we describe the basics of using the conjugate gradient iterative solvers At the end you should be able to
- Use Conjugate Gradient to solve a linear system
- Know when to not use it
- Contruct a column-wise linear solver with Conjugate Gradient
What is it?
Conjugate Gradient is an iterative method for solving special kinds of linear systems:
\[ Ax = b\]
via iterative methods.
The linear operator need to be symmetric positive definite and the preconditioner must be symmetric.
See the wikipedia for more details.
Basic Example
First we must load a few things
using ClimateMachine
using ClimateMachine.SystemSolvers
using LinearAlgebra, Random
Next we define a 3x3 symmetric positive definite linear system. (In the ClimateMachine code a symmetric positive definite system could arise from treating diffusion implicitly.)
A = [
2.0 -1.0 0.0
-1.0 2.0 -1.0
0.0 -1.0 2.0
];
We define the matrix A
here as a global variable for convenience later.
We can see that it is symmetric. We can check that it is positive definite by checking the spectrum
eigvals(A)
3-element Array{Float64,1}:
0.585786437626905
1.9999999999999998
3.414213562373095
The linear operators that are passed into the abstract iterative solvers need to be defined as functions that act on vectors. Let us do that with our matrix. We are using function closures for type stability.
function closure_linear_operator!(A)
function linear_operator!(x, y)
mul!(x, A, y)
end
return linear_operator!
end;
We now define our linear operator using the function closure
linear_operator! = closure_linear_operator!(A)
linear_operator! (generic function with 1 method)
We now define our b
in the linear system
b = ones(typeof(1.0), 3);
The exact solution to the system Ax = b
is
x_exact = [1.5, 2.0, 1.5];
Now we can set up the ConjugateGradient struct
linearsolver = ConjugateGradient(b);
and an initial guess for the iterative solver.
x = ones(typeof(1.0), 3);
To solve the linear system we just need to pass to the linearsolve! function
iters = linearsolve!(linear_operator!, nothing, linearsolver, x, b)
2
The variable x
gets overwritten during the linear solve The norm of the error is
norm(x - x_exact) / norm(x_exact)
0.0
The relative norm of the residual is
norm(A * x - b) / norm(b)
0.0
The number of iterations is
iters
2
Conjugate Gradient is guaranteed to converge in 3 iterations with perfect arithmetic in this case.
Non-Example
Conjugate Gradient is not guaranteed to converge with nonsymmetric matrices. Consider
A = [
2.0 -1.0 0.0
0.0 2.0 -1.0
0.0 0.0 2.0
];
We define the matrix A
here as a global variable for convenience later.
We can see that it is not symmetric, but it does have all positive eigenvalues
eigvals(A)
3-element Array{Float64,1}:
2.0
2.0
2.0
The linear operators that are passed into the abstract iterative solvers need to be defined as functions that act on vectors. Let us do that with our matrix. We are using function closures for type stability.
function closure_linear_operator!(A)
function linear_operator!(x, y)
mul!(x, A, y)
end
return linear_operator!
end;
We define the linear operator using our function closure
linear_operator! = closure_linear_operator!(A)
linear_operator! (generic function with 1 method)
We now define our b
in the linear system
b = ones(typeof(1.0), 3);
The exact solution to the system Ax = b
is
x_exact = [0.875, 0.75, 0.5];
Now we can set up the ConjugateGradient struct
linearsolver = ConjugateGradient(b, max_iter = 100);
We also passed in the keyword argument "max_iter" for the maximum number of iterations of the iterative solver. By default it is assumed to be the size of the vector. As before we need to define an initial guess
x = ones(typeof(1.0), 3);
To (not) solve the linear system we just need to pass to the linearsolve! function
iters = linearsolve!(linear_operator!, nothing, linearsolver, x, b)
100
The variable x
gets overwitten during the linear solve The norm of the error is
norm(x - x_exact) / norm(x_exact)
1.0709260403075331
The relative norm of the residual is
norm(A * x - b) / norm(b)
1.3306597079172395
The number of iterations is
iters
100
Conjugate Gradient is guaranteed to converge in 3 iterations with perfect arithmetic for a symmetric positive definite matrix. Here we see that the matrix is not symmetric and it didn't converge even after 100 iterations.
More Complex Example
Here we show how to construct a column-wise iterative solver similar to what is is in the ClimateMachine code. The following is not for the faint of heart. We must first define a linear operator that acts like one in the ClimateMachine
function closure_linear_operator!(A, tup)
function linear_operator!(y, x)
alias_x = reshape(x, tup)
alias_y = reshape(y, tup)
for i6 in 1:tup[6]
for i4 in 1:tup[4]
for i2 in 1:tup[2]
for i1 in 1:tup[1]
tmp = alias_x[i1, i2, :, i4, :, i6][:]
tmp2 = A[i1, i2, i4, i6] * tmp
alias_y[i1, i2, :, i4, :, i6] .=
reshape(tmp2, (tup[3], tup[5]))
end
end
end
end
end
end;
Now that we have this function, we can define a linear system that we will solve columnwise First we define the structure of our array as tup
in a manner that is similar to a stacked brick topology
tup = (3, 4, 7, 2, 20, 2);
where
- tup[1] is the number of Gauss–Lobatto points in the x-direction
- tup[2] is the number of Gauss–Lobatto points in the y-direction
- tup[3] is the number of Gauss–Lobatto points in the z-direction
- tup[4] is the number of states
- tup[5] is the number of elements in the vertical direction
- tup[6] is the number of elements in the other directions
Now we define our linear operator as a random matrix.
Random.seed!(1235);
B = [
randn(tup[3] * tup[5], tup[3] * tup[5])
for i1 in 1:tup[1], i2 in 1:tup[2], i4 in 1:tup[4], i6 in 1:tup[6]
];
columnwise_A = [
B[i1, i2, i4, i6] * B[i1, i2, i4, i6]' + 10I
for i1 in 1:tup[1], i2 in 1:tup[2], i4 in 1:tup[4], i6 in 1:tup[6]
];
columnwise_inv_A = [
inv(columnwise_A[i1, i2, i4, i6])
for i1 in 1:tup[1], i2 in 1:tup[2], i4 in 1:tup[4], i6 in 1:tup[6]
];
columnwise_linear_operator! = closure_linear_operator!(columnwise_A, tup);
columnwise_inverse_linear_operator! =
closure_linear_operator!(columnwise_inv_A, tup);
We define our x
and b
with matrix structures similar to an MPIStateArray
mpi_tup = (tup[1] * tup[2] * tup[3], tup[4], tup[5] * tup[6]);
b = randn(mpi_tup);
x = randn(mpi_tup);
Now we solve the linear system columnwise
linearsolver = ConjugateGradient(
x,
max_iter = tup[3] * tup[5],
dims = (3, 5),
reshape_tuple = tup,
);
The keyword arguments dims is the reduction dimension for the linear solver. In this case dims = (3,5) are the ones associated with a column. The reshape_tuple argument is to convert the shapes of the array x
in the a form that is more easily usable for reductions in the linear solver
Now we can solve it
iters = linearsolve!(columnwise_linear_operator!, nothing, linearsolver, x, b);
x_exact = copy(x);
columnwise_inverse_linear_operator!(x_exact, b);
The norm of the error is
norm(x - x_exact) / norm(x_exact)
6.761055500258119e-14
The number of iterations is
iters
122
The algorithm converges within tup[3]*tup[5] = 140
iterations
Tips
- The convergence criteria should be changed, machine precision is too small and the maximum iterations is often too large
- Use a preconditioner if possible
- Make sure that the linear system really is symmetric and positive-definite
This page was generated using Literate.jl.