Elliptic solvers

The elliptic problem for the pressure

The 3D non-hydrostatic pressure field is obtained by taking the divergence of the horizontal component of the momentum equations and invoking the vertical component to yield an elliptic Poisson equation for the non-hydrostatic kinematic pressure

along with homogenous Neumann boundary conditions (Neumann on for wall-bounded directions and periodic otherwise) and where denotes the source term for the Poisson equation.

Dividing small timesteps

In practice, in order to avoid division by extremely small numbers when , we solve the Poisson equation for instead. is then removed from the pressure field after the elliptic solve routine.

Hydrostatic approximation

For problems in which the hydrostatic approximation is invoked, the Poisson equation for pressure above only needs to be solved for the vertically integrated flow and the pressure field is a two dimensional term .

Direct method

Discretizing elliptic problems that can be solved via a classical separation-of-variables approach, such as Poisson's equation, results in a linear system of equations where is a real symmetric matrix of block tridiagonal form. This allows for the matrix to be decomposed and solved efficiently, provided that the eigenvalues and eigenvectors of the blocks are known (§2) (Buzbee et al., 1970). In the case of Poisson's equation on a rectangle, Hockney (1965) has taken advantage of the fact that the fast Fourier transform can be used to perform the matrix multiplication steps resulting in an even more efficient method. Schumann and Sweet (1988) describe the implementation of such an algorithm for Poisson's equation on a staggered grid with Dirichlet, Neumann, and periodic boundary conditions.

The method can be explained easily by taking the Fourier transform of both sides of to yield

where denotes the Fourier component. Here , , and are the wavenumbers. However, when solving the equation on a staggered grid we require a solution for that is second-order accurate such that when when its Laplacian is computed, matches to machine precision. This is crucial to ensure that the projection step in the fractional time-step works (see Time-stepping section and Fractional step method appendix). To do this, the wavenumbers are replaced by eigenvalues , , and satisfying the discrete form of Poisson's equation with appropriate boundary conditions. Thus, Poisson's equation is diagonalized in Fourier space and the Fourier coefficients of the solution are easily solved for

The eigenvalues are given by Schumann and Sweet (1988) and can also be tediously derived by plugging in the definition of the discrete Fourier transform into :

where and correspond to periodic boundary conditions in the horizontal and to Neumann boundary conditions in the vertical.

There is also an ambiguity in the solution to Poisson's equation as it's only defined up to a constant. To resolve this ambiguity we choose the solution with zero mean by setting the zeroth Fourier coefficient (corresponding to ) to zero. This also has the added benefit of discarding the zero eigenvalue so we don't divide by it.

The Fast Fourier transforms are computed using FFTW.jl [(Frigo and Johnson, 1998) and (Frigo and Johnson, 2005)] on the CPU and using the cuFFT library on the GPU. Along wall-bounded dimensions, the cosine transform is used. In particular, as the transforms are performed on a staggered grid, DCT-II (REDFT10) is used to perform the forward cosine transform and DCT-III (REDFT01) is used to perform the inverse cosine transform.

Direct method with a vertically stretched grid

Using Fourier transforms for all three dimensions results in a method requiring operations where is the total number of grid points. This algorithm can be made even more efficient by solving a tridiagonal system along one of the dimensions and utilizing cyclic reduction. This results in the Fourier analysis cyclic reduction or algorithm (with cyclic reduction steps) which requires only operations provided the optimal number of cyclic reduction steps is taken, which is where is the number of grid points in the cyclic reduction dimension. The FACR algorithm was first developed by Hockney (1969) and is well reviewed by Swarztrauber (1977) then further benchmarked and extended by Temperton (1979) and Temperton (1980).

Furthermore, the FACR algorithm removes the restriction that the grid is uniform in one of the dimensions so it can be utilized to implement a fast Poisson solver for vertically stretched grids if the cyclic reduction is applied in the along the vertical dimension.

Expanding and into Fourier modes along the and directions

and recalling that Fourier transforms do and we can write as

Discretizing the derivative and equating the term inside the brackets to zero we arrive at symmetric tridiagonal systems of linear equations for the Fourier modes:

Cosine transforms on the GPU

Unfortunately cuFFT does not provide cosine transforms and so we must write our own fast cosine transforms for the GPU. We implemented the fast 1D and 2D cosine transforms described by Makhoul (1980) which compute it by applying the regular Fourier transform to a permuted version of the array.

In this section we will be using the DCT-II as the definition of the forward cosine transform for a real signal of length

and the DCT-III as the definition of the inverse cosine transform

and will use to denote the root of unity, sometimes called the twiddle factors in the context of FFT algorithms.

1D fast cosine transform

To calculate using the fast Fourier transform, we first permute the input signal along the appropriate dimension by ordering the odd elements first followed by the even elements to produce a permuted signal

where indicates the integer part of . This should produce, for example,

after which is computed using

1D fast inverse cosine transform

The inverse can be computed using

after which the inverse permutation of must be applied.

2D fast cosine transform

Unfortunately, the 1D algorithm cannot be applied dimension-wise so the 2D algorithm is more complicated. Thankfully, the permutation can be applied dimension-wise. The forward cosine transform for a real signal of length is then given by

where and indicates that is indexed in reverse.

2D fast inverse cosine transform

The inverse can be computed using

where here, is indexed in reverse along the first dimension, along the second dimension, and along both. and are masks of lengths and respectively, both containing ones except at the first element where . Afterwards, the inverse permutation of must be applied.

Due to the extra steps involved in calculating the cosine transform in 2D, running with two wall-bounded dimensions typically slows the model down by a factor of 2. Switching to the FACR algorithm may help here as a 2D cosine transform won't be necessary anymore.

Iterative Solvers

For problems with irregular grids the eigenvectors of the discrete Poisson operator are no longer simple Fourier series sines and cosines. This means discrete Fast Fourier Transforms can't be used to generate the projection of the equation right hand side onto eigenvectors. So an eigenvector based approach to solving the Poisson equation is not computationally efficient.

For problems with grids that are non uniform in multiple directions, we use instead a pre-conditioned conjugate gradient iterative solver. Such cases include curvilinear grids on the sphere and also telescoping cartesian grids that stretch along more than one dimension. There are two forms of the pressure operator in this approach. One is rigid lid form and one is an implicit free-surface form.

Rigid lid pressure operator

The rigid lid operator is based on the same continuous form as is used in the Direct Method solver.

Implicit free surface pressure operator

The implicit free surface solver solves for the free-surface, , in the vertically integrated continuity equation:

where is the depth of the water column (to first order with respect to the free surface elevation) and is some surface volume flux (e.g., terms such as precipitation, evaporation and runoff); currently Oceananigans.jl assumes . Note that in deriving , we used the bottom boundary condition .

To form a linear system that can be solved implicitly we recast the vertically-integrated continuity equation into a discrete integral form. The best way to do so is by starting from the discrete version of the continuity equation (in this case without any surface volume flux, )

and summing it vertically to get:

In equations and and here after, we have abused notation and used, e.g., and to denote the volume averages over grid cells of the quantities and respectively. Using and being a bit more explicit on the locations the difference operators act on, becomes:

We can now apply the velocity fractional step equation (discussed in the Time-stepping section) for the hydrostatic model:

We impose that the -th time step velocity is consistent with

Substituting and from the discrete form of the right-hand-side of then gives us an implicit equation for :

The left-hand-side of is nothing else than a linear operator acting on . Formulated in this way, the linear operator is symmetric and therefore can be solved using a preconditioned conjugate gradient algorithm.

`NonhydrostaticModel`

`HydrostaticFreeSurfaceModel`

`ShallowWaterModel`

Elliptic solvers

The elliptic problem for the pressure

Direct method

Direct method with a vertically stretched grid

Cosine transforms on the GPU

1D fast cosine transform

1D fast inverse cosine transform

2D fast cosine transform

2D fast inverse cosine transform

Iterative Solvers

Rigid lid pressure operator

Implicit free surface pressure operator

Elliptic solvers ​

The elliptic problem for the pressure ​

Direct method ​

Direct method with a vertically stretched grid ​

Cosine transforms on the GPU ​

1D fast cosine transform ​

1D fast inverse cosine transform ​

2D fast cosine transform ​

2D fast inverse cosine transform ​

Iterative Solvers ​

Rigid lid pressure operator ​

Implicit free surface pressure operator ​

Elliptic solvers

The elliptic problem for the pressure

Direct method

Direct method with a vertically stretched grid

Cosine transforms on the GPU

1D fast cosine transform

1D fast inverse cosine transform

2D fast cosine transform

2D fast inverse cosine transform

Iterative Solvers

Rigid lid pressure operator

Implicit free surface pressure operator