Newton's Method

This page documents the nonlinear solver infrastructure used by the implicit stages of IMEX and Rosenbrock methods. For the mathematical background, see the Newton's method formulation.

Newton Solver

NewtonsMethod is the top-level type passed to IMEXAlgorithm. It controls how many iterations to run, when to recompute the Jacobian, whether to use a Krylov linear solver, and optional convergence checking and line search.

ClimaTimeSteppers.NewtonsMethodType
NewtonsMethod(;
    max_iters = 1,
    update_j = UpdateEvery(NewNewtonIteration),
    krylov_method = nothing,
    convergence_checker = nothing,
    verbose = Silent(),
    line_search = false,
)

Solve f(x) = 0 by iterating x[n+1] = x[n] - j(x[n]) \ f(x[n]), where j(x) = f'(x) is the Jacobian. Called via solve_newton!(method, cache, x, f!, j!, prepare_for_f!); x is modified in-place from its initial guess.
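The iteration above can be sketched in a few lines. This is an illustrative Python sketch of the algorithm only, not the ClimaTimeSteppers API (which works in place via solve_newton!):

```python
import numpy as np

def solve_newton(f, j, x, max_iters=1):
    """Iterate x[n+1] = x[n] - j(x[n]) \\ f(x[n]) for at most max_iters steps."""
    for _ in range(max_iters):
        x = x - np.linalg.solve(j(x), f(x))  # direct "ldiv!"-style linear solve
    return x

# Example system: x1^2 + x2^2 = 2 and x1 = x2, with root at [1, 1]
f = lambda x: np.array([x[0] ** 2 + x[1] ** 2 - 2.0, x[0] - x[1]])
j = lambda x: np.array([[2.0 * x[0], 2.0 * x[1]], [1.0, -1.0]])
root = solve_newton(f, j, np.array([2.0, 0.5]), max_iters=20)
```

With an exact Jacobian and direct solve, convergence is quadratic near the root, which is why max_iters = 1 can suffice inside an implicit time step with a good initial guess.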

Keyword Arguments

  • max_iters: maximum Newton iterations (default 1)
  • update_j: UpdateSignalHandler controlling when the Jacobian is recomputed (see Jacobian update strategies below)
  • krylov_method: a KrylovMethod to solve the linear system approximately. If nothing, uses direct ldiv! (see Krylov variant below)
  • convergence_checker: a ConvergenceChecker that can terminate early based on x[n] and Δx[n]. Without one, Newton always runs max_iters iterations; with one, a warning is printed if convergence has not been reached after max_iters iterations.
  • verbose: Verbose() to print ‖x‖ and ‖Δx‖ each iteration
  • line_search: if true, applies backtracking (halving up to 5×) when the residual norm does not decrease or becomes NaN

Jacobian update strategies

The update_j parameter accepts any UpdateSignalHandler:

  • UpdateEvery(NewNewtonIteration) — fresh Jacobian every iteration (default)
  • UpdateEvery(NewNewtonSolve) — reuse across iterations within one solve (the chord method: j(x[n]) ≈ j(x₀))
  • UpdateEvery(NewTimeStep) — reuse across solves within a timestep

External signals can also update the Jacobian between solves via update!(method, cache, signal, j!).
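Freezing the Jacobian within one solve, as UpdateEvery(NewNewtonSolve) does, gives the chord method. A minimal Python sketch (illustrative, not the package API) showing the Jacobian evaluated once and reused:

```python
import numpy as np

def solve_chord(f, j, x, max_iters):
    """Newton with a frozen Jacobian: j is evaluated (and could be factored)
    only once per solve; convergence drops from quadratic to linear."""
    j0 = j(x)  # reused on every iteration, as with UpdateEvery(NewNewtonSolve)
    for _ in range(max_iters):
        x = x - np.linalg.solve(j0, f(x))
    return x

# Scalar example: f(x) = x^2 - 2, with the slope frozen at j0 = 2 * 1.5 = 3
f = lambda x: np.array([x[0] ** 2 - 2.0])
j = lambda x: np.array([[2.0 * x[0]]])
root = solve_chord(f, j, np.array([1.5]), max_iters=30)
```

The trade-off is exactly the one described above: each iteration is cheaper (no Jacobian recomputation or refactorization), but more iterations are needed.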

Krylov variant

When krylov_method is set, Δx[n] is computed approximately — this is a Newton-Krylov method. If the Krylov method additionally uses a Jacobian-free JVP (see ForwardDiffJVP), neither j_prototype nor j! need to be specified (Jacobian-free Newton-Krylov). When both a JVP and j are provided, j serves as a left preconditioner.

Notes on j_prototype

j_prototype should support ldiv! directly (e.g., a pre-factorized matrix or LinearOperator). Dense matrices are accepted for convenience but trigger an lu factorization on every solve — suitable only for testing. Note that Krylov.jl does not support dense-matrix preconditioners; when using a Jacobian-free JVP, j_prototype must be ldiv!-compatible.

source
ClimaTimeSteppers.LineSearchType
LineSearch(; norm = norm)

Applies a simple backtracking line search to a Newton update.

The line search assumes that x has already been updated by the full Newton step x ← x - Δx, and that f initially contains f(x_old). The full Newton step is evaluated first. If the residual norm does not decrease or becomes non-finite, the step is repeatedly reduced by moving x back toward the previous iterate along the Newton direction. At most five halvings are performed.

The residual norm is measured using norm(f), which can be customized by providing an alternative norm function.

The update is performed in-place and the final (x, f) correspond to the last step tested.
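The backtracking logic described above can be sketched as follows (illustrative names, not the package API): the full step is evaluated first, then the step is halved at most five times while the residual norm fails to decrease or is non-finite.

```python
import numpy as np

def backtracking(f, x, dx, f_old_norm, max_halvings=5):
    """x already holds the full Newton step x_old - dx; f_old_norm = ||f(x_old)||."""
    step = 1.0
    for _ in range(max_halvings):
        r = np.linalg.norm(f(x))
        if np.isfinite(r) and r < f_old_norm:
            break  # accept the current step
        x = x + (step / 2.0) * dx  # move halfway back toward x_old
        step /= 2.0
    return x

# arctan is a classic case where the full Newton step badly overshoots
f = lambda x: np.arctan(x)
dx = np.arctan(3.0) * (1.0 + 3.0 ** 2)  # full Newton step from x_old = 3
x_new = backtracking(f, np.array([3.0 - dx]), np.array([dx]), np.arctan(3.0))
```

Note that, as stated above, the final (x, f) correspond to the last step tested, even if no halving produced a residual decrease.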

source

Krylov Linear Solver

When the Jacobian is too expensive to form or factor, a Krylov method (e.g. GMRES) can solve the linear system approximately. The accuracy is controlled by a forcing term that sets the Krylov tolerance.

ClimaTimeSteppers.KrylovMethodType
KrylovMethod(;
    type = Val(Krylov.GmresSolver),
    jacobian_free_jvp = nothing,
    forcing_term = ConstantForcing(0),
    args = (20,),
    kwargs = (;),
    solve_kwargs = (;),
    disable_preconditioner = false,
    verbose = Silent(),
    debugger = nothing,
)

Iterative linear solver for Newton's method: finds Δx[n] such that ‖f(x[n]) - j(x[n]) * Δx[n]‖ ≤ rtol[n] * ‖f(x[n])‖, where rtol[n] is controlled by the ForcingTerm. Called via solve_krylov!(method, cache, Δx, x, f!, f, n, prepare_for_f!, j = nothing); Δx is modified in-place. Allocate cache with allocate_cache(method, x_prototype).

This is a wrapper around Krylov.jl solvers. By default, GMRES is used with a Krylov subspace of size 20.

Keyword Arguments

  • type: Krylov solver type, wrapped in Val (default Val(Krylov.GmresSolver))
  • jacobian_free_jvp: a JacobianFreeJVP for matrix-free operation (default nothing → uses j directly)
  • forcing_term: a ForcingTerm setting rtol[n] (default ConstantForcing(0) → exact solve)
  • args, kwargs: forwarded to the Krylov.KrylovSolver constructor
  • solve_kwargs: forwarded to Krylov.solve!
  • disable_preconditioner: if true, skip preconditioning even when j is available (default false)
  • verbose: Verbose() to print the Krylov residual each iteration
  • debugger: a KrylovMethodDebugger run before each Krylov solve

Operator construction

The solver operates on a LinearOperator opj representing j(x[n]):

  • With jacobian_free_jvp: opj evaluates mul!(jΔx, opj, Δx) via the JVP (e.g., finite-difference or AD), so no explicit Jacobian is needed.
  • Without: opj wraps j directly, so mul! reduces to mul!(jΔx, j, Δx).

Preconditioning

When both a jacobian_free_jvp and an explicit j are provided (and disable_preconditioner is false), j is used as a left preconditioner M. The solver calls ldiv!(Δx′, M, f′) (not mul!), so M is treated as an approximation of j rather than as its inverse. If either j or the JVP is missing, or preconditioning is disabled, M = I.
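As a rough numerical illustration of why a left preconditioner helps (toy matrices, not the package's Jacobians): with even a diagonal approximation M of j, the effective operator M⁻¹ j seen by the Krylov solver can be far better conditioned than j itself. This is the quantity PrintConditionNumber reports.

```python
import numpy as np

n = 50
# A stiff, nearly diagonal "Jacobian" whose entries span four orders of magnitude
J = np.diag(np.linspace(1.0, 1.0e4, n)) + 0.01 * np.ones((n, n))
M = np.diag(np.diag(J))  # diagonal approximation of J, applied as in ldiv!

cond_raw = np.linalg.cond(J)                      # what an unpreconditioned solver sees
cond_pre = np.linalg.cond(np.linalg.solve(M, J))  # effective operator M^-1 J
```

Krylov iteration counts grow with the (effective) condition number, so a cheap approximation of j applied via ldiv! can dramatically reduce the work per Newton iteration.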

Tolerances

atol is fixed to 0 so the convergence criterion remains purely relative: ‖r‖ ≤ rtol * ‖f‖. A nonzero atol would add a constant floor that prevents the forcing term from driving the residual to zero, breaking the convergence guarantees of the Newton-Krylov method.

Extensibility

All constructor and solver arguments can be overridden via args, kwargs, and solve_kwargs, so any Krylov.jl feature not explicitly covered by this wrapper remains accessible.

source
ClimaTimeSteppers.ConstantForcingType
ConstantForcing(rtol)

A ForcingTerm that returns the fixed value rtol ∈ [0, 1) on every Newton iteration.

Convergence properties

  • rtol > 0: linear convergence with asymptotic rate ≤ rtol
  • rtol = 0: exact Krylov solve → quadratic Newton convergence

Smaller rtol gives faster asymptotic convergence but increases the risk of oversolving (spending Krylov iterations on accuracy that Newton discards).

source
ClimaTimeSteppers.EisenstatWalkerForcingType
EisenstatWalkerForcing(;
    initial_rtol = 0.5,
    γ = 1,
    α = 2,
    min_rtol_threshold = 0.1,
    max_rtol = 0.9,
)

Adaptive ForcingTerm ("Choice 2" from Eisenstat & Walker, 1996) that automatically tightens rtol[n] as ‖f(x[n])‖ decreases, balancing convergence speed against oversolving risk.

Keyword Arguments

  • initial_rtol ∈ [0, 1): tolerance for the first Newton iteration
  • γ ∈ [0, 1]: scaling factor for the tolerance update
  • α ∈ (1, 2]: convergence-order exponent — larger means faster convergence but higher oversolving risk
  • min_rtol_threshold ∈ [0, 1): safeguard against tolerance decreasing too quickly
  • max_rtol ∈ [0, 1): upper bound on rtol[n]

Notes

This is "Choice 2" (not "Choice 1") because it only requires ‖f(x[n])‖ to compute rtol[n], whereas "Choice 1" also needs the final Krylov residual.
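A hedged Python sketch of the "Choice 2" update for n ≥ 1 (the safeguard shown is the standard one from the paper; the exact safeguards in ClimaTimeSteppers may differ in detail). Defaults mirror the keyword arguments above:

```python
def eisenstat_walker_rtol(f_norm, f_norm_prev, rtol_prev,
                          gamma=1.0, alpha=2.0,
                          min_rtol_threshold=0.1, max_rtol=0.9):
    rtol = gamma * (f_norm / f_norm_prev) ** alpha  # "Choice 2" candidate
    safeguard = gamma * rtol_prev ** alpha          # keep rtol from collapsing too fast
    if safeguard > min_rtol_threshold:
        rtol = max(rtol, safeguard)
    return min(rtol, max_rtol)                      # never exceed max_rtol

stalled = eisenstat_walker_rtol(1.0, 1.0, 0.5)  # no residual decrease -> loose rtol
fast = eisenstat_walker_rtol(0.1, 1.0, 0.01)    # 10x residual drop -> tight rtol
```

When Newton is making little progress, a loose Krylov tolerance avoids oversolving; as ‖f(x[n])‖ drops, the tolerance tightens to preserve fast asymptotic convergence.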

source

Jacobian-Free Newton-Krylov (JFNK)

When combined with a Jacobian-free JVP, the Newton-Krylov method never forms or stores the Jacobian matrix; it only needs the action j(x) * Δx, approximated via finite differences.

ClimaTimeSteppers.JacobianFreeJVPType
JacobianFreeJVP

Abstract type for matrix-free Jacobian-vector product strategies. Subtypes compute j(x) * Δx without forming j explicitly, using only function evaluations of f. Called via jvp!(method, cache, jΔx, Δx, x, f!, f, prepare_for_f!); jΔx is modified in-place. Allocate cache with allocate_cache(method, x_prototype).

See ForwardDiffJVP.

source
ClimaTimeSteppers.ForwardDiffJVPType
ForwardDiffJVP(; default_step = ForwardDiffStepSize3(), step_adjustment = 1)

A JacobianFreeJVP that approximates the Jacobian-vector product via first-order forward differences:

`j(x) * Δx ≈ (f(x + ε Δx) - f(x)) / ε`,

where ε = step_adjustment * default_step(Δx, x).
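The formula translates directly into a matrix-free product costing one extra evaluation of f. This sketch (illustrative Python, not the package API) uses the simple ε = step_adjustment * √eps(FT) / ‖Δx‖ of ForwardDiffStepSize1 for clarity:

```python
import numpy as np

def forward_diff_jvp(f, x, dx, step_adjustment=1.0):
    """Approximate j(x) * dx without forming j, via forward differences."""
    eps_fd = step_adjustment * np.sqrt(np.finfo(x.dtype).eps) / np.linalg.norm(dx)
    return (f(x + eps_fd * dx) - f(x)) / eps_fd

# f(x) = [x0^2, x0 * x1] has Jacobian [[2 x0, 0], [x1, x0]];
# at x = [1, 2] with dx = [1, 1], the exact product is j(x) * dx = [2, 3]
f = lambda x: np.array([x[0] ** 2, x[0] * x[1]])
jvp = forward_diff_jvp(f, np.array([1.0, 2.0]), np.array([1.0, 1.0]))
```

In a real solver, f(x) is already available from the current Newton iteration, so each JVP costs a single additional function evaluation.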

Keyword Arguments

  • default_step: a ForwardDiffStepSize that computes the baseline step ε from Δx and x (default ForwardDiffStepSize3())
  • step_adjustment: scalar multiplier applied to default_step (default 1)

source
ClimaTimeSteppers.ForwardDiffStepSize1Type
ForwardDiffStepSize1()

A ForwardDiffStepSize derived from a truncation-vs-roundoff error analysis of the forward difference approximation j(x) * Δx ≈ (f(x + ε Δx) - f(x)) / ε. Not commonly used with Newton-Krylov methods in practice, but provides intuition for setting step_adjustment in a ForwardDiffJVP.

Reference: Oregon State roundoff/truncation notes.

Result

The optimal step size that minimizes the error upper bound is

`ε = step_adjustment * sqrt(eps(FT)) / ‖Δx‖`,

where step_adjustment = 2 * sqrt(S * R) (default 1). Increase step_adjustment when f is very smooth (S ≫ 1) or has large roundoff (R ≫ 1). For a central difference approximation, the sqrt becomes a cube root (generally, an n-th root for order n - 1).

Derivation

Forward difference error decomposition

The first-order Taylor expansion of f(x + ε Δx) around x is

`f(x + ε Δx) = f(x) + j(x)(ε Δx) + e_trunc(x, ε Δx)`,

where j(x) = f'(x). In floating point we can only evaluate f̂(x), with

`f(x) = f̂(x) + e_round(x)`.

Substituting and rearranging gives the approximation error

`‖error‖ = ‖e_trunc(x, ε Δx) - e_round(x + ε Δx) + e_round(x)‖ / ε`.

Applying the triangle inequality and approximating ‖e_round(x + ε Δx)‖ ≈ ‖e_round(x)‖ for small ε:

`‖error‖ ≤ (‖e_trunc(x, ε Δx)‖ + 2 ‖e_round(x)‖) / ε`.

Bounding truncation error

From Taylor's theorem for multivariate vector-valued functions (proof):

`‖e_trunc(x, ε Δx)‖ ≤ sup_{x̂ ∈ X} ‖f''(x̂)‖ / 2 · ‖ε Δx‖²`.

Defining the smoothness parameter S = ‖f(x)‖ / sup ‖f''(x̂)‖ (default S ≈ 1; larger values indicate a small Hessian relative to f):

`‖e_trunc(x, ε Δx)‖ ≤ ε² / (2S) · ‖Δx‖² · ‖f(x)‖`.

Bounding roundoff error

Assuming componentwise roundoff |e_round(x)[i]| ≤ R · eps(FT) · |f(x)[i]| (default R ≈ 1):

`‖e_round(x)‖ ≤ R · eps(FT) · ‖f(x)‖`.

Optimal step size

Substituting both bounds into the error bound:

`‖error‖ ≤ ε/(2S) · ‖Δx‖² · ‖f(x)‖ + 2R · eps(FT) · ‖f(x)‖ / ε`.

Minimizing over ε (set derivative to zero) gives ε = 2√(SR) · √eps(FT) / ‖Δx‖, i.e., step_adjustment = 2√(SR).

source
ClimaTimeSteppers.ForwardDiffStepSize3Type
ForwardDiffStepSize3()

A ForwardDiffStepSize from Knoll & Keyes, "Jacobian-free Newton–Krylov methods: a survey of approaches and applications". This is the average step size obtained from element-wise forward differences:

`ε = √eps(FT) · Σᵢ(1 + |xᵢ|) / (length(x) · ‖Δx‖)`.

This is the default step size used by ForwardDiffJVP.
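The formula is a one-liner; this illustrative Python sketch (not the package API) makes the units and averaging explicit:

```python
import numpy as np

def step_size_3(dx, x):
    """ε = sqrt(eps(FT)) * sum(1 + |x_i|) / (length(x) * ||dx||)."""
    ft_eps = np.finfo(x.dtype).eps
    return np.sqrt(ft_eps) * np.sum(1.0 + np.abs(x)) / (x.size * np.linalg.norm(dx))

# With x = [1, 1, 1, 1] and dx = [2, 2, 2, 2]:
# sum(1 + |x_i|) = 8, length(x) = 4, ||dx|| = 4, so ε = sqrt(eps) / 2
eps_fd = step_size_3(np.full(4, 2.0), np.ones(4))
```

The (1 + |xᵢ|) terms keep ε sensible whether the state components are large or near zero, and dividing by ‖Δx‖ keeps the perturbation ε Δx at a fixed magnitude regardless of the Krylov vector's scaling.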

source

Convergence Checking

These types control when Newton's method should terminate early (before max_iters) based on the residual or update norm.

ClimaTimeSteppers.ConvergenceCheckerType
ConvergenceChecker(;
    norm_condition,
    component_condition,
    condition_combiner,
    norm = LinearAlgebra.norm,
)

Checks whether a sequence val[0], val[1], val[2], ... has converged to some limit L, given the errors err[iter] = val[iter] .- L. This is done by calling is_converged!(::ConvergenceChecker, cache, val, err, iter), where val = val[iter] and err = err[iter]. If the value of L is not known, err can be an approximation of err[iter]. The cache for a ConvergenceChecker can be obtained with allocate_cache(::ConvergenceChecker, val_prototype), where val_prototype is similar to val and err.

A ConvergenceChecker can perform two types of checks: it can check whether norm(val) and norm(err) satisfy some ConvergenceCondition, and it can check whether all the components of abs.(val) and abs.(err) individually satisfy some ConvergenceCondition. The two checks can be combined with either & or |. If one of the checks is not needed, the corresponding ConvergenceCondition can be set to nothing.

Instead of LinearAlgebra.norm, norm can be set to anything that will convert val and err to non-negative scalar values.
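As a rough sketch of the two check types and their & combination (illustrative Python, not the package's is_converged! implementation), using a MaximumRelativeError-style condition for both:

```python
import numpy as np

def is_converged(val, err, max_rel_err=1e-6):
    """Norm condition AND componentwise condition must both hold."""
    norm_ok = np.linalg.norm(err) <= max_rel_err * np.linalg.norm(val)
    component_ok = bool(np.all(np.abs(err) <= max_rel_err * np.abs(val)))
    return norm_ok and component_ok

tight = is_converged(np.array([1.0, 2.0]), np.array([1e-8, 1e-8]))
loose = is_converged(np.array([1.0, 2.0]), np.array([1e-3, 0.0]))
```

The componentwise check is stricter than the norm check alone: a single badly converged component can hide inside a small overall norm, which is why combining the two with & is useful for stiff multi-variable systems.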

source
ClimaTimeSteppers.ConvergenceConditionType
ConvergenceCondition

An abstract type for objects that can check whether a sequence of non-negative scalar values val[0], val[1], val[2], ... has converged to some limit L, given the errors err[iter] = |val[iter] - L|.

Every subtype of ConvergenceCondition must define has_converged(::ConvergenceCondition, cache, val, err, iter). The cache, which is set to nothing by default, may be used to store information from previous iterations that is useful for determining convergence. To have access to a cache of some particular type, a subtype of ConvergenceCondition should define cache_type(::ConvergenceCondition, ::Type{FT}). To specify on which iterations this cache should be updated, it should define needs_cache_update(::ConvergenceCondition, iter), and to specify how the cache should be updated on those iterations, it should define updated_cache(::ConvergenceCondition, cache, val, err, iter).

Although cache_type can call promote_type to prevent potential type instability errors, this should be avoided in order to ensure that users write type-stable code.

source
ClimaTimeSteppers.MaximumRelativeErrorType
MaximumRelativeError(max_rel_err)

Checks whether err[iter] ≤ max_rel_err * val[iter]. Since err[iter] ≥ 0 and val[iter] ≥ 0, this can only be true if max_rel_err ≥ 0.

source
ClimaTimeSteppers.MaximumErrorReductionType
MaximumErrorReduction(max_reduction)

Checks whether err[iter] ≤ max_reduction * err[0] for all iter ≥ 1. Since err[iter] ≥ 0, this can only be true if max_reduction ≥ 0. Also, it must be the case that max_reduction ≤ 1 in order for the sequence to not diverge (i.e., to avoid err[iter] > err[0]).

source
ClimaTimeSteppers.MinimumRateOfConvergenceType
MinimumRateOfConvergence(rate, order = 1)

Checks whether err[iter] ≥ rate * err[iter - 1]^order for all iter ≥ 1. Since err[iter] ≥ 0, this can only be true if rate ≥ 0. Also, if order == 1, it must be the case that rate ≤ 1 in order for the sequence to not diverge (i.e., to avoid err[iter] > err[iter - 1]). In addition, if err[iter] < 1 for all sufficiently large values of iter, it must be the case that order ≥ 1 for the sequence to not diverge.
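Hedged one-liners for the three conditions above (sketches of the stated inequalities, not the package's has_converged methods):

```python
def maximum_relative_error(val, err, max_rel_err):
    # err[iter] <= max_rel_err * val[iter]
    return err <= max_rel_err * val

def maximum_error_reduction(err, err0, max_reduction):
    # err[iter] <= max_reduction * err[0], for iter >= 1
    return err <= max_reduction * err0

def minimum_rate_of_convergence(err, err_prev, rate, order=1):
    # err[iter] >= rate * err[iter - 1]^order, for iter >= 1
    return err >= rate * err_prev ** order

checks = (
    maximum_relative_error(2.0, 1e-7, max_rel_err=1e-6),
    maximum_error_reduction(0.05, 1.0, max_reduction=0.1),
    minimum_rate_of_convergence(0.5, 0.6, rate=0.9),  # 0.5 < 0.9 * 0.6
)
```

MaximumErrorReduction and MinimumRateOfConvergence need err[0] and err[iter - 1] respectively, which is exactly the kind of per-iteration state the ConvergenceCondition cache mechanism above exists to carry.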

source

Jacobian Update Signals

The update signal system controls when the Jacobian is recomputed. Freezing the Jacobian across iterations (chord method) or across timesteps reduces cost at the expense of convergence rate.

ClimaTimeSteppers.UpdateSignalHandlerType
UpdateSignalHandler

An object that determines whether a value needs to be updated upon receiving an UpdateSignal. This is done by calling needs_update!(::UpdateSignalHandler, ::UpdateSignal), which returns a boolean.

source
ClimaTimeSteppers.UpdateEveryType
UpdateEvery(update_signal_type)

An UpdateSignalHandler that requests an update whenever needs_update! is called with an UpdateSignal of type update_signal_type.

source
ClimaTimeSteppers.UpdateEveryNType
UpdateEveryN(n, update_signal_type, reset_signal_type = Nothing)

An UpdateSignalHandler that requests an update on every n-th call to needs_update! with an UpdateSignal of type update_signal_type. If reset_signal_type is specified, the counter (which is incremented from 0 to n and then reset to 0 when it is time to perform another update) is also reset to 0 whenever needs_update! is called with an UpdateSignal of type reset_signal_type.
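One plausible reading of the counter logic, as an illustrative Python sketch (names and details are approximations, not the Julia implementation):

```python
class EveryNHandler:
    """Fire on the 1st, (n+1)-th, (2n+1)-th, ... matching signal; a reset
    signal zeroes the counter so the next matching signal fires again."""

    def __init__(self, n, update_signal, reset_signal=None):
        self.n = n
        self.update_signal = update_signal
        self.reset_signal = reset_signal
        self.counter = 0

    def needs_update(self, signal):
        if self.reset_signal is not None and signal == self.reset_signal:
            self.counter = 0
            return False
        if signal != self.update_signal:
            return False
        fire = self.counter == 0
        self.counter = (self.counter + 1) % self.n
        return fire

handler = EveryNHandler(3, "NewNewtonIteration")
pattern = [handler.needs_update("NewNewtonIteration") for _ in range(6)]
```

With n = 3, six matching signals produce updates on the first and fourth calls only, i.e. one Jacobian recomputation per three Newton iterations.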

source
ClimaTimeSteppers.UpdateEveryDtType
UpdateEveryDt(dt)

An UpdateSignalHandler that requests an update whenever needs_update! is called with an UpdateSignal of type NewTimeStep and the difference between the current time and the time of the previous update is at least dt.

source

Debugging

ClimaTimeSteppers.PrintConditionNumberType
PrintConditionNumber()

A KrylovMethodDebugger that prints cond(j) and, when a preconditioner M is available, cond(M⁻¹ j) (the effective condition number seen by the Krylov solver).

Warning

This computes dense representations of j and M⁻¹ j, which is much slower than the Krylov solve itself. Use only for debugging.

source