Online Simulation Progress Reporting

The OnlineLogging module provides tools to monitor and report simulation progress and related information.

Currently, the only feature implemented is the timing report, which prints information about the current step, the simulation time, and the average performance. With report_walltime, you will see progress information like the following:

┌ Info: Progress
│   simulation_time = "2 seconds"
│   n_steps_completed = 20
│   wall_time_per_step = "10 milliseconds, 100 microseconds"
│   wall_time_total = "1 second, 10 milliseconds"
│   wall_time_remaining = "808 milliseconds, 35 microseconds"
│   wall_time_spent = "202 milliseconds, 8 microseconds"
│   percent_complete = "20.0%"
│   estimated_sypd = "0.027"
│   date_now = 2024-12-03T16:01:13.660
└   estimated_finish_date = 2024-12-03T16:01:14.660

WallTimeInfo and report_walltime

WallTimeInfo is a struct that holds information about wall time (the time you see on your watch, not the simulation time) and that can be used to report timing information with report_walltime.

WallTimeInfo keeps track of and accumulates how much time has elapsed since the last time it was updated. In doing so, WallTimeInfo tries to automatically remove the compilation overhead that your simulation might incur in the first step (this is accomplished by ignoring the first step and doubling the cost of the second step to compensate).
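The compensation described above can be sketched as follows. This is a minimal illustration only, not the ClimaUtilities implementation: SketchTimer and record_step! are hypothetical names, and the step timings are made up.

```julia
# Hypothetical accumulator, for illustration only (not the WallTimeInfo internals).
mutable struct SketchTimer
    n_calls::Int          # how many steps have been recorded so far
    accumulated::Float64  # wall time kept after compensation, in seconds
end

SketchTimer() = SketchTimer(0, 0.0)

# Record the wall time of one step, compensating for compilation in the first step.
function record_step!(timer::SketchTimer, step_seconds::Float64)
    timer.n_calls += 1
    if timer.n_calls == 1
        # First step: likely dominated by compilation, so it is discarded.
    elseif timer.n_calls == 2
        # Second step: counted twice, standing in for the discarded first step.
        timer.accumulated += 2 * step_seconds
    else
        timer.accumulated += step_seconds
    end
    return timer
end

timer = SketchTimer()
# Made-up timings in seconds; the first one includes compilation.
for step_seconds in (5.0, 0.01, 0.01, 0.01)
    record_step!(timer, step_seconds)
end

# The average is not skewed by the 5-second first step.
average_per_step = timer.accumulated / timer.n_calls
```

With this scheme the expensive first step never enters the average, while the step count still matches the number of steps actually taken.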

The simplest way to use WallTimeInfo is to make report_walltime a callback. Here is an example using a SciML integrator:

import SciMLBase
import ClimaUtilities.OnlineLogging: WallTimeInfo, report_walltime
# Create the WallTimeInfo
wt = WallTimeInfo()

# Define a schedule that defines how often to report. Here, we follow the signature
# required by SciML. This function is triggered every 10 steps
function every10steps(u, t, integrator)
    return mod(integrator.step, 10) == 0
end

# Next, define the callback
report_callback = SciMLBase.DiscreteCallback(every10steps,
    let wt = wt
        integrator -> report_walltime(wt, integrator)
    end)

# The `let wt = wt` is not strictly required, but it can improve type-stability and performance

# Now that we have the callback, we can pass it to the SciML constructor for the integrator
Todo

Describe schedules when we add them to ClimaUtilities

The report_walltime function prints various timing statistics:

  • simulation_time: The elapsed time within the simulation.
  • n_steps_completed: The number of simulation steps completed.
  • wall_time_per_step: The average wall clock time per step.
  • wall_time_total: Estimated total wall clock time for the simulation.
  • wall_time_remaining: Estimated remaining wall clock time.
  • wall_time_spent: Total wall clock time already spent.
  • percent_complete: Percentage of the simulation completed.
  • estimated_sypd: Estimated simulated years per day (or days per day if the rate is slow).
  • date_now: The current date and time.
  • estimated_finish_date: The estimated date and time of completion.
Note

The estimated values (like wall_time_remaining and estimated_sypd) are based on the average wall time per step and can fluctuate, especially early in the simulation. They become more reliable as the simulation progresses.
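As a rough check of how these quantities relate, the sample report at the top of this page can be approximately reproduced from a few inputs. This is a sketch under stated assumptions, not the library's code: the time span of 0 to 10 simulated seconds is inferred from the "20.0%" in the sample, and the 365.25-day year is an assumption.

```julia
# Inputs read off the sample report; t_end is an assumption inferred from "20.0%".
simulation_time = 2.0        # simulated seconds elapsed
t_start, t_end = 0.0, 10.0   # assumed time span, in simulated seconds
n_steps_completed = 20
wall_time_spent = 0.202008   # seconds ("202 milliseconds, 8 microseconds")

# Average cost of one step, and the estimates derived from it.
wall_time_per_step = wall_time_spent / n_steps_completed
fraction_complete = (simulation_time - t_start) / (t_end - t_start)
wall_time_total = wall_time_spent / fraction_complete
wall_time_remaining = wall_time_total - wall_time_spent

# Simulated years per wall-clock day; the 365.25-day year is an assumption.
seconds_per_day = 86_400
days_per_year = 365.25
simulated_years = simulation_time / (days_per_year * seconds_per_day)
wall_days = wall_time_spent / seconds_per_day
estimated_sypd = simulated_years / wall_days
```

Because every estimate is proportional to the average wall time per step, a few unusually slow or fast early steps shift all of them at once, which is why the values settle only after many steps.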

API

ClimaUtilities.OnlineLogging.report_walltime (Function)
report_walltime(wt::WallTimeInfo, integrator)

Report the current progress and estimated completion time of a simulation.

This function calculates and displays various timing statistics based on the provided WallTimeInfo (wt) and the integrator state. It estimates the remaining wall time, the total wall time for the simulation, and the simulated time per unit of wall time.

Prints a summary of the simulation progress to the console, including:

  • simulation_time: The current simulated time.
  • n_steps_completed: The number of completed steps.
  • wall_time_per_step: Average wall time per simulation step. You should expect this to be unreliable until the number of completed steps is large.
  • wall_time_total: Estimated total wall time for the simulation. You should expect this to be unreliable until the number of completed steps is large.
  • wall_time_remaining: Estimated remaining wall time. You should expect this to be unreliable until the number of completed steps is large.
  • wall_time_spent: Total wall time spent so far.
  • percent_complete: Percentage of the simulation completed.
  • estimated_sypd: Estimated simulated years per day (or simulated days per day if sypd is very small). You should expect this to be unreliable until the number of completed steps is large.
  • date_now: The current date and time.
  • estimated_finish_date: The estimated date and time of simulation completion. You should expect this to be unreliable until the number of completed steps is large.
Note

Average quantities and simulated-years-per-day are computed by taking the total time elapsed (minus initial compilation) and dividing by the number of steps completed. You should expect them to fluctuate heavily and to be unreliable until the number of steps becomes large. "Large" is defined by your problem: for example, the code has to go through all the callbacks and diagnostics a few times before stabilizing (and this is different for different simulations).

Arguments:

  • wt::WallTimeInfo: A struct containing wall time information.
  • integrator: The integrator object containing simulation state information, including the current time t and the timestep dt. It must also provide the time span tspan in integrator.sol.prob.tspan.

How to use report_walltime

report_walltime is intended to be used as a callback executed at the end of a simulation step. The callback can be called with an arbitrary schedule, so that reporting can be customized.
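A schedule here is simply a function of (u, t, integrator) that returns true when reporting should happen. As a minimal sketch, a reusable schedule could be built with a closure. Note that every_n_steps is a hypothetical helper, not part of ClimaUtilities, and it assumes the integrator exposes a step counter, as the every10steps function on this page does.

```julia
# Hypothetical helper (not part of ClimaUtilities): returns a schedule function
# that is true every `n` steps. Assumes the integrator has a `step` field.
every_n_steps(n) = (u, t, integrator) -> mod(integrator.step, n) == 0

every5 = every_n_steps(5)

# NamedTuples stand in for a real integrator here, purely for illustration.
fires_at_10 = every5(nothing, 1.0, (; step = 10))  # step 10 is a multiple of 5
fires_at_12 = every5(nothing, 1.2, (; step = 12))  # step 12 is not
```

The returned function has the same three-argument signature as every10steps, so it could be used in the same position when constructing the callback.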

Example

Suppose we want to report progress every 10 steps in a SciMLBase-type of integrator.

import ClimaUtilities.OnlineLogging: WallTimeInfo, report_walltime
import SciMLBase

# Prepare the WallTimeInfo
walltime_info = WallTimeInfo()

# Define schedule, a boolean function that takes the integrator
every10steps(u, t, integrator) = mod(integrator.step, 10) == 0

# Define the callback, we use `let` to make this a little faster
report = let wt = walltime_info
    (integrator) -> report_walltime(wt, integrator)
end
report_callback = SciMLBase.DiscreteCallback(every10steps, report)

# Then, we can attach this callback to the integrator

TODO: Discuss/link Schedules when we move them to ClimaUtilities.
