NaN errors
NaNs in error term in BiCGstab
In some cases the choice of location and time produces initial conditions for the low-resolution nest that lead to an unstable model state, causing a crash with an error similar to the following:
????????????????????????????????????????????????????????????????????????????????
???!!!???!!!???!!!???!!!???!!! ERROR ???!!!???!!!???!!!???!!!???!!!
? Error code: 1
? Error from routine: EG_BICGSTAB
? Error message: NaNs in error term in BiCGstab after 1 iterations
? This is a common point for the model to fail if it
? has ingested or developed NaNs or infinities
? elsewhere in the code.
? See the following URL for more information:
? https://code.metoffice.gov.uk/trac/um/wiki/KnownUMFailurePoints
? Error from processor: 216
? Error number: 22
????????????????????????????????????????????????????????????????????????????????
If that occurs, two possible work-arounds are:
- Reduce the GAL9 timestep from 300 to 150 seconds. To achieve this, set rg01_rs01_m01_dt to 150 in the rose-suite.conf file. This does not have a large impact on cycle time, and can usually be reverted in subsequent cycles when the simulation is running without error.
- Change the INITIAL_CYCLE_POINT in the rose-suite.conf file to an earlier day (usually 1 day should be enough). This can sometimes solve the issue, though at the expense of running the model simulation for longer than was initially desired.
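Both workarounds are small edits to rose-suite.conf, and the following is a minimal Python sketch of them applied to the file's text. It assumes the settings appear as plain key=value lines, that INITIAL_CYCLE_POINT is a quoted string of the form YYYYMMDDTHHMMZ, and it uses a made-up date; the helper names set_conf_value and earlier_cycle_point are hypothetical and not part of any suite tooling. Check the layout of your own rose-suite.conf before relying on it.

```python
import re
from datetime import datetime, timedelta

# Assumed cycle point layout (e.g. 20220226T0000Z); adjust if your suite differs.
CYCLE_FORMAT = "%Y%m%dT%H%MZ"


def set_conf_value(conf_text: str, key: str, value: str) -> str:
    """Return conf_text with the `key=...` line replaced by `key=value`."""
    pattern = re.compile(rf"^{re.escape(key)}\s*=.*$", flags=re.MULTILINE)
    new_text, count = pattern.subn(lambda _m: f"{key}={value}", conf_text)
    if count == 0:
        raise KeyError(f"{key} not found in configuration text")
    return new_text


def earlier_cycle_point(point: str, days: int = 1) -> str:
    """Shift a cycle point string back by the given number of days."""
    shifted = datetime.strptime(point, CYCLE_FORMAT) - timedelta(days=days)
    return shifted.strftime(CYCLE_FORMAT)


# Hypothetical excerpt of a rose-suite.conf; real files contain many more settings.
example = 'INITIAL_CYCLE_POINT="20220226T0000Z"\nrg01_rs01_m01_dt=300\n'

# Workaround 1: reduce the GAL9 timestep from 300 to 150 seconds.
patched = set_conf_value(example, "rg01_rs01_m01_dt", "150")

# Workaround 2: start the run one day earlier.
new_point = earlier_cycle_point("20220226T0000Z")
patched = set_conf_value(patched, "INITIAL_CYCLE_POINT", f'"{new_point}"')
print(patched)
```

In practice the two workarounds are alternatives, so you would normally apply only one of them, and editing the file by hand is equally valid.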
See the related forum post for more on this error.