Calibration is the process of adjusting model parameter values to obtain a closer fit between observed and simulated variables. The issue is contentious from two main perspectives. First, some argue that if the model has been properly constructed to adequately represent the processes being simulated, and its parameters have been properly measured and applied, then calibration should not be necessary. Allied to this is the further notion that calibration is the act of finding the “fiddle factors” that force a model to be correct when in fact it is not: perhaps the wrong model was chosen for the circumstances, or the model needs to be rethought.
Second, the issue is thorny because there are very few industry-standard guidelines for model calibration: how does a modeler know when the calibration is good enough and the model fit-for-use? For calibration to be possible, there must be data. This is often part of the problem, as the observed data for the variables being simulated may be sparse (spatially and temporally), of questionable accuracy (interpolated?), discontinuous, inaccessible (bureaucracy, confidentiality), or deemed too expensive to collect. Without adequate data on which to base a calibration, applying simulation models can become an act of faith. Yet even the simplest of simulation models needs some form of calibration. There are a number of reasons why simulation outputs do not fit observed data and, hence, a justifiable need for calibration:
- Models are only approximations of reality and are therefore unlikely to produce a perfect fit.
- Models are often expensive to create and test, and new software is expensive to buy and maintain, so models have a considerable “life span” and get reused in many studies where conditions, owing to natural variation, are unlikely to match those for which the model was originally constructed and tested; users also have an inertia in changing from a familiar model to one with which they are less familiar.
- Models may get reused at scales for which they were not originally constructed and tested.
- Data upon which parameter estimation is based are unlikely to be error free and may in addition have operational uncertainty induced either through GIS handling and/or in the simulation.
- Observations used in calibration are unlikely to be error free.
- Field data on, for example, soil strengths or infiltration rates represent only the micro properties of the materials measured over areas of less than 1 m² and may not represent the meso or macro behavior of the materials.
- The initial state of the system and the boundary conditions are unlikely to be known precisely, and even the most carefully determined parameters are likely to be best estimates.
- The processes being modeled may have chaotic tendencies such that what appear to be the same inputs (within our ability to specify and measure) may not consistently produce the same response (but, this would be an indication that a stochastic rather than a deterministic model might be preferable).
- The processes may exhibit nonstationarity where the relationship over time between inputs and outputs may change in response to other evolving changes in the system.
Taking all of this into consideration, the odds are that a simulation will not initially fit the observed data well; indeed, so much so that a good fit would be a surprise that might even turn to suspicion. So, why is the act of calibration so problematic? As we shall see, good model calibration is as much an art as it is a science, perhaps more so, since you need a “feel” for the model, the processes being modeled, and the situation in which it is being applied, a “feel” that is both intuitive and inspirational. First, however, there needs to be some measure of fit between the simulated and the observed, and such measures are usually based on the empirical variance of the residual errors, for example an efficiency of the form E = 1 − σe²/σo², where σe² is the variance of the residuals and σo² the variance of the observations.
Similar to a correlation coefficient, E = 1 indicates a perfect fit. This may not be the most suitable way of calculating goodness-of-fit for some types of models, where small time lags can dramatically increase σe² for oscillating variables or where the residuals are strongly autocorrelated in space and/or time (for a comparison of a range of goodness-of-fit statistics, see, for example, Fotheringham and Knudsen, 1987). Then a choice needs to be made about which parameters might usefully be varied to produce the desired changes toward a better goodness-of-fit. A local sensitivity analysis, such as OAT (one-at-a-time), might allow the sensitivity of the model to individual parameters to be quantified and ranked, and also allow an observation to be made as to how each parameter influences model behavior. One measure of sensitivity is

Si = ((v − vb)/vb) / ((xi − xib)/xib)

where xi = value of a parameter i, xib = baseline value of parameter i, v = value of the simulated variable, and vb = baseline value of the simulated variable.
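A residual-variance efficiency of this kind can be sketched in a few lines; the function name here is ours, and we assume the common Nash–Sutcliffe form E = 1 − σe²/σo²:

```python
import numpy as np

def efficiency(observed, simulated):
    """Residual-variance efficiency E = 1 - var(residuals) / var(observed).

    E = 1 is a perfect fit; E <= 0 means the model explains no more of the
    variance than simply predicting the observed mean.
    """
    observed = np.asarray(observed, dtype=float)
    simulated = np.asarray(simulated, dtype=float)
    residual_var = np.mean((observed - simulated) ** 2)
    observed_var = np.mean((observed - observed.mean()) ** 2)
    return 1.0 - residual_var / observed_var

obs = [2.0, 4.0, 6.0, 8.0]
print(efficiency(obs, obs))        # perfect fit gives E = 1
print(efficiency(obs, [5.0] * 4))  # predicting the mean gives E = 0
```

Note that because E is built from squared residuals, a simulation that reproduces the observed hydrograph shape but shifted by one time step can still score poorly, which is exactly the oscillating-variable caveat above.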
If all parameters are varied by the same amount (e.g., ±5%), then the relative importance of parameters can be evaluated. But what sounds simple quickly gets quite complicated. First of all, assessing what contribution a particular changed parameter has made to the overall outcome of the simulation can be difficult, because many environmental simulation models are nonlinear. If they were linear, inputs would have a linear relationship to outputs, but this is rarely the case. In hydrology, for example, antecedent conditions have an important role in determining the relationship between inputs and outputs for any one storm. This produces a nonlinear relationship between inputs and outputs, and because antecedent conditions vary (and may not be well measured), a unit of input may not consistently produce the same unit of output for all time steps. Another cause of nonlinearity is nonstationarity, which was already mentioned above.
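An OAT pass of this kind can be sketched as follows, assuming the relative form Si = ((v − vb)/vb) / ((xi − xib)/xib); the model, parameter names, and values here are purely illustrative:

```python
def oat_sensitivity(model, baseline, delta=0.05):
    """One-at-a-time sensitivity: perturb each parameter by +delta (e.g., 5%)
    and compute Si = ((v - vb)/vb) / ((xi - xib)/xib), ranking by |Si|."""
    vb = model(baseline)
    sensitivities = {}
    for name, xib in baseline.items():
        perturbed = dict(baseline)
        perturbed[name] = xib * (1.0 + delta)  # so (xi - xib)/xib == delta
        v = model(perturbed)
        sensitivities[name] = ((v - vb) / vb) / delta
    # most influential parameters first
    return dict(sorted(sensitivities.items(), key=lambda kv: -abs(kv[1])))

# hypothetical nonlinear "rainfall-runoff" response, for illustration only
def toy_model(p):
    return p["rain"] ** 2 * p["coeff"] + 0.1 * p["store"]

base = {"rain": 10.0, "coeff": 0.5, "store": 20.0}
print(oat_sensitivity(toy_model, base))
```

Because the toy model is nonlinear in its rain parameter, the same ±5% perturbation yields a much larger Si for rain than for the linear store term, which is the ranking behavior the text describes.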
Second, when GIS are coupled with simulation modeling, there is the added consideration of what preprocessing took place to establish parameter values and what postprocessing might have taken place to reach the calibration stage. If there were algorithm choices (see above), then for a proper evaluation some different approaches may need to be tried and then tested for sensitivity using the measure above. Third, such a strategy may be tractable for a small number of parameters, but where there is a large number of parameters such sensitivity analyses could take weeks or even months.
Such a strategy is based on the premise that there is a single global optimum in the parameter space that can be found by varying the parameters to which the model is most sensitive, such that a best fit with observations can be found. If this is the case and the number of parameters is few, then all well and good. But, unfortunately, as summarized by Beven (2001), there may not be a single global optimum, but a series of local optima instead. This would mean having to accept that there is equifinality in the solution, that is, there may be a number of model states that are acceptably consistent with the observed behavior of the processes being modeled. Therefore, it may be better to think of corroboration, in which there is a noncontradiction between the output of a model and the evidence from reality.