The Technical Brief by Roache [(1)] presents ten items of discussion of our factor of safety (FS) method for solution verification [(2)]. Our responses are listed below item-by-item using the same numbering as Roache. The nomenclature mostly follows our own and not Roache’s such as for the order of accuracy calculated using the Richardson Extrapolation as opposed to the observed order of convergence and the GCI and GCI2 methods as opposed to the GCI0 and the real GCI methods. However, we agree with Roache to use for the factor of safety used in all the verification methods. In response to item (10), we have used our approach to evaluate two new variants of the GCI method and one new variant of the FS method.
(1) There are a few variants of the GCI method. We have used the definition of the GCI method that is arguably the most common version/interpretation applied in the literature [(3,4,5)]. The GCI1 method was proposed by Logan and Nitta [(6)]. The guideline for the GCI2 method was communicated to us by Roache; the private communication is available to the public upon request.
Our purpose is not to add to this confusion but rather to evaluate the performance of the outcomes of selecting any of these variants of the GCI method and compare with the FS method using our approach. The correction factor and are used to define the GCI2 and other verification methods as defined by Eqs. (10)–(15) in Ref.  in order to compare their relative conservativeness using the same error estimate .
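For readers who want the quantities in item (1) in concrete form, the following is a minimal sketch of the textbook three-grid Richardson extrapolation with a constant refinement ratio. It assumes the standard definitions: solution changes between successive grids, an estimated order of accuracy from their ratio, and an uncertainty formed as a factor of safety times the magnitude of the error estimate. The function names, argument names, and the illustrative factor of safety of 1.25 are assumptions made for this sketch; it does not reproduce the specific GCI, GCI1, GCI2, or FS formulations of the cited references.

```python
import math


def richardson_estimate(s1, s2, s3, r, p_th):
    """Three-grid Richardson extrapolation with constant refinement ratio r.

    s1, s2, s3 are the fine, medium, and coarse grid solutions.
    p_th is the theoretical order of accuracy of the scheme.
    Returns (estimated order, fine-grid error estimate, ratio of estimated
    to theoretical order). Hypothetical helper, standard formulas only.
    """
    eps21 = s2 - s1  # solution change, medium minus fine
    eps32 = s3 - s2  # solution change, coarse minus medium
    if eps32 == 0.0:
        raise ValueError("coarse and medium solutions are identical")
    ratio = eps21 / eps32
    if not 0.0 < ratio < 1.0:
        raise ValueError("monotonic convergence is required")
    p_re = math.log(eps32 / eps21) / math.log(r)
    delta_re = eps21 / (r ** p_re - 1.0)
    return p_re, delta_re, p_re / p_th


def uncertainty(delta_re, factor_of_safety):
    """Uncertainty as a factor of safety times the error-estimate magnitude."""
    return factor_of_safety * abs(delta_re)


# Manufactured data S(h) = 1 + 0.01 h^2 sampled at h = 1, 2, 4.
p_re, delta_re, p_ratio = richardson_estimate(1.01, 1.04, 1.16, r=2.0, p_th=2.0)
u = uncertainty(delta_re, factor_of_safety=1.25)  # 1.25 is illustrative only
```

In this manufactured case the estimated order matches the theoretical order and the error estimate recovers the difference between the fine-grid value and the exact value of 1, so the methods under discussion would differ only in the factor of safety they apply.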
(2) We disagree with Roache's proposal to refer to the GCI as the GCI0 method and the GCI2 as the GCI method, for the reasons given in item (1). The lack of a single guideline for selecting and and when to use which variant of the GCI method is highlighted by Roache’s current discussion. Roache accepts using when it is within a 5% difference of in item (2), whereas later in item (10), Roache considers two other judgment calls as reasonable.
The GCI2 method discards the “coarse” grid solution in the uncertainty estimate when , which is difficult to justify. For example, four grid solutions from the coarsest grid 4 to the finest grid 1 can build two grid triplet studies, (1, 2, 3) and (2, 3, 4). Grid convergence studies for industrial applications often show the oscillation of such that (1, 2, 3) could estimate but (2, 3, 4) could estimate . Based on the GCI2 method, should be discarded in the uncertainty estimate for (1, 2, 3) but not for (2, 3, 4). Of course, we agree that ideally one would conduct additional grid triplet studies until the solution is at or as close as possible to the asymptotic range; however, clearly this is not always possible especially for industrial applications [(10)].
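The way a set of systematically refined grids yields several grid-triplet studies can be sketched as follows. This is an illustrative helper only: the function name and the `stride` parameter are assumptions introduced here, with grids numbered from 1 (finest) upward as in the text.

```python
def grid_triplets(n_grids, stride=1):
    """List the (fine, medium, coarse) grid triplets available from n_grids
    grids numbered 1 (finest) to n_grids (coarsest).

    stride=1 uses consecutive grids; stride=2 skips every other grid, which
    corresponds to a larger refinement ratio between the grids of a triplet.
    """
    last_start = n_grids - 2 * stride
    return [(i, i + stride, i + 2 * stride) for i in range(1, last_start + 1)]


four_grid_triplets = grid_triplets(4)        # [(1, 2, 3), (2, 3, 4)]
six_grid_skipping = grid_triplets(6, stride=2)  # [(1, 3, 5), (2, 4, 6)]
```

Each triplet gives its own estimate of the order of accuracy, which is why neighboring triplets built from the same grids can fall on opposite sides of a threshold.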
We agree that a grid-triplet study with is not desirable. However, it is not uncommon for solution verification studies (e.g., local ranges from 0.012 to 8.47 in Ref. ). Additionally, Roache’s criticism of using is inconsistent with one of his previous conclusions that there is no necessity to discard results with ( for a second order method in Ref. ).
(3) The fact that “the use of the GCI1 method is closer to a 68% than a 95% confidence level” was one of the conclusions by Logan and Nitta [(6)]. This conclusion was not based solely on the dataset with an intentional choice of grid studies with oscillations in both exponent p and output quantity. As stated on page 367 of Ref. , “However, for our contrived and mechanics example sets (most of which were non-smooth), the use of GCI = 1.25 is much closer to a 68% confidence estimate than 95%.” It was also recommended in Refs.  and  that a sample with the number of grid convergence studies much larger than 100 is needed to draw general conclusions.
We did not recommend the GCI1 method but rather evaluated it using much larger sample sizes than Ref. . For the largest sample 3 with size , the reliability (Eq. (19) in Ref. ) is 90.3% for the GCI1 method.
(4) We disagree with Roache’s evaluation in Ref.  where it states that “Briefly, the net result is 14 NC (nonconservative) of 176 entries, or 8.0%.” Only 151 of the 176 grid triplet studies have the actual error . This results in 24 nonconservative of 151 (note there are nine nonconservative grid-triplet studies that estimate ). So, the reliability for the GCI method [(12)] is actually 84.1%, which agrees very well with the reliability 83.9% estimated using our 329 grid-triplet studies (sample 3 in Ref. ).
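The arithmetic behind the two percentages in item (4) can be checked with a short sketch. The function name is hypothetical; reliability is taken here simply as the percentage of studies that are not nonconservative, which is an assumption about the cited Eq. (19) consistent with the numbers quoted above.

```python
def reliability_percent(n_total, n_nonconservative):
    """Percentage of grid-triplet studies that are conservative,
    i.e. not counted as nonconservative."""
    return 100.0 * (n_total - n_nonconservative) / n_total


# Roache's count: 14 nonconservative of 176 entries.
roache_nc_percent = 100.0 * 14 / 176

# Count restricted to the 151 studies with a known actual error,
# of which 24 are nonconservative.
gci_reliability = reliability_percent(151, 24)

print(round(roache_nc_percent, 1), round(gci_reliability, 1))
```

The rounded values are 8.0 and 84.1, matching the figures in the text.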
Based on our own evaluation above and the fact that Cadafalch et al. [(12)] used for , the method they applied was not the GCI2 method and was more likely the GCI method. The claim of “an original and reasonable variant of the real GCI” [(1)] is again confusing.
(5) We take 95% coverage as the common uncertainty target for both experiments and computations [(5)]. Although the GCI2 method only misses the overall reliability by 0.8% for sample 3, more importantly it fails to provide sufficient conservatism for other samples including the reliabilities of 91.4%, 90%, and 87.5% for samples 5, 8, and 16, respectively [(2)]. It is possible that another dataset could slightly change our evaluations. Nonetheless, the current sample size is large and the range of values is wide such that a further increase of the number of samples is not likely to significantly alter the FS method and its results.
(6) The FS method was calibrated/validated against the available dataset. Note that calibration/validation requires that the true error can be evaluated, i.e., the solution numerical benchmark () or solution analytical benchmark () is known. We welcome additional validation of the FS method and if necessary re-calibration and improvement, but again or must be known. The claim of Roache and others of the 95% reliability for the GCI method is undocumented and based on anecdotal information. We doubt that or is available for many of the cases cited by Roache and others. It should be a simple matter to provide proper documentation.
Note that the FS method is more conservative than the GCI2 method except for due to the jump of the factor of safety at for the GCI2 method. If the FS method is not conservative enough for another dataset, the GCI2 method will likely be worse.
The claim that the GCI2 method has been stable for over 12 years is not well founded. Due to the lack of a single guideline on the choice of and , different variants of the GCI method have been used by different users based on their own judgment calls. For example, Cadafalch et al. [(12)] did not use the GCI2 method, and Logan and Nitta [(6)] used the GCI1 method. Furthermore, the GCI method may have been applied to O(1000) cases but no statistical evidence for reliability has been documented.
(7) We disagree with Roache’s suggestion that the FS method has problems in predicting monotonic convergence for fine grids. The uncertainty estimates in Table 6 for the FS method in Ref.  for the three finest grid triplets are not monotonically decreasing since shows large oscillations, and the factor of safety for the second finest grid triplet (2, 3, 4) at is much larger than that for the other methods evaluated at the same P. However, the larger factor of safety is required to ensure the reliability for . For the three grid triplets discussed, it is interesting to evaluate the convergence ratio for the fine grid solution (), (), and (). All five verification methods have the same and , which show monotonic convergence. The GCI, GCI1, and CF methods show monotonic convergence for UG, whereas the GCI2 and FS methods show monotonic divergence () and oscillatory divergence (), respectively.
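The convergence conditions named in item (7) can be made concrete with a small sketch. It assumes the commonly used four-way classification of a convergence ratio formed from three successive values (ratio between 0 and 1 for monotonic convergence, between -1 and 0 for oscillatory convergence, above 1 for monotonic divergence, below -1 for oscillatory divergence). The function names are hypothetical, and the thresholds are stated as an assumption because the corresponding symbols are not legible in the text above.

```python
def convergence_ratio(fine, medium, coarse):
    """Ratio of successive changes, (medium - fine) / (coarse - medium),
    for a quantity evaluated on three successive grids or grid triplets."""
    return (medium - fine) / (coarse - medium)


def classify_convergence(ratio):
    """Map a convergence ratio to the four conditions assumed in this sketch."""
    if 0.0 < ratio < 1.0:
        return "monotonic convergence"
    if -1.0 < ratio < 0.0:
        return "oscillatory convergence"
    if ratio > 1.0:
        return "monotonic divergence"
    if ratio < -1.0:
        return "oscillatory divergence"
    return "undetermined"  # boundary values 0, 1, and -1


# Illustrative numbers only, not taken from any table in the references.
example = classify_convergence(convergence_ratio(1.0, 1.5, 2.5))
```

Applying the same classification to the error estimate, the estimated order, and the uncertainty separately is what allows one quantity to converge monotonically while another diverges for the same three grid triplets.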
The oscillation of P may be caused by many factors. Grid 4 is still too coarse for the solution to be in the asymptotic range. Additionally, reducing the iterative error to machine zero is very difficult for large-scale computations. With the small grid refinement ratio , solution changes ɛ will be small, and the sensitivity to grid-spacing and time step may be difficult to identify compared with iterative errors . As shown in Fig. 6(b) in Ref. , for the cases in Table 6 [(2)]. When increases, will likely decrease. For example, the grid uncertainty decreases from 5.04 for (2, 4, 6) to 4.02 for (1, 3, 5) with for . However, it should be noted that a large may be problematic, too, as different grids may resolve different flow physics.
There are other cases in which the GCI, GCI1, GCI2, CF, and FS methods show non-monotonic convergence for multiple grid-triplet studies, including the “well-behaved” problems Cadafalch et al. [(12)] and Roache [(11)] used to evaluate the conservativeness of the GCI method. For the radial velocity using the SMART scheme in the study of premixed methane/air laminar flat flame on a perforated burner [(13,14,15)], the uncertainty estimates using the FS and GCI2 methods monotonically decreased as the grid was refined, whereas those of the other three methods did not. Another example is the uncertainty estimates for temperature at a monitored location for a two-dimensional natural convection in square cavities at , which had five grid-triplet studies with [(16)]. Uncertainty estimates using the five verification methods discussed in Ref.  first monotonically decreased as the grid was refined but suddenly increased for the finest grid-triplet. Thus, it is unreasonable to blame the FS method for such behavior.
The verification results for our industrial application example are far from the asymptotic range. Although we evaluated the convergence characteristics for the 98 verification variables using and as functions of [(2)], a standard criterion for achieving the asymptotic range is still lacking. A possible criterion is that monotonic convergence should be established based on evaluation of the convergence ratio for fine grid solution (towards ), (towards 1), and (monotonically decreasing) for multiple (at least three) grid-triplets with the same grid refinement ratio and . In some cases, oscillatory convergence may be acceptable; however, this would require many grid triplets [(17)]. Although still needs to be evaluated for all the variables in our dataset, 41.5% of the variables that have more than two grid-triplet studies do show that approaches , approaches 1, and monotonically decreases as the grid is refined. For the other 58.5% of the variables, also approaches as shown by monotonically decreased error magnitude , but and often show mixed convergence conditions as the grid is refined.
(8) As discussed in item (6), without statistical evidence, the claim of the conservativeness of the GCI2 method is undocumented. Furthermore, we doubt that many applications have or . If they do, we will be glad to add them to our dataset. The work by Dr. C. J. Freitas and his group is not publicly available [(9)]. Therefore, the claim of achieving the 95% reliability is again undocumented and based on anecdotal information.
(9) We agree that the actual factor of safety is undefined when a solution not in the asymptotic range happens to predict the true value. If this happens, it should be excluded from the dataset used to derive the FS method. However, monotonic convergence ensures that the uncertainty estimate is always greater than zero so that a zero error will be automatically bounded by the uncertainty.
The contrived example created by Roache only proves that the average actual factor of safety () cannot be used alone to determine if a solution verification method is conservative enough. But it can be used to determine the relative conservativeness between different verification methods.
It should be noted that we used both the reliability and LCL as defined by Eq. (22) in Ref.  to develop the FS method and determine if a method is conservative enough. Larger does not necessarily mean larger (readers can refer to sample 6 in Ref. ).
(10) The GCIOR, GCI3, and FS1 methods are evaluated using statistical analysis of the 25 samples following Ref. , with focus on samples 3 to 25. Table 1 shows the statistics for samples 3 to 8 [(2)] based on six different ranges for the three new methods. The FS1 method has the same reliability as the FS method for samples 3 to 8. The GCIOR and GCI3 methods have almost the same reliability, but the GCI3 method is slightly more conservative. Compared to the GCI2 method, the GCIOR and GCI3 methods improve the reliability for to be larger than 95% but are not conservative enough for , especially near the asymptotic range. Examination of 18.2% of the data for , which cover samples 7 and 8, shows that only the FS and FS1 methods achieve 95% reliability, but the GCIOR and GCI3 methods achieve only 90%. The largest for samples 3-5, sample 6, sample 7, and sample 8 are the GCI3, GCI2, FS, and FS1 methods, respectively. For all the verification methods, the LCLs are larger than 1.2 for all the ranges.
Table 2 shows the statistics at the seventeen values (samples 9 to 25) ranging from 0.705 to 1.205. For samples 9 to 19 (), all the verification methods achieve reliabilities larger than 95% except 93.1% for the GCIOR method at , 87.5% for the three GCI methods at , and 84.6% for the GCIOR and GCI3 methods at . The largest for samples 9-12, samples 13-20, samples 21-24, and sample 25 are the GCIOR and GCI3, FS and FS1, GCI2, and FS1 methods, respectively. Only the FS and FS1 methods satisfy the requirement that for samples 9 to 25. The GCI2 method has for sample 20; the GCIOR method has for samples 13, 16, 17, 18, 20, and 22; and the GCI3 method has for samples 20 and 22.
The actual factor of safety for sample 3, sample 3 averaged using , and the upper and lower band of the confidence interval for samples 9 to 25 are shown in Fig. 2. t is the factor for the student-t distribution and is the standard deviation of the mean of the sample, as defined in Ref. . The GCIOR and GCI3 methods do not satisfy near the asymptotic range. Compared to the FS method (Fig. 4(e) in Ref. ), the FS1 method shows a larger actual factor of safety when solutions are farther from the asymptotic range for .
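The confidence band described above, a sample mean plus or minus the student-t factor times the standard deviation of the mean, can be sketched as follows. The function name is hypothetical, the t factor is passed in by the caller (for example from a t table) rather than computed, and the sample values are illustrative only. The lower bound is presumably what the text calls the LCL, although the cited Eq. (22) is not reproduced here.

```python
import math
import statistics


def mean_confidence_interval(values, t_factor):
    """Return (lower, upper) bounds of the confidence interval of the mean.

    values: sample of a quantity such as the actual factor of safety.
    t_factor: student-t factor for the chosen confidence level and
              len(values) - 1 degrees of freedom, supplied by the caller.
    """
    n = len(values)
    mean = statistics.mean(values)
    # Standard deviation of the mean = sample standard deviation / sqrt(n).
    std_of_mean = statistics.stdev(values) / math.sqrt(n)
    half_width = t_factor * std_of_mean
    return mean - half_width, mean + half_width


# Illustrative sample of three values; 4.303 is the two-sided 95% t factor
# for 2 degrees of freedom.
lower, upper = mean_confidence_interval([1.0, 2.0, 3.0], t_factor=4.303)
```

With small samples the t factor is large and the band is wide, which is one reason a large mean does not by itself guarantee a large lower bound.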
The choice of and in the GCI method requires user judgment calls, for which no single guideline is currently available. We recommend that a single guideline be provided.
The GCIOR and GCI3 methods have almost the same reliability, but the GCI3 method is slightly more conservative. Compared to the GCI2 method, the GCIOR and GCI3 methods improve the reliability for . However, they are too conservative for using a factor of safety 3 and not conservative enough for .
The FS1 and FS methods are the same for . For and , the FS1 method is less and more conservative than the FS method for and , respectively. As a result, the FS1 method may have an advantage for uncertainty estimates when where the FS and other verification methods likely predict unreasonably small uncertainties due to small error estimates. However, since the current dataset is restricted to , the pros and cons of using the FS or FS1 method cannot be validated. Thus, until additional data are available for , all verification methods should be used with caution for such conditions and, if possible, additional grid-triplet studies should be conducted to obtain .
The authors’ statistical approach based on many analytical and numerical benchmarks provides a robust framework for developing solution verification methods. The authors welcome additional validation of the FS method and, if necessary, re-calibration and improvement using additional rigorous verification studies with or available. More research is needed to establish the criterion for achieving the asymptotic range along with its use in providing high quality numerical benchmarks.
This study was sponsored by the Office of Naval Research under Grant No. N000141-01-00-1-7, administered by Dr. Patrick Purtell.