
Commit 4813b17

Hallberg-NOAA authored and marshallward committed
*Use Halley method iterations in cuberoot function
Modified the cuberoot function to do 3 iterations with Halley's method, starting with a first guess that balances the errors at the two ends of the input range, before a final iteration with Newton's method that polishes the root and gives a solution that is accurate to machine precision. Following performance testing of the previous version, all convergence testing has been removed and the same number of iterations is applied regardless of the input value. This changes answers at roundoff for code that uses the cuberoot function, so ideally this PR would be dealt with before cuberoot becomes widely used.
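The algorithm described above can be sketched in Python. This is a hedged illustration of the approach, not the Fortran source: `math.frexp` and `math.ldexp` stand in for Fortran's `exponent` and `scale` intrinsics, and the helper name `cuberoot` simply mirrors the function being changed.

```python
import math

def cuberoot(x):
    """Cube root sketch: rescale into [0.125, 1.0), run 3 Halley iterations
    in fractional num/den form, then one Newton polish (per the commit)."""
    if x == 0.0:
        return 0.0
    # Rescale |x| by a power of 8 so that asx lies in [0.125, 1.0).
    e = math.frexp(abs(x))[1]       # 2**(e-1) <= |x| < 2**e
    ex_3 = -(-e // 3)               # ceil(e/3)
    asx = math.ldexp(abs(x), -3 * ex_3)

    # First guess chosen to balance the errors at asx = 0.125 and asx = 1.0.
    num, den = 0.707106, 1.0
    for _ in range(3):
        # Halley's method: Root <- Root*(Root**3 + 2*asx)/(2*Root**3 + asx),
        # kept as a fraction num/den so the loop does no real divisions.
        num_prev, den_prev = num, den
        num = num_prev * (num_prev**3 + 2.0 * asx * den_prev**3)
        den = den_prev * (2.0 * num_prev**3 + asx * den_prev**3)
    root_asx = num / den

    # One final Newton iteration polishes the root to machine precision.
    root_asx -= (root_asx**3 - asx) / (3.0 * root_asx**2)
    return math.copysign(math.ldexp(root_asx, ex_3), x)
```

For example, `cuberoot(8.0)` and `cuberoot(-27.0)` agree with 2.0 and -3.0 to machine precision, with a fixed operation count and no convergence tests inside the loop.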
1 parent d7d126a commit 4813b17

1 file changed

Lines changed: 20 additions & 50 deletions

File tree

src/framework/MOM_intrinsic_functions.F90

@@ -45,8 +45,6 @@ elemental function cuberoot(x) result(root)
                     ! of the cube root of asx in arbitrary units that can grow or shrink with each iteration [B D]
   real :: den_prev  ! The denominator of an expression for the previous iteration of the evolving estimate of
                     ! the cube root of asx in arbitrary units that can grow or shrink with each iteration [D]
-  real, parameter :: den_min = 2.**(minexponent(1.) / 4 + 4)  ! A value of den that triggers rescaling [C]
-  real, parameter :: den_max = 2.**(maxexponent(1.) / 4 - 2)  ! A value of den that triggers rescaling [C]
   integer :: ex_3   ! One third of the exponent part of x, used to rescale x to get a.
   integer :: itt
 
@@ -58,56 +56,28 @@ elemental function cuberoot(x) result(root)
     ! Here asx is in the range of 0.125 <= asx < 1.0
     asx = scale(abs(x), -3*ex_3)
 
-    ! This first estimate is one iteration of Newton's method with a starting guess of 1.  It is
-    ! always an over-estimate of the true solution, but it is a good approximation for asx near 1.
-    num = 2.0 + asx
-    den = 3.0
-    ! Iteratively determine Root = asx**1/3 using Newton's method, noting that in this case Newton's
-    ! method converges monotonically from above and needs no bounding.  For the range of asx from
-    ! 0.125 to 1.0 with the first guess used above, 6 iterations suffice to converge to roundoff.
-    do itt=1,9
-      ! Newton's method iterates estimates as Root = Root - (Root**3 - asx) / (3.0 * Root**2), or
-      ! equivalently as Root = (2.0*Root**2 + asx) / (3.0 * Root**2).
-      ! Keeping the estimates in a fractional form Root = num / den allows this calculation with
-      ! fewer (or no) real divisions during the iterations before doing a single real division
-      ! at the end, and it is therefore more computationally efficient.
-
+    ! Iteratively determine root_asx = asx**1/3 using Halley's method and then Newton's method,
+    ! noting that Halley's method converges monotonically and needs no bounding.  Halley's method is
+    ! slightly more complicated than Newton's method, but converges in a third fewer iterations.
+    ! Keeping the estimates in a fractional form Root = num / den allows this calculation with
+    ! no real divisions during the iterations before doing a single real division at the end,
+    ! and it is therefore more computationally efficient.
+
+    ! This first estimate gives the same magnitude of errors for 0.125 and 1.0 after two iterations.
+    num = 0.707106 ; den = 1.0
+    do itt=1,3
+      ! Halley's method iterates estimates as Root = Root * (Root**3 + 2.*asx) / (2.*Root**3 + asx).
       num_prev = num ; den_prev = den
-      num = 2.0 * num_prev**3 + asx * den_prev**3
-      den = 3.0 * (den_prev * num_prev**2)
-
-      if ((num * den_prev == num_prev * den) .or. (itt == 9)) then
-        ! If successive estimates of root are identical, this is a converged solution.
-        root_asx = num / den
-        exit
-      elseif (num * den_prev > num_prev * den) then
-        ! If the estimates are increasing, this also indicates convergence, but for a more subtle
-        ! reason.  Because Newton's method converges monotonically from above (at least for infinite
-        ! precision math), the only reason why this estimate could increase is if the iterations
-        ! have converged to a roundoff-level limit cycle around an irrational or otherwise
-        ! unrepresentable solution, with values only changing in the last bit or two.  If so, we
-        ! should stop iterating and accept the one of the current or previous solutions, both of
-        ! which will be within numerical roundoff of the true solution.
-        root_asx = num / den
-        ! Pick the more accurate of the last two iterations.
-        ! Given that both of the two previous iterations are within roundoff of the true
-        ! solution, this next step might be overkill.
-        if ( abs(den_prev**3*root_asx**3 - den_prev**3*asx) > abs(num_prev**3 - den_prev**3*asx) ) then
-          ! The previous iteration was slightly more accurate, so use that for root_asx.
-          root_asx = num_prev / den_prev
-        endif
-        exit
-      endif
-
-      ! Because successive estimates of the numerator and denominator tend to be the cube of their
-      ! predecessors, the numerator and denominator need to be rescaled by division when they get
-      ! too large or small to avoid overflow or underflow in the convergence test below.
-      if ((den > den_max) .or. (den < den_min)) then
-        num = scale(num, -exponent(den))
-        den = scale(den, -exponent(den))
-      endif
-
+      num = num_prev * (num_prev**3 + 2.0 * asx * (den_prev**3))
+      den = den_prev * (2.0 * num_prev**3 + asx * (den_prev**3))
+      ! Equivalent to:  root_asx = root_asx * (root_asx**3 + 2.*asx) / (2.*root_asx**3 + asx)
     enddo
+    ! At this point the error in root_asx is better than 1 part in 3e14.
+    root_asx = num / den
+
+    ! One final iteration with Newton's method polishes up the root and gives a solution
+    ! that is within the last bit of the true solution.
+    root_asx = root_asx - (root_asx**3 - asx) / (3.0 * (root_asx**2))
 
     root = sign(scale(root_asx, ex_3), x)
   endif
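The commit message makes two quantitative claims: the first guess 0.707106 balances the errors at the two ends of the input range after two iterations, and three Halley iterations leave an error better than 1 part in 3e14. These can be checked numerically with a short Python script; this is an illustrative check written for this page, not part of the commit, and `halley_cuberoot_error` is a hypothetical helper name.

```python
def halley_cuberoot_error(asx, n_iter):
    """Relative error of n_iter Halley iterations for asx**(1/3),
    starting from the first guess 0.707106 used in the commit."""
    num, den = 0.707106, 1.0
    for _ in range(n_iter):
        # Same fractional-form Halley update as the Fortran code.
        num_prev, den_prev = num, den
        num = num_prev * (num_prev**3 + 2.0 * asx * den_prev**3)
        den = den_prev * (2.0 * num_prev**3 + asx * den_prev**3)
    exact = asx ** (1.0 / 3.0)
    return abs(num / den - exact) / exact

# After two iterations the errors at asx = 0.125 and asx = 1.0 are both
# small and of comparable magnitude; a third iteration, which roughly cubes
# the error, drives it down toward roundoff across the whole range.
```

Because Halley's method converges cubically, the modest error left after two iterations collapses on the third, which is why a fixed iteration count with no convergence test suffices.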
