| <html> |
| <head> |
| <meta http-equiv="Content-Type" content="text/html; charset=US-ASCII"> |
| <title>The Lanczos Approximation</title> |
| <link rel="stylesheet" href="../../../../../../../doc/src/boostbook.css" type="text/css"> |
| <meta name="generator" content="DocBook XSL Stylesheets V1.74.0"> |
| <link rel="home" href="../../index.html" title="Math Toolkit"> |
| <link rel="up" href="../backgrounders.html" title="Backgrounders"> |
| <link rel="prev" href="relative_error.html" title="Relative Error"> |
| <link rel="next" href="remez.html" title="The Remez Method"> |
| </head> |
| <body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"> |
| <table cellpadding="2" width="100%"><tr> |
| <td valign="top"><img alt="Boost C++ Libraries" width="277" height="86" src="../../../../../../../boost.png"></td> |
| <td align="center"><a href="../../../../../../../index.html">Home</a></td> |
| <td align="center"><a href="../../../../../../../libs/libraries.htm">Libraries</a></td> |
| <td align="center"><a href="http://www.boost.org/users/people.html">People</a></td> |
| <td align="center"><a href="http://www.boost.org/users/faq.html">FAQ</a></td> |
| <td align="center"><a href="../../../../../../../more/index.htm">More</a></td> |
| </tr></table> |
| <hr> |
| <div class="spirit-nav"> |
| <a accesskey="p" href="relative_error.html"><img src="../../../../../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../backgrounders.html"><img src="../../../../../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../../index.html"><img src="../../../../../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="remez.html"><img src="../../../../../../../doc/src/images/next.png" alt="Next"></a> |
| </div> |
| <div class="section" lang="en"> |
| <div class="titlepage"><div><div><h3 class="title"> |
| <a name="math_toolkit.backgrounders.lanczos"></a><a class="link" href="lanczos.html" title="The Lanczos Approximation"> The Lanczos Approximation</a> |
| </h3></div></div></div> |
| <a name="math_toolkit.backgrounders.lanczos.motivation"></a><h5> |
| <a name="id1286996"></a> |
| <a class="link" href="lanczos.html#math_toolkit.backgrounders.lanczos.motivation">Motivation</a> |
| </h5> |
| <p> |
| <span class="emphasis"><em>Why base gamma and gamma-like functions on the Lanczos approximation?</em></span> |
| </p> |
| <p> |
| First of all I should make clear that for the gamma function over real numbers |
| (as opposed to complex ones) the Lanczos approximation (See <a href="http://en.wikipedia.org/wiki/Lanczos_approximation" target="_top">Wikipedia |
| or </a> <a href="http://mathworld.wolfram.com/LanczosApproximation.html" target="_top">Mathworld</a>) |
| appears to offer no clear advantage over more traditional methods such as |
| <a href="http://en.wikipedia.org/wiki/Stirling_approximation" target="_top">Stirling's |
| approximation</a>. <a class="link" href="lanczos.html#pugh">Pugh</a> carried out an extensive |
| comparison of the various methods available and discovered that they were |
| all very similar in terms of complexity and relative error. However, the |
| Lanczos approximation does have a couple of properties that make it worthy |
| of further consideration: |
| </p> |
| <div class="itemizedlist"><ul type="disc"> |
| <li> |
| The approximation has an easy to compute truncation error that holds |
| for all <span class="emphasis"><em>z > 0</em></span>. In practice that means we can |
| use the same approximation for all <span class="emphasis"><em>z > 0</em></span>, and |
| be certain that no matter how large or small <span class="emphasis"><em>z</em></span> is, |
| the truncation error will <span class="emphasis"><em>at worst</em></span> be bounded by |
| some finite value. |
| </li> |
| <li> |
| The approximation has a form that is particularly amenable to analytic |
| manipulation, in particular ratios of gamma or gamma-like functions are |
| particularly easy to compute without resorting to logarithms. |
| </li> |
| </ul></div> |
| <p> |
| It is the combination of these two properties that make the approximation |
| attractive: Stirling's approximation is highly accurate for large z, and |
| has some of the same analytic properties as the Lanczos approximation, but |
| can't easily be used across the whole range of z. |
| </p> |
| <p> |
| As the simplest example, consider the ratio of two gamma functions: one could |
| compute the result via lgamma: |
| </p> |
| <pre class="programlisting"><span class="identifier">exp</span><span class="special">(</span><span class="identifier">lgamma</span><span class="special">(</span><span class="identifier">a</span><span class="special">)</span> <span class="special">-</span> <span class="identifier">lgamma</span><span class="special">(</span><span class="identifier">b</span><span class="special">));</span> |
| </pre> |
| <p> |
| However, even if lgamma is uniformly accurate to 0.5ulp, the worst case relative |
| error in the above can easily be shown to be: |
| </p> |
| <pre class="programlisting"><span class="identifier">Erel</span> <span class="special">></span> <span class="identifier">a</span> <span class="special">*</span> <span class="identifier">log</span><span class="special">(</span><span class="identifier">a</span><span class="special">)/</span><span class="number">2</span> <span class="special">+</span> <span class="identifier">b</span> <span class="special">*</span> <span class="identifier">log</span><span class="special">(</span><span class="identifier">b</span><span class="special">)/</span><span class="number">2</span> |
| </pre> |
| <p> |
| For small <span class="emphasis"><em>a</em></span> and <span class="emphasis"><em>b</em></span> that's not a |
| problem, but to put the relationship another way: <span class="emphasis"><em>each time a and |
| b increase in magnitude by a factor of 10, at least one decimal digit of |
| precision will be lost.</em></span> |
| </p> |
| <p> |
| In contrast, by analytically combining like power terms in a ratio of Lanczos |
| approximation's, these errors can be virtually eliminated for small <span class="emphasis"><em>a</em></span> |
| and <span class="emphasis"><em>b</em></span>, and kept under control for very large (or very |
| small for that matter) <span class="emphasis"><em>a</em></span> and <span class="emphasis"><em>b</em></span>. |
| Of course, computing large powers is itself a notoriously hard problem, but |
| even so, analytic combinations of Lanczos approximations can make the difference |
| between obtaining a valid result, or simply garbage. Refer to the implementation |
| notes for the <a class="link" href="../special/sf_beta/beta_function.html" title="Beta">beta</a> |
| function for an example of this method in practice. The incomplete <a class="link" href="../special/sf_gamma/igamma.html" title="Incomplete Gamma Functions">gamma_p gamma</a> and |
| <a class="link" href="../special/sf_beta/ibeta_function.html" title="Incomplete Beta Functions">beta</a> functions |
| use similar analytic combinations of power terms, to combine gamma and beta |
| functions divided by large powers into single (simpler) expressions. |
| </p> |
| <a name="math_toolkit.backgrounders.lanczos.the_approximation"></a><h5> |
| <a name="id1287254"></a> |
| <a class="link" href="lanczos.html#math_toolkit.backgrounders.lanczos.the_approximation">The |
| Approximation</a> |
| </h5> |
| <p> |
| The Lanczos Approximation to the Gamma Function is given by: |
| </p> |
| <p> |
| <span class="inlinemediaobject"><img src="../../../equations/lanczos0.png"></span> |
| </p> |
| <p> |
| Where S<sub>g</sub>(z) is an infinite sum, that is convergent for all z > 0, and |
| <span class="emphasis"><em>g</em></span> is an arbitrary parameter that controls the "shape" |
| of the terms in the sum which is given by: |
| </p> |
| <p> |
| <span class="inlinemediaobject"><img src="../../../equations/lanczos0a.png"></span> |
| </p> |
| <p> |
| With individual coefficients defined in closed form by: |
| </p> |
| <p> |
| <span class="inlinemediaobject"><img src="../../../equations/lanczos0b.png"></span> |
| </p> |
| <p> |
| However, evaluation of the sum in that form can lead to numerical instability |
| in the computation of the ratios of rising and falling factorials (effectively |
| we're multiplying by a series of numbers very close to 1, so roundoff errors |
| can accumulate quite rapidly). |
| </p> |
| <p> |
| The Lanczos approximation is therefore often written in partial fraction |
| form with the leading constants absorbed by the coefficients in the sum: |
| </p> |
| <p> |
| <span class="inlinemediaobject"><img src="../../../equations/lanczos1.png"></span> |
| </p> |
| <p> |
| where: |
| </p> |
| <p> |
| <span class="inlinemediaobject"><img src="../../../equations/lanczos2.png"></span> |
| </p> |
| <p> |
| Again parameter <span class="emphasis"><em>g</em></span> is an arbitrarily chosen constant, |
| and <span class="emphasis"><em>N</em></span> is an arbitrarily chosen number of terms to evaluate |
| in the "Lanczos sum" part. |
| </p> |
| <div class="note"><table border="0" summary="Note"> |
| <tr> |
| <td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../../../../../../doc/src/images/note.png"></td> |
| <th align="left">Note</th> |
| </tr> |
| <tr><td align="left" valign="top"><p> |
| Some authors choose to define the sum from k=1 to N, and hence end up with |
| N+1 coefficients. This happens to confuse both the following discussion |
| and the code (since C++ deals with half open array ranges, rather than |
| the closed range of the sum). This convention is consistent with <a class="link" href="lanczos.html#godfrey">Godfrey</a>, but not <a class="link" href="lanczos.html#pugh">Pugh</a>, |
| so take care when referring to the literature in this field. |
| </p></td></tr> |
| </table></div> |
| <a name="math_toolkit.backgrounders.lanczos.computing_the_coefficients"></a><h5> |
| <a name="id1288408"></a> |
| <a class="link" href="lanczos.html#math_toolkit.backgrounders.lanczos.computing_the_coefficients">Computing |
| the Coefficients</a> |
| </h5> |
| <p> |
| The coefficients C0..CN-1 need to be computed from <span class="emphasis"><em>N</em></span> |
| and <span class="emphasis"><em>g</em></span> at high precision, and then stored as part of |
| the program. Calculation of the coefficients is performed via the method |
| of <a class="link" href="lanczos.html#godfrey">Godfrey</a>; let the constants be contained |
| in a column vector P, then: |
| </p> |
| <p> |
| P = B D C F |
| </p> |
| <p> |
| where B is an NxN matrix: |
| </p> |
| <p> |
| <span class="inlinemediaobject"><img src="../../../equations/lanczos4.png"></span> |
| </p> |
| <p> |
| D is an NxN matrix: |
| </p> |
| <p> |
| <span class="inlinemediaobject"><img src="../../../equations/lanczos3.png"></span> |
| </p> |
| <p> |
| C is an NxN matrix: |
| </p> |
| <p> |
| <span class="inlinemediaobject"><img src="../../../equations/lanczos5.png"></span> |
| </p> |
| <p> |
| and F is an N element column vector: |
| </p> |
| <p> |
| <span class="inlinemediaobject"><img src="../../../equations/lanczos6.png"></span> |
| </p> |
| <p> |
| Note than the matrices B, D and C contain all integer terms and depend only |
| on <span class="emphasis"><em>N</em></span>, this product should be computed first, and then |
| multiplied by <span class="emphasis"><em>F</em></span> as the last step. |
| </p> |
| <a name="math_toolkit.backgrounders.lanczos.choosing_the_right_parameters"></a><h5> |
| <a name="id1288565"></a> |
| <a class="link" href="lanczos.html#math_toolkit.backgrounders.lanczos.choosing_the_right_parameters">Choosing |
| the Right Parameters</a> |
| </h5> |
| <p> |
| The trick is to choose <span class="emphasis"><em>N</em></span> and <span class="emphasis"><em>g</em></span> |
| to give the desired level of accuracy: choosing a small value for <span class="emphasis"><em>g</em></span> |
| leads to a strictly convergent series, but one which converges only slowly. |
| Choosing a larger value of <span class="emphasis"><em>g</em></span> causes the terms in the |
| series to be large and/or divergent for about the first <span class="emphasis"><em>g-1</em></span> |
| terms, and to then suddenly converge with a "crunch". |
| </p> |
| <p> |
| <a class="link" href="lanczos.html#pugh">Pugh</a> has determined the optimal value of <span class="emphasis"><em>g</em></span> |
| for <span class="emphasis"><em>N</em></span> in the range <span class="emphasis"><em>1 <= N <= 60</em></span>: |
| unfortunately in practice choosing these values leads to cancellation errors |
| in the Lanczos sum as the largest term in the (alternating) series is approximately |
| 1000 times larger than the result. These optimal values appear not to be |
| useful in practice unless the evaluation can be done with a number of guard |
| digits <span class="emphasis"><em>and</em></span> the coefficients are stored at higher precision |
| than that desired in the result. These values are best reserved for say, |
| computing to float precision with double precision arithmetic. |
| </p> |
| <div class="table"> |
| <a name="math_toolkit.backgrounders.lanczos.optimal_choices_for_n_and_g_when_computing_with_guard_digits__source__pugh_"></a><p class="title"><b>Table 53. Optimal choices for N and g when computing with guard digits (source: |
| Pugh)</b></p> |
| <div class="table-contents"><table class="table" summary="Optimal choices for N and g when computing with guard digits (source: |
| Pugh)"> |
| <colgroup> |
| <col> |
| <col> |
| <col> |
| <col> |
| </colgroup> |
| <thead><tr> |
| <th> |
| <p> |
| Significand Size |
| </p> |
| </th> |
| <th> |
| <p> |
| N |
| </p> |
| </th> |
| <th> |
| <p> |
| g |
| </p> |
| </th> |
| <th> |
| <p> |
| Max Error |
| </p> |
| </th> |
| </tr></thead> |
| <tbody> |
| <tr> |
| <td> |
| <p> |
| 24 |
| </p> |
| </td> |
| <td> |
| <p> |
| 6 |
| </p> |
| </td> |
| <td> |
| <p> |
| 5.581 |
| </p> |
| </td> |
| <td> |
| <p> |
| 9.51e-12 |
| </p> |
| </td> |
| </tr> |
| <tr> |
| <td> |
| <p> |
| 53 |
| </p> |
| </td> |
| <td> |
| <p> |
| 13 |
| </p> |
| </td> |
| <td> |
| <p> |
| 13.144565 |
| </p> |
| </td> |
| <td> |
| <p> |
| 9.2213e-23 |
| </p> |
| </td> |
| </tr> |
| </tbody> |
| </table></div> |
| </div> |
| <br class="table-break"><p> |
| The alternative described by <a class="link" href="lanczos.html#godfrey">Godfrey</a> is to |
| perform an exhaustive search of the <span class="emphasis"><em>N</em></span> and <span class="emphasis"><em>g</em></span> |
| parameter space to determine the optimal combination for a given <span class="emphasis"><em>p</em></span> |
| digit floating-point type. Repeating this work found a good approximation |
| for double precision arithmetic (close to the one <a class="link" href="lanczos.html#godfrey">Godfrey</a> |
| found), but failed to find really good approximations for 80 or 128-bit long |
| doubles. Further it was observed that the approximations obtained tended |
| to optimised for the small values of z (1 < z < 200) used to test the |
| implementation against the factorials. Computing ratios of gamma functions |
| with large arguments were observed to suffer from error resulting from the |
| truncation of the Lancozos series. |
| </p> |
| <p> |
| <a class="link" href="lanczos.html#pugh">Pugh</a> identified all the locations where the theoretical |
| error of the approximation were at a minimum, but unfortunately has published |
| only the largest of these minima. However, he makes the observation that |
| the minima coincide closely with the location where the first neglected term |
| (a<sub>N</sub>) in the Lanczos series S<sub>g</sub>(z) changes sign. These locations are quite |
| easy to locate, albeit with considerable computer time. These "sweet |
| spots" need only be computed once, tabulated, and then searched when |
| required for an approximation that delivers the required precision for some |
| fixed precision type. |
| </p> |
| <p> |
| Unfortunately, following this path failed to find a really good approximation |
| for 128-bit long doubles, and those found for 64 and 80-bit reals required |
| an excessive number of terms. There are two competing issues here: high precision |
| requires a large value of <span class="emphasis"><em>g</em></span>, but avoiding cancellation |
| errors in the evaluation requires a small <span class="emphasis"><em>g</em></span>. |
| </p> |
| <p> |
| At this point note that the Lanczos sum can be converted into rational form |
| (a ratio of two polynomials, obtained from the partial-fraction form using |
| polynomial arithmetic), and doing so changes the coefficients so that <span class="emphasis"><em>they |
| are all positive</em></span>. That means that the sum in rational form can |
| be evaluated without cancellation error, albeit with double the number of |
| coefficients for a given N. Repeating the search of the "sweet spots", |
| this time evaluating the Lanczos sum in rational form, and testing only those |
| "sweet spots" whose theoretical error is less than the machine |
| epsilon for the type being tested, yielded good approximations for all the |
| types tested. The optimal values found were quite close to the best cases |
| reported by <a class="link" href="lanczos.html#pugh">Pugh</a> (just slightly larger <span class="emphasis"><em>N</em></span> |
| and slightly smaller <span class="emphasis"><em>g</em></span> for a given precision than <a class="link" href="lanczos.html#pugh">Pugh</a> reports), and even though converting to rational |
| form doubles the number of stored coefficients, it should be noted that half |
| of them are integers (and therefore require less storage space) and the approximations |
| require a smaller <span class="emphasis"><em>N</em></span> than would otherwise be required, |
| so fewer floating point operations may be required overall. |
| </p> |
| <p> |
| The following table shows the optimal values for <span class="emphasis"><em>N</em></span> and |
| <span class="emphasis"><em>g</em></span> when computing at fixed precision. These should be |
| taken as work in progress: there are no values for 106-bit significand machines |
| (Darwin long doubles & NTL quad_float), and further optimisation of the |
| values of <span class="emphasis"><em>g</em></span> may be possible. Errors given in the table |
| are estimates of the error due to truncation of the Lanczos infinite series |
| to <span class="emphasis"><em>N</em></span> terms. They are calculated from the sum of the |
| first five neglected terms - and are known to be rather pessimistic estimates |
| - although it is noticeable that the best combinations of <span class="emphasis"><em>N</em></span> |
| and <span class="emphasis"><em>g</em></span> occurred when the estimated truncation error almost |
| exactly matches the machine epsilon for the type in question. |
| </p> |
| <div class="table"> |
| <a name="math_toolkit.backgrounders.lanczos.optimum_value_for_n_and_g_when_computing_at_fixed_precision"></a><p class="title"><b>Table 54. Optimum value for N and g when computing at fixed precision</b></p> |
| <div class="table-contents"><table class="table" summary="Optimum value for N and g when computing at fixed precision"> |
| <colgroup> |
| <col> |
| <col> |
| <col> |
| <col> |
| <col> |
| </colgroup> |
| <thead><tr> |
| <th> |
| <p> |
| Significand Size |
| </p> |
| </th> |
| <th> |
| <p> |
| Platform/Compiler Used |
| </p> |
| </th> |
| <th> |
| <p> |
| N |
| </p> |
| </th> |
| <th> |
| <p> |
| g |
| </p> |
| </th> |
| <th> |
| <p> |
| Max Truncation Error |
| </p> |
| </th> |
| </tr></thead> |
| <tbody> |
| <tr> |
| <td> |
| <p> |
| 24 |
| </p> |
| </td> |
| <td> |
| <p> |
| Win32, VC++ 7.1 |
| </p> |
| </td> |
| <td> |
| <p> |
| 6 |
| </p> |
| </td> |
| <td> |
| <p> |
| 1.428456135094165802001953125 |
| </p> |
| </td> |
| <td> |
| <p> |
| 9.41e-007 |
| </p> |
| </td> |
| </tr> |
| <tr> |
| <td> |
| <p> |
| 53 |
| </p> |
| </td> |
| <td> |
| <p> |
| Win32, VC++ 7.1 |
| </p> |
| </td> |
| <td> |
| <p> |
| 13 |
| </p> |
| </td> |
| <td> |
| <p> |
| 6.024680040776729583740234375 |
| </p> |
| </td> |
| <td> |
| <p> |
| 3.23e-016 |
| </p> |
| </td> |
| </tr> |
| <tr> |
| <td> |
| <p> |
| 64 |
| </p> |
| </td> |
| <td> |
| <p> |
| Suse Linux 9 IA64, gcc-3.3.3 |
| </p> |
| </td> |
| <td> |
| <p> |
| 17 |
| </p> |
| </td> |
| <td> |
| <p> |
| 12.2252227365970611572265625 |
| </p> |
| </td> |
| <td> |
| <p> |
| 2.34e-024 |
| </p> |
| </td> |
| </tr> |
| <tr> |
| <td> |
| <p> |
| 116 |
| </p> |
| </td> |
| <td> |
| <p> |
| HP Tru64 Unix 5.1B / Alpha, Compaq C++ V7.1-006 |
| </p> |
| </td> |
| <td> |
| <p> |
| 24 |
| </p> |
| </td> |
| <td> |
| <p> |
| 20.3209821879863739013671875 |
| </p> |
| </td> |
| <td> |
| <p> |
| 4.75e-035 |
| </p> |
| </td> |
| </tr> |
| </tbody> |
| </table></div> |
| </div> |
| <br class="table-break"><p> |
| Finally note that the Lanczos approximation can be written as follows by |
| removing a factor of exp(g) from the denominator, and then dividing all the |
| coefficients by exp(g): |
| </p> |
| <p> |
| <span class="inlinemediaobject"><img src="../../../equations/lanczos7.png"></span> |
| </p> |
| <p> |
| This form is more convenient for calculating lgamma, but for the gamma function |
| the division by <span class="emphasis"><em>e</em></span> turns a possibly exact quality into |
| an inexact value: this reduces accuracy in the common case that the input |
| is exact, and so isn't used for the gamma function. |
| </p> |
| <a name="math_toolkit.backgrounders.lanczos.references"></a><h5> |
| <a name="id1289104"></a> |
| <a class="link" href="lanczos.html#math_toolkit.backgrounders.lanczos.references">References</a> |
| </h5> |
| <a name="godfrey"></a><a name="pugh"></a><div class="orderedlist"><ol type="1"> |
| <li> |
| Paul Godfrey, <a href="http://my.fit.edu/~gabdo/gamma.txt" target="_top">"A |
| note on the computation of the convergent Lanczos complex Gamma approximation"</a>. |
| </li> |
| <li> |
| Glendon Ralph Pugh, <a href="http://bh0.physics.ubc.ca/People/matt/Doc/ThesesOthers/Phd/pugh.pdf" target="_top">"An |
| Analysis of the Lanczos Gamma Approximation"</a>, PhD Thesis |
| November 2004. |
| </li> |
| <li> |
| Viktor T. Toth, <a href="http://www.rskey.org/gamma.htm" target="_top">"Calculators |
| and the Gamma Function"</a>. |
| </li> |
| <li> |
| Mathworld, <a href="http://mathworld.wolfram.com/LanczosApproximation.html" target="_top">The |
| Lanczos Approximation</a>. |
| </li> |
| </ol></div> |
| </div> |
| <table xmlns:rev="http://www.cs.rpi.edu/~gregod/boost/tools/doc/revision" width="100%"><tr> |
| <td align="left"></td> |
| <td align="right"><div class="copyright-footer">Copyright © 2006 , 2007, 2008, 2009, 2010 John Maddock, Paul A. Bristow, |
| Hubert Holin, Xiaogang Zhang, Bruno Lalande, Johan Råde, Gautam Sewani and |
| Thijs van den Berg<p> |
| Distributed under the Boost Software License, Version 1.0. (See accompanying |
| file LICENSE_1_0.txt or copy at <a href="http://www.boost.org/LICENSE_1_0.txt" target="_top">http://www.boost.org/LICENSE_1_0.txt</a>) |
| </p> |
| </div></td> |
| </tr></table> |
| <hr> |
| <div class="spirit-nav"> |
| <a accesskey="p" href="relative_error.html"><img src="../../../../../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../backgrounders.html"><img src="../../../../../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../../index.html"><img src="../../../../../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="remez.html"><img src="../../../../../../../doc/src/images/next.png" alt="Next"></a> |
| </div> |
| </body> |
| </html> |