| <html> |
| <head> |
| <meta http-equiv="Content-Type" content="text/html; charset=US-ASCII"> |
| <title>Calculating confidence intervals on the mean with the Students-t distribution</title> |
| <link rel="stylesheet" href="../../../../../../../../../../doc/src/boostbook.css" type="text/css"> |
| <meta name="generator" content="DocBook XSL Stylesheets V1.74.0"> |
| <link rel="home" href="../../../../../index.html" title="Math Toolkit"> |
| <link rel="up" href="../st_eg.html" title="Student's t Distribution Examples"> |
| <link rel="prev" href="../st_eg.html" title="Student's t Distribution Examples"> |
| <link rel="next" href="tut_mean_test.html" title='Testing a sample mean for difference from a "true" mean'> |
| </head> |
| <body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"> |
| <table cellpadding="2" width="100%"><tr> |
| <td valign="top"><img alt="Boost C++ Libraries" width="277" height="86" src="../../../../../../../../../../boost.png"></td> |
| <td align="center"><a href="../../../../../../../../../../index.html">Home</a></td> |
| <td align="center"><a href="../../../../../../../../../../libs/libraries.htm">Libraries</a></td> |
| <td align="center"><a href="http://www.boost.org/users/people.html">People</a></td> |
| <td align="center"><a href="http://www.boost.org/users/faq.html">FAQ</a></td> |
| <td align="center"><a href="../../../../../../../../../../more/index.htm">More</a></td> |
| </tr></table> |
| <hr> |
| <div class="spirit-nav"> |
| <a accesskey="p" href="../st_eg.html"><img src="../../../../../../../../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../st_eg.html"><img src="../../../../../../../../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../../../../../index.html"><img src="../../../../../../../../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="tut_mean_test.html"><img src="../../../../../../../../../../doc/src/images/next.png" alt="Next"></a> |
| </div> |
| <div class="section" lang="en"> |
| <div class="titlepage"><div><div><h6 class="title"> |
| <a name="math_toolkit.dist.stat_tut.weg.st_eg.tut_mean_intervals"></a><a class="link" href="tut_mean_intervals.html" title="Calculating confidence intervals on the mean with the Students-t distribution"> |
| Calculating confidence intervals on the mean with the Students-t distribution</a> |
| </h6></div></div></div> |
| <p> |
| If you have a sample mean, you may wish to know what confidence |
| interval you can place on that mean. Colloquially: "I want an |
| interval that I can be P% sure contains the true mean". (On a |
| technical point, note that the interval either contains the true mean |
| or it does not: the meaning of the confidence level is subtly different |
| from this colloquialism. More background information can be found on |
| the <a href="http://www.itl.nist.gov/div898/handbook/eda/section3/eda352.htm" target="_top">NIST |
| site</a>.) |
| </p> |
| <p> |
| The formula for the interval can be expressed as: |
| </p> |
| <p> |
| <span class="inlinemediaobject"><img src="../../../../../../equations/dist_tutorial4.png"></span> |
| </p> |
| <p> |
| Where <span class="emphasis"><em>Y<sub>s</sub></em></span> is the sample mean, <span class="emphasis"><em>s</em></span> |
| is the sample standard deviation, <span class="emphasis"><em>N</em></span> is the sample |
| size, <span class="emphasis"><em>α</em></span> is the desired significance level |
| and <span class="emphasis"><em>t<sub>(α/2,N-1)</sub></em></span> is the upper critical value of the |
| Students-t distribution with <span class="emphasis"><em>N-1</em></span> degrees of freedom. |
| </p> |
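| <p> |
| Written out in LaTeX, the symbol definitions above correspond to the |
| expression below. This is a reconstruction from those definitions, assuming |
| the equation image shows the standard form; it is not copied from the image: |
| </p> |

```latex
\[
  Y_s \;\pm\; t_{(\alpha/2,\,N-1)} \, \frac{s}{\sqrt{N}}
\]
```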
| <div class="note"><table border="0" summary="Note"> |
| <tr> |
| <td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../../../../../../../../../doc/src/images/note.png"></td> |
| <th align="left">Note</th> |
| </tr> |
| <tr><td align="left" valign="top"> |
| <p> |
| The quantity α is the maximum acceptable risk of falsely rejecting |
| the null-hypothesis. The smaller the value of α the greater the strength |
| of the test. |
| </p> |
| <p> |
| The confidence level of the test is defined as 1 - α, and is often expressed |
| as a percentage. So, for example, a significance level of 0.05 is |
| equivalent to a 95% confidence level. Refer to <a href="http://www.itl.nist.gov/div898/handbook/prc/section1/prc14.htm" target="_top">"What |
| are confidence intervals?"</a> in the <a href="http://www.itl.nist.gov/div898/handbook/" target="_top">NIST/SEMATECH |
| e-Handbook of Statistical Methods</a> for more information. |
| </p> |
| </td></tr> |
| </table></div> |
| <div class="note"><table border="0" summary="Note"> |
| <tr> |
| <td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../../../../../../../../../doc/src/images/note.png"></td> |
| <th align="left">Note</th> |
| </tr> |
| <tr><td align="left" valign="top"><p> |
| The usual assumptions of <a href="http://en.wikipedia.org/wiki/Independent_and_identically-distributed_random_variables" target="_top">independent |
| and identically distributed (i.i.d.)</a> variables and <a href="http://en.wikipedia.org/wiki/Normal_distribution" target="_top">normal distribution</a> |
| of course apply here, as they do in other examples. |
| </p></td></tr> |
| </table></div> |
| <p> |
| From the formula, it should be clear that: |
| </p> |
| <div class="itemizedlist"><ul type="disc"> |
| <li> |
| The width of the confidence interval decreases as the sample size |
| increases. |
| </li> |
| <li> |
| The width increases as the standard deviation increases. |
| </li> |
| <li> |
| The width increases as the <span class="emphasis"><em>confidence level increases</em></span> |
| (0.5 towards 0.99999 - stronger). |
| </li> |
| <li> |
| The width increases as the <span class="emphasis"><em>significance level decreases</em></span> |
| (0.5 towards 0.00000...01 - stronger). |
| </li> |
| </ul></div> |
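| <p> |
| The first two relationships can be checked numerically with a small helper. |
| This is only a sketch: the name <code>interval_half_width</code> is hypothetical |
| and not part of the example program, and the critical value is passed in by the |
| caller (for instance from a printed t-table) rather than computed: |
| </p> |

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not part of the example program):
// half-width of the two-sided interval, t * s / sqrt(N).
// t_critical is supplied by the caller; this sketch does not compute it.
double interval_half_width(double t_critical, double sd, unsigned n)
{
    assert(n > 1);  // a sample standard deviation needs at least two observations
    return t_critical * sd / std::sqrt(static_cast<double>(n));
}
```

| <p> |
| With the critical value and standard deviation held fixed, quadrupling the |
| sample size halves the result, and doubling the standard deviation doubles it. |
| In practice the critical value itself also shrinks a little as the sample size |
| grows, so the real interval narrows slightly faster than this. |
| </p> |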
| <p> |
| The following example code is taken from the example program <a href="../../../../../../../../example/students_t_single_sample.cpp" target="_top">students_t_single_sample.cpp</a>. |
| </p> |
| <p> |
| We'll begin by defining a procedure to calculate intervals for various |
| confidence levels; the procedure will print these out as a table: |
| </p> |
| <pre class="programlisting"><span class="comment">// Needed includes: |
| </span><span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">math</span><span class="special">/</span><span class="identifier">distributions</span><span class="special">/</span><span class="identifier">students_t</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">></span> |
| <span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">iostream</span><span class="special">></span> |
| <span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">iomanip</span><span class="special">></span> |
| <span class="comment">// Bring everything into global namespace for ease of use: |
| </span><span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">math</span><span class="special">;</span> |
| <span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">std</span><span class="special">;</span> |
| |
| <span class="keyword">void</span> <span class="identifier">confidence_limits_on_mean</span><span class="special">(</span> |
| <span class="keyword">double</span> <span class="identifier">Sm</span><span class="special">,</span> <span class="comment">// Sm = Sample Mean. |
| </span> <span class="keyword">double</span> <span class="identifier">Sd</span><span class="special">,</span> <span class="comment">// Sd = Sample Standard Deviation. |
| </span> <span class="keyword">unsigned</span> <span class="identifier">Sn</span><span class="special">)</span> <span class="comment">// Sn = Sample Size. |
| </span><span class="special">{</span> |
| <span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">std</span><span class="special">;</span> |
| <span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">math</span><span class="special">;</span> |
| |
| <span class="comment">// Print out general info: |
| </span> <span class="identifier">cout</span> <span class="special"><<</span> |
| <span class="string">"__________________________________\n"</span> |
| <span class="string">"2-Sided Confidence Limits For Mean\n"</span> |
| <span class="string">"__________________________________\n\n"</span><span class="special">;</span> |
| <span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">setprecision</span><span class="special">(</span><span class="number">7</span><span class="special">);</span> |
| <span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">setw</span><span class="special">(</span><span class="number">40</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">left</span> <span class="special"><<</span> <span class="string">"Number of Observations"</span> <span class="special"><<</span> <span class="string">"= "</span> <span class="special"><<</span> <span class="identifier">Sn</span> <span class="special"><<</span> <span class="string">"\n"</span><span class="special">;</span> |
| <span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">setw</span><span class="special">(</span><span class="number">40</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">left</span> <span class="special"><<</span> <span class="string">"Mean"</span> <span class="special"><<</span> <span class="string">"= "</span> <span class="special"><<</span> <span class="identifier">Sm</span> <span class="special"><<</span> <span class="string">"\n"</span><span class="special">;</span> |
| <span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">setw</span><span class="special">(</span><span class="number">40</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">left</span> <span class="special"><<</span> <span class="string">"Standard Deviation"</span> <span class="special"><<</span> <span class="string">"= "</span> <span class="special"><<</span> <span class="identifier">Sd</span> <span class="special"><<</span> <span class="string">"\n"</span><span class="special">;</span> |
| </pre> |
| <p> |
| We'll define a table of significance/risk levels for which we'll compute |
| intervals: |
| </p> |
| <pre class="programlisting"><span class="keyword">double</span> <span class="identifier">alpha</span><span class="special">[]</span> <span class="special">=</span> <span class="special">{</span> <span class="number">0.5</span><span class="special">,</span> <span class="number">0.25</span><span class="special">,</span> <span class="number">0.1</span><span class="special">,</span> <span class="number">0.05</span><span class="special">,</span> <span class="number">0.01</span><span class="special">,</span> <span class="number">0.001</span><span class="special">,</span> <span class="number">0.0001</span><span class="special">,</span> <span class="number">0.00001</span> <span class="special">};</span> |
| </pre> |
| <p> |
| Note that these are the complements of the confidence/probability levels |
| (0.5, 0.75, 0.9 ... 0.99999). |
| </p> |
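| <p> |
| The relationship between the two scales is a one-line conversion. The sketch |
| below uses a hypothetical helper name, <code>confidence_percent</code>, that |
| does not appear in the example program: |
| </p> |

```cpp
#include <cassert>

// Hypothetical helper: convert a significance level alpha
// into the corresponding confidence level, expressed in percent.
double confidence_percent(double alpha)
{
    assert(alpha > 0.0 && alpha < 1.0);  // alpha is a probability strictly inside (0, 1)
    return 100.0 * (1.0 - alpha);
}
```

| <p> |
| For example, a significance level of 0.05 maps to a confidence level of 95%. |
| </p> |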
| <p> |
| Next we'll declare the distribution object we'll need. Note that the |
| <span class="emphasis"><em>degrees of freedom</em></span> parameter is the sample size |
| less one: |
| </p> |
| <pre class="programlisting"><span class="identifier">students_t</span> <span class="identifier">dist</span><span class="special">(</span><span class="identifier">Sn</span> <span class="special">-</span> <span class="number">1</span><span class="special">);</span> |
| </pre> |
| <p> |
| Most of what follows in the program is pretty printing, so let's focus |
| on the calculation of the interval. First we need the critical value of the t-statistic, |
| computed using the <span class="emphasis"><em>quantile</em></span> function and our significance |
| level. Note that since the significance levels are the complement of |
| the probability, we have to wrap the arguments in a call to <span class="emphasis"><em>complement(...)</em></span>: |
| </p> |
| <pre class="programlisting"><span class="keyword">double</span> <span class="identifier">T</span> <span class="special">=</span> <span class="identifier">quantile</span><span class="special">(</span><span class="identifier">complement</span><span class="special">(</span><span class="identifier">dist</span><span class="special">,</span> <span class="identifier">alpha</span><span class="special">[</span><span class="identifier">i</span><span class="special">]</span> <span class="special">/</span> <span class="number">2</span><span class="special">));</span> |
| </pre> |
| <p> |
| Note that alpha was divided by two, since we'll be calculating both |
| the upper and lower bounds: had we been interested in a single sided |
| interval then we would have omitted this step. |
| </p> |
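| <p> |
| For comparison, the two-sided and single sided cases can be written side by |
| side. The single sided forms below are the usual textbook ones and are not |
| taken from the example program; μ denotes the true mean, a symbol introduced |
| here for illustration: |
| </p> |

```latex
\begin{align*}
\text{two-sided:} \quad
  & Y_s - t_{(\alpha/2,\,N-1)}\,\frac{s}{\sqrt{N}} \;\le\; \mu \;\le\;
    Y_s + t_{(\alpha/2,\,N-1)}\,\frac{s}{\sqrt{N}} \\
\text{single sided (upper bound):} \quad
  & \mu \;\le\; Y_s + t_{(\alpha,\,N-1)}\,\frac{s}{\sqrt{N}} \\
\text{single sided (lower bound):} \quad
  & \mu \;\ge\; Y_s - t_{(\alpha,\,N-1)}\,\frac{s}{\sqrt{N}}
\end{align*}
```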
| <p> |
| Now to complete the picture, we'll get the (one-sided) width of the |
| interval from the t-statistic by multiplying by the standard deviation, |
| and dividing by the square root of the sample size: |
| </p> |
| <pre class="programlisting"><span class="keyword">double</span> <span class="identifier">w</span> <span class="special">=</span> <span class="identifier">T</span> <span class="special">*</span> <span class="identifier">Sd</span> <span class="special">/</span> <span class="identifier">sqrt</span><span class="special">(</span><span class="keyword">double</span><span class="special">(</span><span class="identifier">Sn</span><span class="special">));</span> |
| </pre> |
| <p> |
| The two-sided interval is then the sample mean plus and minus this |
| width. |
| </p> |
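| <p> |
| Put together, the last two steps might look like the sketch below. The struct |
| <code>ConfidenceLimits</code> and the function <code>two_sided_limits</code> are |
| hypothetical names, and the critical value <code>T</code> is taken as an |
| argument so that the sketch needs only the standard library: |
| </p> |

```cpp
#include <cassert>
#include <cmath>

// Hypothetical result type: lower and upper limits of the interval.
struct ConfidenceLimits
{
    double lower;
    double upper;
};

// Hypothetical helper: builds the two-sided interval from the sample mean Sm,
// sample standard deviation Sd, sample size Sn and a critical value T
// obtained elsewhere (for example from the quantile call shown earlier).
ConfidenceLimits two_sided_limits(double Sm, double Sd, unsigned Sn, double T)
{
    assert(Sn > 1);  // Sn - 1 degrees of freedom must be positive
    const double w = T * Sd / std::sqrt(static_cast<double>(Sn));  // (one-sided) width
    return ConfidenceLimits{ Sm - w, Sm + w };
}
```

| <p> |
| Calling <code>two_sided_limits(37.8, 0.964365, 3, 4.303)</code> gives limits of |
| roughly 35.40 and 40.20. |
| </p> |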
| <p> |
| Apart from some more pretty-printing, that completes the procedure. |
| </p> |
| <p> |
| Let's take a look at some sample output, first using the <a href="http://www.itl.nist.gov/div898/handbook/eda/section4/eda428.htm" target="_top">Heat |
| flow data</a> from the NIST site. The data set was collected by |
| Bob Zarr of NIST in January, 1990 from a heat flow meter calibration |
| and stability analysis. The corresponding dataplot output for this |
| test can be found in <a href="http://www.itl.nist.gov/div898/handbook/eda/section3/eda352.htm" target="_top">section |
| 3.5.2</a> of the <a href="http://www.itl.nist.gov/div898/handbook/" target="_top">NIST/SEMATECH |
| e-Handbook of Statistical Methods</a>. |
| </p> |
| <pre class="programlisting"> __________________________________ |
| 2-Sided Confidence Limits For Mean |
| __________________________________ |
| |
| Number of Observations = 195 |
| Mean = 9.26146 |
| Standard Deviation = 0.02278881 |
| |
| |
| ___________________________________________________________________ |
| Confidence T Interval Lower Upper |
| Value (%) Value Width Limit Limit |
| ___________________________________________________________________ |
| 50.000 0.676 1.103e-003 9.26036 9.26256 |
| 75.000 1.154 1.883e-003 9.25958 9.26334 |
| 90.000 1.653 2.697e-003 9.25876 9.26416 |
| 95.000 1.972 3.219e-003 9.25824 9.26468 |
| 99.000 2.601 4.245e-003 9.25721 9.26571 |
| 99.900 3.341 5.453e-003 9.25601 9.26691 |
| 99.990 3.973 6.484e-003 9.25498 9.26794 |
| 99.999 4.537 7.404e-003 9.25406 9.26886 |
| </pre> |
| <p> |
| As you can see, the large sample size (195) and small standard deviation |
| (0.023) have combined to give very small intervals; indeed, we can be |
| very confident that the true mean is close to 9.26. |
| </p> |
| <p> |
| For comparison, the next example data output is taken from <span class="emphasis"><em>P. K. Hou, |
| O. W. Lau &amp; M. C. Wong, Analyst (1983) vol. 108, p 64, and from |
| Statistics for Analytical Chemistry, 3rd ed. (1994), pp 54-55, J. C. |
| Miller and J. N. Miller, Ellis Horwood ISBN 0 13 0309907.</em></span> |
| The values result from the determination of mercury by cold-vapour |
| atomic absorption. |
| </p> |
| <pre class="programlisting"> __________________________________ |
| 2-Sided Confidence Limits For Mean |
| __________________________________ |
| |
| Number of Observations = 3 |
| Mean = 37.8000000 |
| Standard Deviation = 0.9643650 |
| |
| |
| ___________________________________________________________________ |
| Confidence T Interval Lower Upper |
| Value (%) Value Width Limit Limit |
| ___________________________________________________________________ |
| 50.000 0.816 0.455 37.34539 38.25461 |
| 75.000 1.604 0.893 36.90717 38.69283 |
| 90.000 2.920 1.626 36.17422 39.42578 |
| 95.000 4.303 2.396 35.40438 40.19562 |
| 99.000 9.925 5.526 32.27408 43.32592 |
| 99.900 31.599 17.594 20.20639 55.39361 |
| 99.990 99.992 55.673 -17.87346 93.47346 |
| 99.999 316.225 176.067 -138.26683 213.86683 |
| </pre> |
| <p> |
| This time, the fact that there are only three measurements leads to |
| much wider intervals; indeed, the intervals are so large that it's hard to |
| be very confident in the location of the mean. |
| </p> |
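| <p> |
| The mean and standard deviation passed to the procedure have to be computed |
| from the raw measurements first. A minimal sketch follows, using hypothetical |
| helper names. The three readings used in the usage note are an assumption: |
| they are not listed on this page, and were chosen because they reproduce the |
| mean (37.8) and standard deviation (0.9643650) printed above: |
| </p> |

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper: arithmetic mean of the observations.
double sample_mean(const std::vector<double>& x)
{
    assert(!x.empty());
    double sum = 0.0;
    for (double v : x)
        sum += v;
    return sum / static_cast<double>(x.size());
}

// Hypothetical helper: sample standard deviation with the N-1 divisor,
// matching the N-1 degrees of freedom used for the t distribution.
double sample_standard_deviation(const std::vector<double>& x)
{
    assert(x.size() > 1);
    const double m = sample_mean(x);
    double sum_sq = 0.0;
    for (double v : x)
        sum_sq += (v - m) * (v - m);
    return std::sqrt(sum_sq / static_cast<double>(x.size() - 1));
}
```

| <p> |
| With the assumed readings 38.9, 37.4 and 37.1, these helpers return a mean of |
| 37.8 and a standard deviation of about 0.964365. |
| </p> |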
| </div> |
| <table xmlns:rev="http://www.cs.rpi.edu/~gregod/boost/tools/doc/revision" width="100%"><tr> |
| <td align="left"></td> |
| <td align="right"><div class="copyright-footer">Copyright © 2006, 2007, 2008, 2009, 2010 John Maddock, Paul A. Bristow, |
| Hubert Holin, Xiaogang Zhang, Bruno Lalande, Johan Råde, Gautam Sewani and |
| Thijs van den Berg<p> |
| Distributed under the Boost Software License, Version 1.0. (See accompanying |
| file LICENSE_1_0.txt or copy at <a href="http://www.boost.org/LICENSE_1_0.txt" target="_top">http://www.boost.org/LICENSE_1_0.txt</a>) |
| </p> |
| </div></td> |
| </tr></table> |
| </body> |
| </html> |