blob: cf071c264234a1ef908b6b21c4810936486c2037 [file] [log] [blame]
[section:cs_eg Chi Squared Distribution Examples]
[section:chi_sq_intervals Confidence Intervals on the Standard Deviation]
Once you have calculated the standard deviation for your data, a legitimate
question to ask is "How reliable is the calculated standard deviation?".
For this situation the Chi Squared distribution can be used to calculate
confidence intervals for the standard deviation.
The full example code & sample output is in
[@../../../example/chi_square_std_dev_test.cpp chi_square_std_deviation_test.cpp].
We'll begin by defining the procedure that will calculate and print out the
confidence intervals:
void confidence_limits_on_std_deviation(
double Sd, // Sample Standard Deviation
unsigned N) // Sample size
{
We'll begin by printing out some general information:
cout <<
"________________________________________________\n"
"2-Sided Confidence Limits For Standard Deviation\n"
"________________________________________________\n\n";
cout << setprecision(7);
cout << setw(40) << left << "Number of Observations" << "= " << N << "\n";
cout << setw(40) << left << "Standard Deviation" << "= " << Sd << "\n";
and then define a table of significance levels for which we'll calculate
intervals:
double alpha[] = { 0.5, 0.25, 0.1, 0.05, 0.01, 0.001, 0.0001, 0.00001 };
The distribution we'll need to calculate the confidence intervals is a
Chi Squared distribution, with N-1 degrees of freedom:
chi_squared dist(N - 1);
For each value of alpha, the formula for the confidence interval is given by:
[equation chi_squ_tut1]
Where [equation chi_squ_tut2] is the upper critical value, and
[equation chi_squ_tut3] is the lower critical value of the
Chi Squared distribution.
In code we begin by printing out a table header:
cout << "\n\n"
"_____________________________________________\n"
"Confidence Lower Upper\n"
" Value (%) Limit Limit\n"
"_____________________________________________\n";
and then loop over the values of alpha and calculate the intervals
for each: remember that the lower critical value is the same as the
quantile, and the upper critical value is the same as the quantile
from the complement of the probability:
for(unsigned i = 0; i < sizeof(alpha)/sizeof(alpha[0]); ++i)
{
// Confidence value:
cout << fixed << setprecision(3) << setw(10) << right << 100 * (1-alpha[i]);
// Calculate limits:
double lower_limit = sqrt((N - 1) * Sd * Sd / quantile(complement(dist, alpha[i] / 2)));
double upper_limit = sqrt((N - 1) * Sd * Sd / quantile(dist, alpha[i] / 2));
// Print Limits:
cout << fixed << setprecision(5) << setw(15) << right << lower_limit;
cout << fixed << setprecision(5) << setw(15) << right << upper_limit << endl;
}
cout << endl;
To see some example output we'll use the
[@http://www.itl.nist.gov/div898/handbook/eda/section3/eda3581.htm
gear data] from the __handbook.
The data represents measurements of gear diameter from a manufacturing
process.
[pre'''
________________________________________________
2-Sided Confidence Limits For Standard Deviation
________________________________________________
Number of Observations = 100
Standard Deviation = 0.006278908
_____________________________________________
Confidence Lower Upper
Value (%) Limit Limit
_____________________________________________
50.000 0.00601 0.00662
75.000 0.00582 0.00685
90.000 0.00563 0.00712
95.000 0.00551 0.00729
99.000 0.00530 0.00766
99.900 0.00507 0.00812
99.990 0.00489 0.00855
99.999 0.00474 0.00895
''']
So at the 95% confidence level we conclude that the standard deviation
is between 0.00551 and 0.00729.
[h4 Confidence intervals as a function of the number of observations]
Similarly, we can also list the confidence intervals for the standard deviation
for the common confidence levels 95%, for increasing numbers of observations.
The standard deviation used to compute these values is unity,
so the limits listed are *multipliers* for any particular standard deviation.
For example, given a standard deviation of 0.0062789 as in the example
above; for 100 observations the multiplier is 0.8780
giving the lower confidence limit of 0.8780 * 0.006728 = 0.00551.
[pre'''
____________________________________________________
Confidence level (two-sided) = 0.0500000
Standard Deviation = 1.0000000
________________________________________
Observations Lower Upper
Limit Limit
________________________________________
2 0.4461 31.9102
3 0.5207 6.2847
4 0.5665 3.7285
5 0.5991 2.8736
6 0.6242 2.4526
7 0.6444 2.2021
8 0.6612 2.0353
9 0.6755 1.9158
10 0.6878 1.8256
15 0.7321 1.5771
20 0.7605 1.4606
30 0.7964 1.3443
40 0.8192 1.2840
50 0.8353 1.2461
60 0.8476 1.2197
100 0.8780 1.1617
120 0.8875 1.1454
1000 0.9580 1.0459
10000 0.9863 1.0141
50000 0.9938 1.0062
100000 0.9956 1.0044
1000000 0.9986 1.0014
''']
With just 2 observations the limits are from *0.445* up to to *31.9*,
so the standard deviation might be about *half*
the observed value up to *30 times* the observed value!
Estimating a standard deviation with just a handful of values leaves a very great uncertainty,
especially the upper limit.
Note especially how far the upper limit is skewed from the most likely standard deviation.
Even for 10 observations, normally considered a reasonable number,
the range is still from 0.69 to 1.8, about a range of 0.7 to 2,
and is still highly skewed with an upper limit *twice* the median.
When we have 1000 observations, the estimate of the standard deviation is starting to look convincing,
with a range from 0.95 to 1.05 - now near symmetrical, but still about + or - 5%.
Only when we have 10000 or more repeated observations can we start to be reasonably confident
(provided we are sure that other factors like drift are not creeping in).
For 10000 observations, the interval is 0.99 to 1.1 - finally a really convincing + or -1% confidence.
[endsect][/section:chi_sq_intervals Confidence Intervals on the Standard Deviation]
[section:chi_sq_test Chi-Square Test for the Standard Deviation]
We use this test to determine whether the standard deviation of a sample
differs from a specified value. Typically this occurs in process change
situations where we wish to compare the standard deviation of a new
process to an established one.
The code for this example is contained in
[@../../../example/chi_square_std_dev_test.cpp chi_square_std_dev_test.cpp], and
we'll begin by defining the procedure that will print out the test
statistics:
void chi_squared_test(
double Sd, // Sample std deviation
double D, // True std deviation
unsigned N, // Sample size
double alpha) // Significance level
{
The procedure begins by printing a summary of the input data:
using namespace std;
using namespace boost::math;
// Print header:
cout <<
"______________________________________________\n"
"Chi Squared test for sample standard deviation\n"
"______________________________________________\n\n";
cout << setprecision(5);
cout << setw(55) << left << "Number of Observations" << "= " << N << "\n";
cout << setw(55) << left << "Sample Standard Deviation" << "= " << Sd << "\n";
cout << setw(55) << left << "Expected True Standard Deviation" << "= " << D << "\n\n";
The test statistic (T) is simply the ratio of the sample and "true" standard
deviations squared, multiplied by the number of degrees of freedom (the
sample size less one):
double t_stat = (N - 1) * (Sd / D) * (Sd / D);
cout << setw(55) << left << "Test Statistic" << "= " << t_stat << "\n";
The distribution we need to use, is a Chi Squared distribution with N-1
degrees of freedom:
chi_squared dist(N - 1);
The various hypothesis that can be tested are summarised in the following table:
[table
[[Hypothesis][Test]]
[[The null-hypothesis: there is no difference in standard deviation from the specified value]
[ Reject if T < [chi][super 2][sub (1-alpha/2; N-1)] or T > [chi][super 2][sub (alpha/2; N-1)] ]]
[[The alternative hypothesis: there is a difference in standard deviation from the specified value]
[ Reject if [chi][super 2][sub (1-alpha/2; N-1)] >= T >= [chi][super 2][sub (alpha/2; N-1)] ]]
[[The alternative hypothesis: the standard deviation is less than the specified value]
[ Reject if [chi][super 2][sub (1-alpha; N-1)] <= T ]]
[[The alternative hypothesis: the standard deviation is greater than the specified value]
[ Reject if [chi][super 2][sub (alpha; N-1)] >= T ]]
]
Where [chi][super 2][sub (alpha; N-1)] is the upper critical value of the
Chi Squared distribution, and [chi][super 2][sub (1-alpha; N-1)] is the
lower critical value.
Recall that the lower critical value is the same
as the quantile, and the upper critical value is the same as the quantile
from the complement of the probability, that gives us the following code
to calculate the critical values:
double ucv = quantile(complement(dist, alpha));
double ucv2 = quantile(complement(dist, alpha / 2));
double lcv = quantile(dist, alpha);
double lcv2 = quantile(dist, alpha / 2);
cout << setw(55) << left << "Upper Critical Value at alpha: " << "= "
<< setprecision(3) << scientific << ucv << "\n";
cout << setw(55) << left << "Upper Critical Value at alpha/2: " << "= "
<< setprecision(3) << scientific << ucv2 << "\n";
cout << setw(55) << left << "Lower Critical Value at alpha: " << "= "
<< setprecision(3) << scientific << lcv << "\n";
cout << setw(55) << left << "Lower Critical Value at alpha/2: " << "= "
<< setprecision(3) << scientific << lcv2 << "\n\n";
Now that we have the critical values, we can compare these to our test
statistic, and print out the result of each hypothesis and test:
cout << setw(55) << left <<
"Results for Alternative Hypothesis and alpha" << "= "
<< setprecision(4) << fixed << alpha << "\n\n";
cout << "Alternative Hypothesis Conclusion\n";
cout << "Standard Deviation != " << setprecision(3) << fixed << D << " ";
if((ucv2 < t_stat) || (lcv2 > t_stat))
cout << "ACCEPTED\n";
else
cout << "REJECTED\n";
cout << "Standard Deviation < " << setprecision(3) << fixed << D << " ";
if(lcv > t_stat)
cout << "ACCEPTED\n";
else
cout << "REJECTED\n";
cout << "Standard Deviation > " << setprecision(3) << fixed << D << " ";
if(ucv < t_stat)
cout << "ACCEPTED\n";
else
cout << "REJECTED\n";
cout << endl << endl;
To see some example output we'll use the
[@http://www.itl.nist.gov/div898/handbook/eda/section3/eda3581.htm
gear data] from the __handbook.
The data represents measurements of gear diameter from a manufacturing
process. The program output is deliberately designed to mirror
the DATAPLOT output shown in the
[@http://www.itl.nist.gov/div898/handbook/eda/section3/eda358.htm
NIST Handbook Example].
[pre'''
______________________________________________
Chi Squared test for sample standard deviation
______________________________________________
Number of Observations = 100
Sample Standard Deviation = 0.00628
Expected True Standard Deviation = 0.10000
Test Statistic = 0.39030
CDF of test statistic: = 1.438e-099
Upper Critical Value at alpha: = 1.232e+002
Upper Critical Value at alpha/2: = 1.284e+002
Lower Critical Value at alpha: = 7.705e+001
Lower Critical Value at alpha/2: = 7.336e+001
Results for Alternative Hypothesis and alpha = 0.0500
Alternative Hypothesis Conclusion'''
Standard Deviation != 0.100 ACCEPTED
Standard Deviation < 0.100 ACCEPTED
Standard Deviation > 0.100 REJECTED
]
In this case we are testing whether the sample standard deviation is 0.1,
and the null-hypothesis is rejected, so we conclude that the standard
deviation ['is not] 0.1.
For an alternative example, consider the
[@http://www.itl.nist.gov/div898/handbook/prc/section2/prc23.htm
silicon wafer data] again from the __handbook.
In this scenario a supplier of 100 ohm.cm silicon wafers claims
that his fabrication process can produce wafers with sufficient
consistency so that the standard deviation of resistivity for
the lot does not exceed 10 ohm.cm. A sample of N = 10 wafers taken
from the lot has a standard deviation of 13.97 ohm.cm, and the question
we ask ourselves is "Is the suppliers claim correct?".
The program output now looks like this:
[pre'''
______________________________________________
Chi Squared test for sample standard deviation
______________________________________________
Number of Observations = 10
Sample Standard Deviation = 13.97000
Expected True Standard Deviation = 10.00000
Test Statistic = 17.56448
CDF of test statistic: = 9.594e-001
Upper Critical Value at alpha: = 1.692e+001
Upper Critical Value at alpha/2: = 1.902e+001
Lower Critical Value at alpha: = 3.325e+000
Lower Critical Value at alpha/2: = 2.700e+000
Results for Alternative Hypothesis and alpha = 0.0500
Alternative Hypothesis Conclusion'''
Standard Deviation != 10.000 REJECTED
Standard Deviation < 10.000 REJECTED
Standard Deviation > 10.000 ACCEPTED
]
In this case, our null-hypothesis is that the standard deviation of
the sample is less than 10: this hypothesis is rejected in the analysis
above, and so we reject the manufacturers claim.
[endsect][/section:chi_sq_test Chi-Square Test for the Standard Deviation]
[section:chi_sq_size Estimating the Required Sample Sizes for a Chi-Square Test for the Standard Deviation]
Suppose we conduct a Chi Squared test for standard deviation and the result
is borderline, a legitimate question to ask is "How large would the sample size
have to be in order to produce a definitive result?"
The class template [link math_toolkit.dist.dist_ref.dists.chi_squared_dist
chi_squared_distribution] has a static method
`find_degrees_of_freedom` that will calculate this value for
some acceptable risk of type I failure /alpha/, type II failure
/beta/, and difference from the standard deviation /diff/. Please
note that the method used works on variance, and not standard deviation
as is usual for the Chi Squared Test.
The code for this example is located in [@../../../example/chi_square_std_dev_test.cpp
chi_square_std_dev_test.cpp].
We begin by defining a procedure to print out the sample sizes required
for various risk levels:
void chi_squared_sample_sized(
double diff, // difference from variance to detect
double variance) // true variance
{
The procedure begins by printing out the input data:
using namespace std;
using namespace boost::math;
// Print out general info:
cout <<
"_____________________________________________________________\n"
"Estimated sample sizes required for various confidence levels\n"
"_____________________________________________________________\n\n";
cout << setprecision(5);
cout << setw(40) << left << "True Variance" << "= " << variance << "\n";
cout << setw(40) << left << "Difference to detect" << "= " << diff << "\n";
And defines a table of significance levels for which we'll calculate sample sizes:
double alpha[] = { 0.5, 0.25, 0.1, 0.05, 0.01, 0.001, 0.0001, 0.00001 };
For each value of alpha we can calculate two sample sizes: one where the
sample variance is less than the true value by /diff/ and one
where it is greater than the true value by /diff/. Thanks to the
asymmetric nature of the Chi Squared distribution these two values will
not be the same, the difference in their calculation differs only in the
sign of /diff/ that's passed to `find_degrees_of_freedom`. Finally
in this example we'll simply things, and let risk level /beta/ be the
same as /alpha/:
cout << "\n\n"
"_______________________________________________________________\n"
"Confidence Estimated Estimated\n"
" Value (%) Sample Size Sample Size\n"
" (lower one (upper one\n"
" sided test) sided test)\n"
"_______________________________________________________________\n";
//
// Now print out the data for the table rows.
//
for(unsigned i = 0; i < sizeof(alpha)/sizeof(alpha[0]); ++i)
{
// Confidence value:
cout << fixed << setprecision(3) << setw(10) << right << 100 * (1-alpha[i]);
// calculate df for a lower single sided test:
double df = chi_squared::find_degrees_of_freedom(
-diff, alpha[i], alpha[i], variance);
// convert to sample size:
double size = ceil(df) + 1;
// Print size:
cout << fixed << setprecision(0) << setw(16) << right << size;
// calculate df for an upper single sided test:
df = chi_squared::find_degrees_of_freedom(
diff, alpha[i], alpha[i], variance);
// convert to sample size:
size = ceil(df) + 1;
// Print size:
cout << fixed << setprecision(0) << setw(16) << right << size << endl;
}
cout << endl;
For some example output, consider the
[@http://www.itl.nist.gov/div898/handbook/prc/section2/prc23.htm
silicon wafer data] from the __handbook.
In this scenario a supplier of 100 ohm.cm silicon wafers claims
that his fabrication process can produce wafers with sufficient
consistency so that the standard deviation of resistivity for
the lot does not exceed 10 ohm.cm. A sample of N = 10 wafers taken
from the lot has a standard deviation of 13.97 ohm.cm, and the question
we ask ourselves is "How large would our sample have to be to reliably
detect this difference?".
To use our procedure above, we have to convert the
standard deviations to variance (square them),
after which the program output looks like this:
[pre'''
_____________________________________________________________
Estimated sample sizes required for various confidence levels
_____________________________________________________________
True Variance = 100.00000
Difference to detect = 95.16090
_______________________________________________________________
Confidence Estimated Estimated
Value (%) Sample Size Sample Size
(lower one (upper one
sided test) sided test)
_______________________________________________________________
50.000 2 2
75.000 2 10
90.000 4 32
95.000 5 51
99.000 7 99
99.900 11 174
99.990 15 251
99.999 20 330'''
]
In this case we are interested in a upper single sided test.
So for example, if the maximum acceptable risk of falsely rejecting
the null-hypothesis is 0.05 (Type I error), and the maximum acceptable
risk of failing to reject the null-hypothesis is also 0.05
(Type II error), we estimate that we would need a sample size of 51.
[endsect][/section:chi_sq_size Estimating the Required Sample Sizes for a Chi-Square Test for the Standard Deviation]
[endsect][/section:cs_eg Chi Squared Distribution]
[/
Copyright 2006 John Maddock and Paul A. Bristow.
Distributed under the Boost Software License, Version 1.0.
(See accompanying file LICENSE_1_0.txt or copy at
http://www.boost.org/LICENSE_1_0.txt).
]