\documentstyle[IEEEtran]{article}
\begin{document}
\tolerance 10000

\title{
Self-validating Computation for Selected Probability
Distribution Functions}
\author{Morgan C. Wang$^1$, Kurt Lin$^2$, and William J. Kennedy$^3$}

\pagestyle{myheadings}

\markright{APIC'95, El Paso,
Extended Abstracts,
A Supplement to the international journal of {\rm Reliable
Computing}\ \ \ \ \ \ \ \ \ \ \  \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ }

\maketitle

\auffil{The authors are with 
$^1$Department of Statistics, University of Central Florida,
P. O. Box 162370, Orlando, Florida 32816-2370,
e-mail: cwang@pegasus.cc.ucf.edu, with the
$^2$Institute for Simulation and Training, University of Central
Florida,
3280 Progress Drive, Orlando, FL 32826,
e-mail: klin@pegasus.cc.ucf.edu,
and with the 
$^3$Department of Statistics, Iowa State University,
Ames, Iowa 50011.
}

\maketitle

\begin{abstract}Self-validating numerical methods based upon interval
analysis for the
computation of selected probability distribution functions are
reported.
\end{abstract}

\section{In Statistical Computing, There Is a Need for Guaranteed
Accuracy}
        The need for guaranteed accuracy within stated limits
arises
frequently when computing probabilities and percentiles.  Let us give
two examples:
\begin{itemize}

\item When comparing competing scalar algorithms to see which
yields greater accuracy, or when evaluating a new algorithm, a
reliable source of essentially true values is needed.  Existing
tables usually do not provide sufficiently accurate entries, or
do not cover a sufficiently large region of the variable and
parameter spaces, to be satisfactory for this application.

\item Another
example occurs whenever a probability function enters as a
factor in an algebraic expression which must be evaluated,
possibly for purposes of tabling.  Accuracy in the end result
will depend in part on the level of accuracy of the computed
probability.  
\begin{itemize}

\item[]For example, the distribution of the sample
correlation coefficient, $r$, from a bivariate normal population
with correlation $\rho$, has the central $F$ cumulative distribution
function (CDF) as a factor.  The accuracy of the approximated CDF
values of the sample correlation coefficient therefore
depends in part on the level of accuracy achieved in the
approximation of the central $F$ CDF \cite{Wang1995}.
\end{itemize}
\end{itemize}

\section{Conventional Methods of Error Analysis Are Not Sufficient}
Conventional methods of statistical computing, which use
scalar computation and produce scalar approximations to the desired
statistical characteristics, do not
provide error bounds for the computed values, i.e., they do
not provide reliability information directly.  

Although
reliability information for conventional methods is theoretically
available, it can be obtained only through extensive error
analysis.  Even then, the
reliability information is not numerically available at
run time, and error bounds from classical error analysis tend to
be very conservative.  Therefore, it is difficult to locate the
region of the variable and parameter space in which these scalar
algorithms produce accuracy within a specified level.
 
\section{Self-validating Numerical Methods Based Upon Interval
Analysis}

In this paper, we report 
self-validating numerical methods based upon interval
analysis for the
computation of selected probability distribution functions.

Self-validating computation can be achieved in many different
ways.  We employed {\it interval analysis} to achieve the goal of
self-validation.  To obtain guaranteed error bounds for the computed
results, we used the Hausdorff distance between two intervals
$A = [a, b]$
and $B = [c, d]$, defined as $\max(|a - c|, |b - d|)$.
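
As a small numerical illustration of this definition (the values are
hypothetical and chosen only for illustration), take
$A = [0.10, 0.30]$ and $B = [0.20, 0.35]$.  Then
\[
\max(|0.10 - 0.20|, |0.30 - 0.35|) = \max(0.10, 0.05) = 0.10 .
\]
When both intervals are thin, say $A = [x, x]$ and $B = [y, y]$, the
distance reduces to the ordinary absolute difference $|x - y|$.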
 
        In this approach, the input data can be taken as either
\begin{itemize}
\item {\it thin intervals}
(numbers), or 
\item {\it thick intervals} (intervals which have width greater
than zero).
\end{itemize}

{\it Rounded interval arithmetic} \cite{Alefeld1983} needs
to be used when
implementing these numerical methods on a digital computer, which
has finite
precision.  
Thus:
\begin {itemize}
\item When the input data are {\it thin intervals}, this
approach
produces a rounded interval which is guaranteed to contain the
theoretically
correct value of the desired probability or percentile.  Since
the Hausdorff
distance between the midpoint of the computed interval and the true
value is at most
the half-width of the computed interval, the half-width
can serve as a guaranteed absolute error bound for this midpoint
approximation.  
\item When the input data are {\it thick intervals}, this
approach produces
a rounded interval which is guaranteed to contain the
theoretically correct
interval.  Since the Hausdorff distance between the computed
interval and the
theoretically correct interval is at most the width of the
computed interval, the width can be used as a
guaranteed error bound for this interval approximation.
\end{itemize}
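
The two bounds above can be sketched as follows.  The symbols
$\ell$, $u$, $m$, $h$, $p$, $p_1$ and $p_2$ are notation introduced here
only for illustration.  Suppose the computed interval is $[\ell, u]$, with
midpoint $m = (\ell + u)/2$ and half-width $h = (u - \ell)/2$.  If the true
value $p$ satisfies $\ell \le p \le u$, then
\[
|m - p| \le \max(m - \ell,\, u - m) = h .
\]
If instead the theoretically correct result is an interval
$[p_1, p_2] \subseteq [\ell, u]$, then
$0 \le p_1 - \ell \le u - \ell$ and $0 \le u - p_2 \le u - \ell$, so
\[
\max(|\ell - p_1|, |u - p_2|) \le u - \ell .
\]
For instance, a hypothetical computed enclosure $[0.974998, 0.975004]$
gives $m = 0.975001$ and $h = 0.000003$, so the midpoint would be
reported with a guaranteed absolute error of at most $3 \times 10^{-6}$.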
 
        The major advantage of this approach is the additional
information,
provided by the guaranteed error bounds, about the reliability
of the computed results.  
 
\section{Results and Conclusions}
We have applied this method to statistical computations related to the
following 
probability distribution functions:
\begin{itemize}
\item univariate
normal, 
\item central and non-central chi-square, 
\item central and
non-central $F$, 
\item bivariate normal \cite{Wang1990},
\item multivariate normal \cite{Wang1992}, and 
\item the distribution
of the
sample correlation coefficient.
\end{itemize}
The methods suggested in this paper have been implemented
and extensively tested on both IBM PC and Sun workstations.

Excellent results were obtained over very large regions of the
variable and parameter space for every distribution function
considered.  When a failure occurred, an excessively large
interval resulted, which served as notification of the failure.  

The
methods are not completely fail-safe, because the results are
not valid if floating-point underflow or overflow occurs.
However, underflows and overflows can be detected, so this
methodology provides a large measure of dependability.

The authors believe that self-validating computations should be
performed more frequently in statistical computing.

\begin{thebibliography}{99}

\bibitem{Alefeld1983}G. Alefeld and J. Herzberger, 
{\bf Introduction to Interval Computations}, Academic Press, N.Y., 1983.

\bibitem{Corliss1988}G. F. Corliss, ``Application of
Differentiation Arithmetic'', In: {\bf Reliability in Computing: The Role
of Interval Methods in Scientific Computing}, Academic Press, N.Y., 1988.

\bibitem{Corliss1990}G. F. Corliss, ``Industrial
Applications of Interval Techniques'', In: C. Ullrich (ed.), 
{\bf Computer Arithmetic and
Self-Validating Numerical Methods}, Academic Press, N.Y., 1990.

\bibitem{Corliss1987}G. F. Corliss and L. B. Rall,
``Adaptive, Self-Validating Numerical Quadrature'',
{\it SIAM Journal on Scientific and Statistical Computing}, 1987, Vol. 8,
pp. 831--847.

\bibitem{Rall1981}L. B. Rall, {\bf Automatic
Differentiation: Techniques and Applications}, Lecture Notes in
Computer Science No. 120, Springer-Verlag, New York, 1981.

\bibitem{Wang1995}M. C. Wang and W. J. Kennedy,
``A Self-Validating Numerical Method for
Computation of Central and Non-Central F Probabilities and
Percentiles'', {\it Statistics and Computing}, 1995 (to appear).

\bibitem{Wang1994}M. C. Wang and W. J. Kennedy, 
``Self-Validating Computations of Probabilities for Selected
Central and Non-central Univariate Probability Functions'',
{\it Journal of the American Statistical Association}, 1994, Vol. 89, pp.
878--887.

\bibitem{Wang1992}M. C. Wang and W. J. Kennedy, 
``A Numerical Method for Accurately Approximating
Multivariate Normal Probabilities'', {\it Computational Statistics and
Data Analysis}, 1992, Vol. 13, pp. 197--210.

\bibitem{Wang1990}M. C. Wang and W. J. Kennedy, 
``Comparison of Algorithms for Bivariate Normal Probability Over a
Rectangle Based on Self-Validating Results From Interval Analysis'',
{\it Journal of
Statistical Computation and Simulation}, 1990, Vol. 37, pp. 13--25.

\end{thebibliography}

\end{document}

  

