Confidence Interval for a Sample Mean: A simulation

This app randomly samples N data points from a Normal Distribution. The user specifies N. Sample means are computed for each simulated sample.
App created by Bruce Dudek
Click your mouse on the plot to see an explanation of the graph.

Tools for Statistics Instruction using R and Shiny

Author: Bruce Dudek at the University at Albany.

Assistance In R coding was provided by Jason Bryer, University at Albany and Excelsior College.

Built using Shiny by Rstudio and R, the Statistical Programming Language.

The purpose of this app is to provide a visualization that aids in the proper conceptualization of confidence intervals. It is illustrated with confidence intervals for a sample mean. The app is not meant to be a stand alone method of fully understanding confidence intervals, but it can provide a useful accompaniment to initial introduction of the concept in a classroom setting, guided by the instructor.

Key aspects of the conceptualization are:

  1. Confidence intervals are centered on the observed sample mean.

  2. With simulation, we can show what happens when repeated samples are drawn from the same population distribution. The sample mean from these simulated samples will vary according to its own sampling distribution.

  3. Since confidence intervals are centered on the sample mean, these intervals also vary in the region of the Random Variable scale that they span.

  4. If a Confidence level of 95% is chosen, we expect approximately 95% of the simulated intervals to overlap the true location of the population mean.

  5. In our simulation, we have specified the true population mean so we can make this comparison to the “confidence” level. But in realistic analysis we don't know the true value of mu, but have some “confidence” about its location provided by the calculated CI. For example, we will know that with a CI of 95%, that 95% of the time, if we repeated the sample, our computed CI would overlap the true value of mu.

  6. When sigma (population SD) is known, then the confidence interval can be found using std normal Z deviates based on the CI level, IF WE CAN ASSUME THAT THE SAMPLING DISTRIBUTION OF THE MEAN IS NORMAL.

  7. If Sigma is known, the CI calculation defines an interval that is a number of standard errors above and below the sample mean: \(\bar{X}\pm {{Z}_{(1-CI)}}*{({{\sigma }_{X}}}/{\sqrt{N}}\;)\)

  8. If Sigma is unknown, the interval uses a critical t value instead of a std normal Z, it requires df, and an estimate of the standard error which uses the sample standard deviation: \(\bar{X}\pm {{t}_{(1-\text{CI),df}}}*{(s{{d}_{X}}}/{\sqrt{N}}\;)\)

This app was inspired by a type of plot found in Box, Hunter and Hunter (1978, “Statistics for Experimenters”, New York, J. Wiley) This type of plot has been dynamically implemented in a useful javascript applet from the Rossman/Chance collection and in an impressive D3 implementation by Kristoffer Magnusson. R/Shiny code to create a rudimentary version was created by Tyler Hunt and is available in a GitHub repository. The current app has partly modeled on Hunt's code, and has expanded the approach with several additional features.

Ver 1.0, June 28, 2017