mathepi.com


Variance

     In general, the variance (true variance, population variance) is the expected squared deviation from the expected value. This quantity can be calculated from the probability distribution; for some distributions (heavy-tailed ones such as the Cauchy) it does not exist.
Let's calculate this for a Binomial random variable with N=50 trials and a success probability of 0.1. The expected value is 50 times 0.1, or 5.
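> w <- rbinom(10000,size=50,prob=0.1)   # 10000 draws from Binomial(50, 0.1)
> w1 <- (w - 50*0.1)^2                  # squared deviation from the expected value
> mean(w1)                              # simulated approximation
> 50*0.1*0.9                            # N*p*(1-p), for comparison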
What we did was to take a large sample, calculate each value's squared deviation from the expected value, and use the sample mean of these squared deviations to approximate their true expected value. It is possible to show that the expected squared deviation from the expected value for a binomial random variable is N*p*(1-p), which is 4.5 here, so the simulated value should be close to 4.5. This quantity is called the variance; thus the variance of the binomial is given by N*p*(1-p).
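The same quantity can also be computed directly from the probability distribution, with no simulation at all. The following is a minimal sketch for the same N=50 and p=0.1; the variable names (k, pk, ev, v) are illustrative and not part of the session above.

```r
# Exact variance of a Binomial(N, p), computed from its probability distribution
N <- 50
p <- 0.1
k  <- 0:N                              # every value the random variable can take
pk <- dbinom(k, size = N, prob = p)    # probability of each value

ev <- sum(k * pk)                      # expected value; agrees with N*p
v  <- sum((k - ev)^2 * pk)             # expected squared deviation from ev

ev
v
N * p * (1 - p)                        # closed form, for comparison
```

Up to floating-point rounding, v and N*p*(1-p) agree, whereas the simulated mean(w1) only approximates them.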
     Now, recall that the sample mean is the particular number that minimizes the sum of squared deviations for any one particular sample of data. So for any particular set of data, the mean squared deviation from the expected value is never smaller than the mean squared deviation from the sample mean, and the two are equal only when the sample mean happens to equal the expected value.
> xx <- rbinom(20,size=50,prob=0.1)
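> mean((xx-50*0.1)^2)            # mean squared deviation from the true expected value
> mean((xx-mean(xx))^2)          # from the sample mean, dividing by n
> sum((xx-mean(xx))^2)/(20-1)    # from the sample mean, dividing by n-1
> var(xx)                        # var() also divides by n-1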
So if I wanted to estimate the expected squared deviation from the expected value, but used the sample mean of the data in place of the true expected value, I would get numbers that are, on average, too small.
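The minimizing property of the sample mean can be checked on a single sample. This is only a sketch: the seed is arbitrary, and msd() is a hypothetical helper introduced here for illustration.

```r
# For one sample, the mean squared deviation about a centre is
# smallest when the centre is the sample mean.
set.seed(1)                            # arbitrary seed, for reproducibility only
xx <- rbinom(20, size = 50, prob = 0.1)

# hypothetical helper: mean squared deviation of xx about a given centre
msd <- function(centre) mean((xx - centre)^2)

msd(50 * 0.1)                          # about the true expected value
msd(mean(xx))                          # about the sample mean; never larger

# The gap between the two is the squared distance between the centres
msd(50 * 0.1) - msd(mean(xx))
(mean(xx) - 50 * 0.1)^2
```

The last two lines print the same number, which shows where the shortfall comes from: it is the squared distance between the sample mean and the true expected value.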
     The sample variance is defined to be the sum of the squared deviations from the sample mean of a data set, divided by n-1, where n is the number of observations. The division by n-1, rather than n, compensates for the fact that the numerator is a little too small because we use the sample mean rather than the true expected value. The sample variance is said to estimate the true variance.
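The effect of the divisor shows up when many small samples are averaged. The sketch below assumes samples of size n=20 from the same Binomial(50, 0.1); the seed, the number of repetitions, and the variable names are arbitrary choices made for illustration.

```r
# Compare dividing by n with dividing by n-1 across many small samples
set.seed(2)                            # arbitrary seed, for reproducibility only
n    <- 20                             # size of each sample
reps <- 20000                          # number of repeated samples

# one sample per row
samples <- matrix(rbinom(n * reps, size = 50, prob = 0.1), nrow = reps)

by_n1 <- apply(samples, 1, var)        # sum of squared deviations / (n-1)
by_n  <- by_n1 * (n - 1) / n           # the same sums divided by n instead

mean(by_n)                             # tends to land near 4.5*(n-1)/n = 4.275
mean(by_n1)                            # tends to land near the true variance 4.5
```

Any single sample can land above or below 4.5 with either divisor. The averages differ: dividing by n falls short of the true variance, while dividing by n-1 does not.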
     The variance is a measure of the amount of variability in a random quantity. If the variance is zero, then the random quantity must (with probability 1) always have the same value.
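A simple case of zero variance is a binomial with success probability 1, where every trial succeeds. This is a small sketch with an illustrative variable name z:

```r
# With success probability 1, every draw equals the number of trials
z <- rbinom(10, size = 50, prob = 1)
z
50 * 1 * (1 - 1)                       # true variance N*p*(1-p)
var(z)                                 # sample variance of the constant values
```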
 

 


All content © 2000 Mathepi.Com (except R and Rweb).