Monday, December 3, 2012

Confidence Intervals in Python



import math

import numpy as np
from scipy import stats

s = np.array([1, 2, 3, 4, 4, 4, 5, 5, 5, 5, 4, 4, 4, 6, 7, 8])
n, min_max, mean, var, skew, kurt = stats.describe(s)
std = math.sqrt(var)

# Note: stats.describe returns the sample variance,
# so std here is the sample standard deviation.
# To get population values, s.std() and s.var() will work.


# The location (loc) keyword specifies the mean.
# The scale (scale) keyword specifies the standard deviation.

# We will assume a normal distribution.
# The first argument is the central probability covered by the interval.
R = stats.norm.interval(0.05, loc=mean, scale=std)


>>> R
(4.33017855099411, 4.54482144900589)
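
The sample-versus-population distinction noted in the code above can be checked directly. This is a minimal sketch, assuming the default behaviour of stats.describe (ddof=1) and of NumPy's var/std (ddof=0); the names sample_var, sample_std and population_std are introduced here for illustration only.

```python
import math

import numpy as np
from scipy import stats

s = np.array([1, 2, 3, 4, 4, 4, 5, 5, 5, 5, 4, 4, 4, 6, 7, 8])

# stats.describe reports the sample variance (divides by n - 1).
sample_var = stats.describe(s).variance

# NumPy defaults to the population form (ddof=0); ddof=1 gives the sample form.
assert math.isclose(sample_var, s.var(ddof=1))
assert math.isclose(s.var(), sample_var * (len(s) - 1) / len(s))

sample_std = math.sqrt(sample_var)
population_std = s.std()
print(sample_std, population_std)
```

With only 16 values the two standard deviations differ noticeably, so it matters which one is passed as scale.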

4 comments:

  1. That gives you the 5% confidence interval, which is probably not what you were looking for. It should be:
    R = stats.norm.interval(0.95,loc=mean,scale=std)

  2. Actually, the above is also incorrect. For a confidence interval on the mean, the scale should be the standard error, std/sqrt(n):
    R = stats.norm.interval(0.95,loc=mean,scale=std/math.sqrt(len(s)))
    You could also use the same approach with the t-distribution,
    which is more appropriate because the number of values is small:
    R = stats.t.interval(0.95,len(s)-1,loc=mean,scale=std/math.sqrt(len(s)))

    Anyway, a neat way to calculate the confidence intervals...

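Putting the suggestions from the comments together, a self-contained sketch of a 95% confidence interval for the mean might look like the following. It assumes the data are roughly normal; the names sem, z_interval and t_interval are illustrative and not from the original post. The confidence level is passed positionally because the keyword name for that argument has changed between SciPy versions.

```python
import math

import numpy as np
from scipy import stats

s = np.array([1, 2, 3, 4, 4, 4, 5, 5, 5, 5, 4, 4, 4, 6, 7, 8])

n = len(s)
mean = s.mean()

# Standard error of the mean: sample standard deviation over sqrt(n).
sem = s.std(ddof=1) / math.sqrt(n)

# Normal approximation to the 95% interval for the mean.
z_interval = stats.norm.interval(0.95, loc=mean, scale=sem)

# Student's t with n - 1 degrees of freedom, usually preferred for small samples.
t_interval = stats.t.interval(0.95, n - 1, loc=mean, scale=sem)

print(z_interval)
print(t_interval)
```

The t-based interval comes out wider than the normal one, which reflects the extra uncertainty from estimating the standard deviation from only 16 values.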