
# Confidence Intervals

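We call the unknown parameter $\theta$ and our estimate $\hat{\theta}$. Suppose that we had an ideal (unrealistic) situation in which we knew the distribution of $\hat{\theta}-\theta$. We are interested especially in its quantiles: denote the $\alpha/2$ quantile by $\underline{\delta}$ and the $1-\alpha/2$ quantile by $\bar{\delta}$.

By definition we have:

$$P(\hat{\theta}-\theta \le \underline{\delta}) = \alpha/2, \qquad P(\hat{\theta}-\theta \le \bar{\delta}) = 1-\alpha/2,$$

so that

$$P(\hat{\theta}-\bar{\delta} \le \theta < \hat{\theta}-\underline{\delta}) = 1-\alpha.$$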

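1. Wouldn't we get the same answer without centering with regard to $\hat{\theta}$?

What we called $\underline{\delta}$ and $\bar{\delta}$ were the ideal quantiles of $\hat{\theta}-\theta$ from which we build the confidence interval: $[\hat{\theta}-\bar{\delta},\ \hat{\theta}-\underline{\delta}]$. This is estimated by $[\hat{\theta}-\bar{\delta}^*,\ \hat{\theta}-\underline{\delta}^*]$, where $\underline{\delta}^*$ and $\bar{\delta}^*$ are the $\alpha/2$ and $1-\alpha/2$ quantiles of $\hat{\theta}^*-\hat{\theta}$. The $\hat{\theta}$'s do NOT cancel out: writing $q^*_p$ for the $p$th quantile of $\hat{\theta}^*$, we have $\bar{\delta}^* = q^*_{1-\alpha/2}-\hat{\theta}$, so the lower endpoint is $2\hat{\theta}-q^*_{1-\alpha/2}$ and the upper endpoint is $2\hat{\theta}-q^*_{\alpha/2}$.

2. Would these intervals be the same if we took the distribution of $\hat{\theta}^*$ to mimic simply that of the $\hat{\theta}$'s?

No! That choice gives $[q^*_{\alpha/2},\ q^*_{1-\alpha/2}]$, which coincides with the centered interval only when those two quantiles are placed symmetrically about $\hat{\theta}$.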
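To see the difference between the two answers concretely, here is a minimal sketch, assuming the statistic is the sample mean and using a small made-up skewed sample. The data vector `x`, the number of replicates `B` and the variable names are illustrative assumptions, not part of the original notes. Writing $q^*_p$ for the $p$th quantile of the bootstrap replicates, it builds the centered interval $[2\hat{\theta}-q^*_{1-\alpha/2},\ 2\hat{\theta}-q^*_{\alpha/2}]$ and the uncentered interval $[q^*_{\alpha/2},\ q^*_{1-\alpha/2}]$ from the same replicates.

```matlab
% Sketch only: compare the centered and uncentered bootstrap intervals
% for the mean of a small skewed sample.
% The data, B and alpha below are illustrative assumptions.
x = [0.2 0.3 0.5 0.6 0.9 1.1 1.4 2.0 3.7 8.5]';   % made-up skewed sample
n = length(x);
B = 2000;            % number of bootstrap replicates (assumption)
alpha = 0.05;
theta0 = mean(x);    % estimate computed on the original sample

thetab = zeros(B,1);
for b = 1:B
    indices = randi(n,1,n);          % n indices in 1..n, with replacement
    thetab(b) = mean(x(indices));    % bootstrap replicate of the estimate
end

% Quantiles q*_{alpha/2} and q*_{1-alpha/2} of the replicates
q = prctile(thetab,[100*alpha/2, 100*(1-alpha/2)]);

percentile_int = [q(1), q(2)];                     % uncentered
basic_int = [2*theta0 - q(2), 2*theta0 - q(1)];    % centered on theta0

% First row: uncentered interval, second row: centered interval
disp([percentile_int; basic_int])
```

With a right-skewed sample such as this one, the two intervals are mirror images of each other about `theta0`, so they generally do not coincide.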

## Studentized Confidence Intervals

### Correlation Coefficient Example

function out=corr(orig)
%Correlation coefficient
c1=corrcoef(orig);
out=c1(1,2);
%------------------------------------------
function interv=boott(orig,theta,B,sdB,alpha)
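%Studentized bootstrap confidence intervals
%orig:  n-by-p data matrix (one observation per row)
%theta: name of the function computing the statistic
%B:     number of outer bootstrap resamples
%sdB:   number of inner resamples used for each se*
%alpha: one minus the confidence level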
theta0=feval(theta,orig);
[n,p]=size(orig);
thetab=zeros(B,1);
sdstar=zeros(B,1);
thetas=zeros(B,1);

for b=1:B
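%Draw n row indices uniformly from 1..n, with replacement
%(randi replaces the obsolete Communications Toolbox function randint)
indices=randi(n,1,n);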
samp=orig(indices,:);
thetab(b)=feval(theta,samp);
%Compute the bootstrap se,se*
sdstar(b)=feval('btsd',samp,theta,n,sdB);

%Studentized statistic
thetas(b)=(thetab(b)-theta0)/sdstar(b);
end
se=sqrt(var(thetab));
Pct=100*(alpha/2);
lower=prctile(thetas,Pct);
upper=prctile(thetas,100-Pct);
interv=[(theta0-upper*se) (theta0 - lower*se)];
%----------------------------------------------
function out=btsd(orig,theta,n,NB)
%Compute the bootstrap estimate
%of the standard error of the estimator
%defined by the function theta
%NB number of bootstrap simulations
thetab=zeros(NB,1);
for b=(1:NB)
%Resample n row indices from 1..n, with replacement
indices=randi(n,1,n);
samp=orig(indices,:);
thetab(b)=feval(theta,samp);
end
out=sqrt(var(thetab));
%----------------------------------------------
>> boott(law15,'corr',1000,30,.05)
ans =
-0.4737    1.0137
%----------------------------------------------
>> boott(law15,'corr',2000,30,.05)
ans =
-0.2899    0.9801
%----------------------
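In formulas, and as far as can be read off the code above, `boott` computes the following. The notation ($\hat{\theta}^*_b$, $\widehat{se}^*_b$, $t^*_p$) is introduced here for illustration and is an assumption about the intended symbols.

```latex
\begin{aligned}
t^*_b &= \frac{\hat{\theta}^*_b - \hat{\theta}}{\widehat{se}^*_b},
        \qquad b = 1,\dots,B, \\
\widehat{se} &= \sqrt{\widehat{\operatorname{var}}\left(\hat{\theta}^*_1,\dots,\hat{\theta}^*_B\right)}, \\
\text{interval} &= \left[\hat{\theta} - t^*_{1-\alpha/2}\,\widehat{se},\ \
                         \hat{\theta} - t^*_{\alpha/2}\,\widehat{se}\right].
\end{aligned}
```

Here $\widehat{se}^*_b$ is the inner bootstrap standard error returned by `btsd` (from `sdB` resamples of the $b$th resample), and $t^*_p$ is the $p$th quantile of the $t^*_b$. Each call therefore evaluates the statistic about $B(1+\mathtt{sdB})$ times, roughly 31,000 times for `B=1000` and `sdB=30`. Note that the first interval printed above has an upper endpoint of 1.0137, which lies outside the range of a correlation coefficient.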

## Transformations of the parameter

function out=transcorr(orig)
%transformed correlation coefficient
c1=corrcoef(orig);
rho=c1(1,2);
out=.5*log((1+rho)/(1-rho));
>> transcorr(law15)
ans =
1.0362
>> tanh(1.03)
ans =
0.7739
>> boott(law15,'transcorr',100,30,.05)
ans =
-0.7841    1.7940
>> tanh(ans)
ans =
-0.6550    0.9462
>> boott(law15,'transcorr',1000,30,.05)
ans =
0.0473    1.7618
>> tanh(ans)
ans =
0.0473    0.9427
>> transcorr(law15)
ans =
1.0362
>> 2/sqrt(12)
ans =
0.5774
>> 1.0362 - 0.5774
ans =
0.4588
>> 1.0362 + 0.5774
ans =
1.6136
>> tanh([.4588 1.6136])
ans =
0.4291    0.9237
%%%%%%%%%%%%%%%True confidence Intervals%
>> prctile(res100k,[5 95])
ans =
0.5307    0.9041
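The middle of the session above (from `2/sqrt(12)` to `tanh([.4588 1.6136])`) appears to compute the classical normal-theory interval on the transformed scale. A sketch of that arithmetic follows, assuming `law15` holds $n=15$ observations (consistent with $\sqrt{12}=\sqrt{n-3}$) and that the multiplier 2 stands in for the normal quantile 1.96. The symbol $\phi$ for the transformed parameter is introduced here for illustration.

```latex
\begin{aligned}
\hat{\phi} &= \tfrac12 \log\frac{1+\hat{\rho}}{1-\hat{\rho}}
            = \tanh^{-1}(\hat{\rho}) = 1.0362, \\
\widehat{se}(\hat{\phi}) &\approx \frac{1}{\sqrt{n-3}} = \frac{1}{\sqrt{12}}, \\
\hat{\phi} \pm \frac{2}{\sqrt{12}} &= 1.0362 \pm 0.5774 = [0.4588,\ 1.6136], \\
\left[\tanh(0.4588),\ \tanh(1.6136)\right] &= [0.4291,\ 0.9237].
\end{aligned}
```

Because $\tanh$ is increasing and maps into $(-1,1)$, an interval built on the $\phi$ scale always transforms back to an interval inside the admissible range for $\rho$.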

### The delta method

Very often we have a new random variable $Y=g(X)$, a function of one or several other random variables $X$, and we want to find the expectation and variance of $Y$ when we know those of $X$. For a linear function this is easy; the next best thing is to give the best linear approximation to $g$, and this is done through the delta method.

#### One dimension

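We use a first order Taylor expansion of $Y=g(X)$ around $\mu=E(X)$:

$$Y = g(X) \approx g(\mu) + (X-\mu)\,g'(\mu).$$

Thus, with $\sigma^2=\mathrm{var}(X)$,

$$E(Y) \approx g(\mu), \qquad \mathrm{var}(Y) \approx g'(\mu)^2\,\sigma^2.$$

We know the first of these is not exact unless $g$ is linear. Using the Taylor expansion to second order:

$$Y \approx g(\mu) + (X-\mu)\,g'(\mu) + \tfrac12 (X-\mu)^2\,g''(\mu).$$

Taking expectations we get

$$E(Y) \approx g(\mu) + \tfrac12\,\sigma^2\,g''(\mu).$$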
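As a numerical illustration, here is a minimal sketch that checks the delta-method approximations $E(Y)\approx g(\mu)$, $E(Y)\approx g(\mu)+\tfrac12\sigma^2 g''(\mu)$ and $\mathrm{var}(Y)\approx g'(\mu)^2\sigma^2$ for the hypothetical choice $g(x)=x^2$ with $X$ normal. The values of `mu`, `sigma` and `N` are assumptions chosen for illustration. This case is convenient because the exact mean and variance of $X^2$ are known in closed form.

```matlab
% Sketch only: check the delta-method approximations for Y = g(X) = X^2
% with X ~ N(mu, sigma^2). mu, sigma and N are illustrative assumptions.
mu = 3; sigma = 0.5;

g   = @(x) x.^2;        % the transformation
dg  = @(x) 2*x;         % first derivative g'
d2g = @(x) 2 + 0*x;     % second derivative g'' (constant)

% Delta-method approximations
EY_first  = g(mu);                          % first order: E(Y) ~ g(mu)
EY_second = g(mu) + 0.5*sigma^2*d2g(mu);    % with second-order correction
VY_first  = dg(mu)^2 * sigma^2;             % var(Y) ~ g'(mu)^2 var(X)

% Exact moments of X^2 for a normal X
EY_exact = mu^2 + sigma^2;
VY_exact = 4*mu^2*sigma^2 + 2*sigma^4;

% Monte Carlo check
N = 100000;
y = g(mu + sigma*randn(N,1));
fprintf('E(Y):   first %.4f  second %.4f  exact %.4f  simulated %.4f\n', ...
        EY_first, EY_second, EY_exact, mean(y));
fprintf('var(Y): first %.4f  exact %.4f  simulated %.4f\n', ...
        VY_first, VY_exact, var(y));
```

Since $g''$ is constant here, the second-order formula reproduces the exact mean, while the first-order variance omits the $2\sigma^4$ term, which is small when $\sigma$ is small relative to $\mu$.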

Susan Holmes 2004-05-19