MWT: Sequoia 200 11-12.

and two bigger projects (midterm and final): 60%

The main component of the projects will be Matlab or R/S-Plus functions that perform certain analyses and produce graphics. These functions should be emailed to me, and hard copies sent to the TAs.

- First part: due Monday, May 3rd, to be handed in in class.
- Completed project: due Friday, June 4, 2004 at 12:00 pm

Typically the project can be one of the following types, or a combination of elements from each.

- A case study, if you have some original data and a statistical problem that you want to solve with the bootstrap.
- A comparative study, in which you compare the performance of the bootstrap with other methods in different situations.
- Implementation of a new computational procedure. For instance, you could try to write a Gray code program with a clever update for statistics other than the correlation coefficient, or use a fancy variance-reducing technique for the Monte Carlo step, or improve on the empirical distribution as the estimator of the underlying distribution.
- A theoretical study on how fast the bootstrap estimate converges, and how to improve it.
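As a starting point for any of these project types, a minimal sketch of the plain Monte Carlo bootstrap for the correlation coefficient may help. It is written in Python rather than Matlab or R/S-Plus purely for illustration. The function names `pearson_r` and `bootstrap_se`, the default of 2000 resamples, and the simulated data are assumptions of this sketch, not course requirements.

```python
import math
import random


def pearson_r(xs, ys):
    """Sample Pearson correlation of two equal-length sequences.

    Assumes neither sequence is constant (otherwise the denominator is zero).
    """
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def bootstrap_se(xs, ys, stat=pearson_r, n_boot=2000, seed=0):
    """Monte Carlo bootstrap estimate of the standard error of stat(xs, ys).

    Pairs (x_i, y_i) are resampled together, with replacement.
    Returns the estimated standard error and the list of replicates.
    """
    rng = random.Random(seed)
    n = len(xs)
    reps = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # one resample of indices
        reps.append(stat([xs[i] for i in idx], [ys[i] for i in idx]))
    mean = sum(reps) / n_boot
    var = sum((r - mean) ** 2 for r in reps) / (n_boot - 1)
    return math.sqrt(var), reps


if __name__ == "__main__":
    # Simulated (made-up) correlated data, used only to exercise the functions.
    rng = random.Random(1)
    x = [rng.gauss(0.0, 1.0) for _ in range(15)]
    y = [xi + rng.gauss(0.0, 1.0) for xi in x]
    se, _ = bootstrap_se(x, y)
    print("r =", pearson_r(x, y), "bootstrap SE =", se)
```

A comparative or computational project could replace the inner resampling loop with a variance-reduced scheme, or swap `stat` for another statistic, and compare the results against this baseline.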

Some projects will involve considerably more effort than others, and thus have greater potential to earn an outstanding grade. While the complete project counts for 60% of your final grade, the biggest payoff of a more challenging project is in the opportunity it provides you to solidify and extend your understanding of the material in the course and to obtain practical experience in applying it to your own research concerns. The first part will count 20%, the second, 40%.

You may want to do a project using data you have from another course (whether from an experiment or through access to a data set somebody else has collected). This may be a good way to apply statistics to something you have thought about. If you do a project of this sort, you must make very clear which part of the project is done specifically for your statistics course and which part is just a review or copy of work you have done for another course.

- A simple and clear exposition of the question you are addressing.
- A placement of the problem in the wider context of contemporary statistics, with a review of available methods for such data and a few words on their advantages and disadvantages.
- A proposed solution to the problem using either the bootstrap or another resampling procedure, with the comparative merits of the bootstrap as opposed to other methods.
- A flow chart of the various tasks to be undertaken. Programming, testing the program on simulated data, and testing the program on real data are all reasonable steps.

- A theoretical part: explanation of the method studied, its properties.
- A computational part: an algorithm for implementing the method in Matlab or S-Plus. This should also be emailed to the TAs so it can be tested. Make sure your code is readable, so we can do a little troubleshooting if necessary.
- A data-analysis part: actual data are to be submitted to the method studied, or tables should show comparisons, or theoretical results should be outlined.

**Analysis of a data set with your algorithm:** Perform a statistical analysis of some data set from an experiment, survey, or secondary data source using Matlab or S-Plus. You should pay critical attention to issues concerning how the data were collected as well as to the statistical analysis. (Depending on the nature of the data and your own relationship to it, you may want to give more or less emphasis to explanation of the data collection.) You should make sure that your data set has enough complexity (more than just a couple of variables, and a decent number of observations) to support an interesting analysis.

Some ideas according to your area of expertise:

- Education, Psychology, Social Sciences: methods such as regression analysis, multivariate analyses, and clustering can be bootstrapped usefully.
- Biology: analysis of DNA; distances and phylogenies are bootstrapped a lot.
- Econometrics: time series data need special treatment because of the underlying dependence.
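To make the remark about dependence concrete: one common way to respect serial dependence is the moving block bootstrap, which resamples contiguous blocks of observations instead of single points. The sketch below is illustrative only and uses Python rather than Matlab or R/S-Plus. The function names, the block length, and the simulated AR(1) series are assumptions of this example, and choosing a good block length is a separate problem that it does not address.

```python
import math
import random


def moving_block_resample(series, block_len, rng):
    """One moving-block bootstrap resample with the same length as `series`."""
    n = len(series)
    if not 1 <= block_len <= n:
        raise ValueError("block_len must be between 1 and len(series)")
    n_starts = n - block_len + 1  # number of admissible block starting points
    out = []
    while len(out) < n:
        start = rng.randrange(n_starts)
        out.extend(series[start:start + block_len])
    return out[:n]  # trim the last block if it overshoots


def block_bootstrap_se_of_mean(series, block_len, n_boot=1000, seed=0):
    """Bootstrap standard error of the sample mean using moving blocks."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_boot):
        resample = moving_block_resample(series, block_len, rng)
        means.append(sum(resample) / len(resample))
    centre = sum(means) / n_boot
    return math.sqrt(sum((m - centre) ** 2 for m in means) / (n_boot - 1))


if __name__ == "__main__":
    # Simulated AR(1) series (made-up data) with positive autocorrelation.
    rng = random.Random(42)
    x, series = 0.0, []
    for _ in range(200):
        x = 0.7 * x + rng.gauss(0.0, 1.0)
        series.append(x)
    # block_len=1 reduces to the ordinary bootstrap, which ignores dependence.
    print("block length 1: ", block_bootstrap_se_of_mean(series, 1))
    print("block length 10:", block_bootstrap_se_of_mean(series, 10))
```

With `block_len=1` the procedure is the ordinary bootstrap of individual observations, so running both settings on the same series gives a direct look at how much the dependence matters.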

- *Human Development Report*, published annually for the United Nations Development Program. There are a number of other statistical reports from the UN and other international agencies like the International Labor Organization.
- *Statistical Abstract of the United States*. Full of all sorts of statistical tables.
- On the Net (see below); for instance, the "Chance" project of Laurie Snell is very interesting.

- *Data: a Collection of Problems from Many Fields for the Student and Research Worker*, by D.F. Andrews and A.M. Herzberg
- *Case Studies in Biometry*, edited by Lange, Ryan, Billard, Brillinger, Conquest and Greenhouse

- *Population Studies*
- *Chance* (a popularly-oriented statistics magazine)
- *Ecology* (particularly Volume 74, No. 6, a special issue on statistical methods)
- *Journal of Experimental Zoology*
- *New England Journal of Medicine*
- *Public Opinion Quarterly*
- *Journal of Applied Psychology*
- *Proceedings of the National Academy of Sciences*, section Evolution

**TAs:** Brit Katzen and Jie-Hua Chen

TAs' office hours:

Brit Katzen (Sequoia 229): Wed. 2:15 - 3:45

Jie-Hua Chen (Sequoia 141): Thu. 4:00 - 5:00

Address: `http://www-stat.stanford.edu/~susan/courses/s208/`

Weekly consultation of the web site will be necessary and expected of all students.