## Advanced Statistics - Biology 6030

## Bowling Green State University, Fall 2017

Let's say you have obtained a sample statistic and are ready to declare your conclusion about the validity of the null hypothesis. Before you do that, though, would it not be nice to assess the chances of getting a similar result had you run the experiment a second time? What would the variation in this sample statistic look like had you run the experiment many times over? A set of methods, called resampling methods, can give you a clue about how sensitive the sample statistic is with respect to the very

**Estimate the precision of sample statistics**, such as medians, variances, or percentiles, by using subsets of available data

**Perform significance tests** by randomly exchanging labels on data points

**Validate models** by using random subsets

**Jackknifing** - Identical calculations are performed on a series of subsets, each obtained from the available data by dropping a different single data point (e.g., Cluster Analysis).

**Bootstrapping** - Identical calculations are performed on a series of data sets of the same size as the original, each obtained by drawing items randomly with replacement from the original data (e.g., Phylogenetic trees).
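The two definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the course material: the function names, the made-up `sample` values, the default of 2000 bootstrap resamples, and the fixed seed are all chosen here for the example.

```python
import random
import statistics


def jackknife_estimates(data, statistic):
    """Recompute `statistic` on each subset that drops one data point."""
    return [statistic(data[:i] + data[i + 1:]) for i in range(len(data))]


def jackknife_se(data, statistic):
    """Standard error of `statistic` from the leave-one-out replicates."""
    n = len(data)
    reps = jackknife_estimates(data, statistic)
    mean_rep = sum(reps) / n
    return ((n - 1) / n * sum((r - mean_rep) ** 2 for r in reps)) ** 0.5


def bootstrap_estimates(data, statistic, n_resamples=2000, seed=0):
    """Recompute `statistic` on data sets of the original size,
    each drawn randomly with replacement from `data`."""
    rng = random.Random(seed)  # fixed seed (assumption) for repeatability
    n = len(data)
    return [statistic(rng.choices(data, k=n)) for _ in range(n_resamples)]


def bootstrap_se(data, statistic, n_resamples=2000, seed=0):
    """Standard error of `statistic` as the spread of bootstrap replicates."""
    return statistics.stdev(
        bootstrap_estimates(data, statistic, n_resamples, seed)
    )


# Hypothetical measurements, invented for this illustration.
sample = [4.1, 5.3, 2.8, 6.0, 4.7, 3.9, 5.5, 4.4]

print("jackknife SE of the mean:  ", jackknife_se(sample, statistics.mean))
print("bootstrap SE of the median:", bootstrap_se(sample, statistics.median))
```

For the mean, the jackknife standard error coincides with the familiar s/√n, which makes a convenient sanity check. The bootstrap is used for the median here because the jackknife tends to behave poorly for statistics that are not smooth functions of the data.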

**Permutation test** - Identical calculations are performed on a series of data sets in which the assignment of items to treatment groups has been randomized. If an effect is present and shuffling the group assignments destroys it, then the statistic computed from the actual data will be unusual relative to the distribution of the statistic for the permuted data (e.g., Mantel's Matrix Correlation).
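As a rough sketch of this idea, the following compares two groups by their difference in means. The choice of test statistic, the function name `permutation_test`, the example values, the 5000 permutations, and the seed are all assumptions made for illustration.

```python
import random
import statistics


def permutation_test(group_a, group_b, n_permutations=5000, seed=0):
    """Two-sided permutation test on the difference in group means.

    Returns the observed difference and an approximate p-value.
    """
    rng = random.Random(seed)  # fixed seed (assumption) for repeatability
    observed = statistics.mean(group_a) - statistics.mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    extreme = 0
    for _ in range(n_permutations):
        # Randomize which items are assigned to which group.
        rng.shuffle(pooled)
        diff = statistics.mean(pooled[:n_a]) - statistics.mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1

    # Count the observed arrangement itself so the p-value is never zero.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Hypothetical treatment and control measurements, invented for this example.
treated = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]
control = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

observed, p_value = permutation_test(treated, control)
print("observed difference:", observed)
print("approximate p-value:", p_value)
```

A small p-value means that few random relabelings produce a difference as large as the one actually observed, which is the sense in which the real statistic is "unusual" relative to the permuted data.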

**Cross validation** - A calculation is performed on a subset of the data only (the training set). The remainder is held out as a validation set and used to validate the results (e.g., Discriminant Function Analysis). If a model is fit to the training set and used to predict the validation set, and this is repeated for different splits of the data, then averaging the quality of the predictions across the validation sets yields an overall measure of prediction accuracy.
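One common way to organize the splits is k-fold cross validation, sketched below with a simple least-squares line standing in for the model. The k-fold scheme, the straight-line model, the mean squared error as the quality measure, the function names, and the synthetic data are all assumptions chosen for this illustration.

```python
import random


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def k_fold_mse(xs, ys, k=5, seed=0):
    """Average held-out mean squared prediction error over k folds."""
    rng = random.Random(seed)  # fixed seed (assumption) for repeatability
    indices = list(range(len(xs)))
    rng.shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k disjoint validation sets

    fold_errors = []
    for validation in folds:
        held_out = set(validation)
        training = [i for i in indices if i not in held_out]
        # Fit on the training set only.
        slope, intercept = fit_line(
            [xs[i] for i in training], [ys[i] for i in training]
        )
        # Judge the fit on data the model has not seen.
        squared_errors = [
            (ys[i] - (slope * xs[i] + intercept)) ** 2 for i in validation
        ]
        fold_errors.append(sum(squared_errors) / len(squared_errors))

    return sum(fold_errors) / k


# Synthetic data, invented for this example: a line plus random noise.
noise = random.Random(1)
xs = [float(x) for x in range(20)]
ys = [2.0 * x + 1.0 + noise.gauss(0.0, 1.0) for x in xs]

print("cross-validated MSE:", k_fold_mse(xs, ys))
```

Because every point is held out exactly once, the averaged error reflects how well the model predicts data it was not fit to, rather than how well it reproduces the data used to build it.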

last modified: 4/16/14
