## Advanced Statistics - Biology 6030

## Bowling Green State University, Fall 2017

**Normal distributions** are a family of distributions with the same general shape (i.e., Gaussian distributions, bell curves): they feature a single peak at the exact center of the distribution, are symmetric, are concentrated in the middle, and decrease the further you go into the tails. A binomial distribution approaches this shape at very large sample sizes. Normal distributions are centered on the population mean (µ) and dispersed around it with a given population variance (σ^{2}). A **Standard Normal Distribution** is one that has the following parameters: µ = 0, σ^{2} = 1, γ_{1} = 0 (skewness), γ_{2} = 0 (kurtosis). Any normal distribution can be converted to a **Standard Normal Distribution** (µ = 0, σ^{2} = 1) using the **Standard Normal Deviate** or **z-Score**: z = (x - µ) / σ.
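The conversion to a standard normal deviate can be sketched in a few lines of Python. This is a minimal illustration assuming the usual definition z = (x - µ) / σ; the function names and the population values (mean 100, standard deviation 15) are hypothetical and not part of the course material.

```python
def z_score(x, mu, sigma):
    """Distance of x from the population mean, in standard deviation units."""
    return (x - mu) / sigma


def from_z(z, mu, sigma):
    """Inverse conversion: map a z-score back to the original scale."""
    return mu + z * sigma


# Hypothetical population parameters, chosen only for illustration
mu, sigma = 100.0, 15.0

z = z_score(130.0, mu, sigma)    # two standard deviations above the mean
x_back = from_z(z, mu, sigma)    # recovers the original value
print(z, x_back)
```

A value at the mean always maps to z = 0, and values below the mean map to negative z-scores.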

- Measures of **Central Tendency**
  - **Mean** (arithmetic, 1st moment): sum of values divided by the number of values
  - **Median**: midpoint of values after they have been arranged from highest to lowest (i.e., 50th percentile)
  - **Mode**: midpoint of the class interval with the largest frequency (i.e., most common value and thus the highest point in a distribution). Suitable for all types of data including nominal, but examine data for bi- or multimodality
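As a small sketch, the three measures of central tendency can be computed with Python's standard `statistics` module. The data values are made up for illustration; `multimode` is used instead of `mode` so that bi- or multimodality shows up rather than being hidden.

```python
from statistics import mean, median, multimode

# Hypothetical measurements, chosen so that the data are bimodal
values = [2, 3, 3, 5, 7, 8, 8, 9]

center_mean = mean(values)      # sum of values / number of values
center_median = median(values)  # 50th percentile of the ranked values
modes = multimode(values)       # every value tied for the highest frequency

print(center_mean, center_median, modes)
```

Here the list of modes has two entries, which is the cue to inspect the distribution before reporting a single "most common value".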

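- Measures of **Dispersion**
  - **Sum of squares**: sum of the squared deviations of each value from the mean
  - **Mean squares (Variance)**: sum of squares divided by the degrees of freedom (2nd moment)
  - **Standard Deviation**: square root of the variance
  - **Range** and **Interquartile Range** (i.e., the difference between the 75th and 25th percentile; an underutilized, stable measure of dispersion)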
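The dispersion measures build on each other, which a short Python sketch makes explicit. The data are hypothetical, the variance uses the sample form (dividing by N - 1), and the quartiles come from `statistics.quantiles` with its default method; other software may interpolate percentiles slightly differently.

```python
from math import sqrt
from statistics import mean, quantiles

values = [2, 3, 3, 5, 7, 8, 8, 9]  # hypothetical measurements
n = len(values)
m = mean(values)

# Sum of squares: squared deviations from the mean, added up
sum_of_squares = sum((x - m) ** 2 for x in values)

# Mean squares (sample variance): sum of squares / degrees of freedom
variance = sum_of_squares / (n - 1)

# Standard deviation: back on the original measurement scale
sd = sqrt(variance)

# Range and interquartile range (75th minus 25th percentile)
value_range = max(values) - min(values)
q1, q2, q3 = quantiles(values, n=4)
iqr = q3 - q1

print(sum_of_squares, variance, sd, value_range, iqr)
```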

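- Measures of **Asymmetry**
  - **Skewness** (g_{1}, 3rd moment): in a symmetric distribution (skewness = 0) exactly one half of all measures lies above the mean and the other half below. With positive skewness the longer tail extends to the right; with negative skewness it extends to the left. Be concerned about skewness and kurtosis values >1 or <-1.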

- Measures of **Peakedness**
  - **Kurtosis** (g_{2}, 4th moment): leptokurtic - high-peaked; mesokurtic - normal; platykurtic - flat-topped
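Skewness and kurtosis can be sketched directly from the 3rd and 4th central moments. The version below is the simple moment-ratio form (moments divided by N, kurtosis reported as excess over the normal value of 3); this is an assumption for illustration, and statistics packages often apply small-sample corrections that give slightly different numbers. Function names and data are hypothetical.

```python
def central_moment(values, k):
    """k-th central moment: mean of the k-th powers of deviations from the mean."""
    n = len(values)
    m = sum(values) / n
    return sum((x - m) ** k for x in values) / n


def skewness(values):
    """Moment-ratio skewness: 0 for symmetric data, > 0 for a long right tail."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Moment-ratio kurtosis minus 3, so a normal distribution scores 0."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0


symmetric = [1, 2, 3, 4, 5]         # evenly spread, flat-topped
right_skewed = [1, 1, 2, 2, 3, 10]  # one large value stretches the right tail

print(skewness(symmetric), excess_kurtosis(symmetric))
print(skewness(right_skewed), excess_kurtosis(right_skewed))
```

The evenly spread sample has zero skewness and negative excess kurtosis (platykurtic), while the second sample has clearly positive skewness.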

- **z-Tables** list the area under the probability density function for a standard normal distribution
- **Central limit theorem**: explains why many distributions tend towards normality when the random variable being observed is the sum or mean of many independent identically distributed random variables.
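A z-table lookup can be reproduced with `statistics.NormalDist` from the Python standard library, which exposes the cumulative area (`cdf`) and its inverse (`inv_cdf`). This is only a sketch; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0.0, sigma=1.0)

# Area under the curve to the left of z = 1.96
area_below = std_normal.cdf(1.96)

# Area between -1.96 and +1.96 (roughly 95% of the distribution)
central_95 = std_normal.cdf(1.96) - std_normal.cdf(-1.96)

# Reverse lookup: the z-scores that cut off the upper 2.5% and 0.5%
z_95 = std_normal.inv_cdf(0.975)
z_99 = std_normal.inv_cdf(0.995)

print(round(area_below, 4), round(central_95, 4), round(z_95, 2), round(z_99, 2))
```

The reverse lookups are where the familiar multipliers 1.96 (95%) and 2.58 (99%) come from.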

- If multiple samples are obtained from a population, their means will generally be normally distributed around the true underlying population mean (*µ*). According to the **Central limit theorem** this is true regardless of the shape of the population from which items are sampled: the distribution of sample means approaches a normal probability distribution when sample size is sufficiently large (N >= 30).
- **Standard Error of the Mean**: standard deviation of multiple sample means drawn from a particular population. Your uncertainty about how close your sample mean is to the underlying population mean increases with the sample's standard deviation and decreases with the sample size. SE = SQRT(Var/N), or equivalently SE = SD/SQRT(N)
- **Confidence intervals**: range within which the population parameter is expected to fall for a given level of confidence
  - individual measures (e.g., 95% µ ± 1.96 *σ*; 99% µ ± 2.58 *σ*). Plug in your sample estimates for mean and standard deviation
  - sample means (SE = SD/SQRT(N); 95% µ ± 1.96 SE; 99% µ ± 2.58 SE)

- **1 sample t-test**: compare a sample mean to a population mean in order to judge how likely it is that the sample was derived from that population. *t* = (sample mean - µ) / SE, with SE = SD/SQRT(N)
- **Paired t-Test**: calculate the distribution of differences between the paired measures and compare that distribution to a population mean of 0
- **2 sample t-test**:
- **Additional graphics**
  - **Estimated Probability Density Function**
  - **Quantile Box Plot**
  - **Normal Quantile Plot**
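The t statistics can be sketched with the standard library alone. The one-sample form below assumes the usual definition t = (sample mean - µ) / (SD/SQRT(N)), and the paired test reuses it on the differences against a mean of 0. The notes do not give a formula for the two-sample case, so the pooled-variance (equal variances) version shown here is an assumption; an unequal-variance (Welch) form is a common alternative. All function names and data are hypothetical, and each function returns only the statistic and its degrees of freedom, not a p-value.

```python
from math import sqrt
from statistics import mean, stdev, variance


def one_sample_t(sample, mu0):
    """t statistic and degrees of freedom for a sample mean against mu0."""
    n = len(sample)
    se = stdev(sample) / sqrt(n)
    return (mean(sample) - mu0) / se, n - 1


def paired_t(before, after):
    """Paired test: one-sample t on the differences against a mean of 0."""
    diffs = [a - b for b, a in zip(before, after)]
    return one_sample_t(diffs, 0.0)


def two_sample_t_pooled(a, b):
    """Two-sample t assuming equal variances (pooled variance estimate)."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    se = sqrt(pooled * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / se, na + nb - 2


# Hypothetical data
t1, df1 = one_sample_t([1, 2, 3, 4, 5], mu0=3)
t2, df2 = paired_t(before=[10, 12, 9, 11], after=[12, 14, 10, 13])
t3, df3 = two_sample_t_pooled([1, 2, 3, 4, 5], [3, 4, 5, 6, 7])

print((t1, df1), (t2, df2), (t3, df3))
```

Each t value is then compared against a t distribution with the returned degrees of freedom to judge how likely the observed difference is under the null hypothesis.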

last modified: 2/3/15
