The ability to quantitatively evaluate one’s data is increasingly important in scientific research. Yet many entering PhD students lack a fundamental understanding of the statistical principles and basic programming skills that can accelerate and empower data analysis. This one-credit course is a primer on fundamental statistical and computational skills and concepts for first-year DBBS students; it assumes no prior experience in statistics or programming. The course will cover common statistical practices and concepts in the life sciences, such as error bars, summary statistics, probability and distributions, and hypothesis testing. In parallel, the class will teach students programming skills for basic statistical computation.

The course format emphasizes practical problem-solving skills by teaching both core statistical concepts and computational methods to implement them. The course will introduce students to the Python programming language and key Python statistical and plotting tools. Upon completing the course, students will be able to retrieve and analyze simple and genomic-style datasets from online databases, write simple data analysis scripts in Python, create the major types of statistical plots, and critically evaluate how best to summarize their data and assess its significance.

2016-09-27

Lecture 0 (Computation): Introduction to IPython Notebook

– Installing IPython Notebook

– Basic Python Usage

2016-10-04

Lecture 1 (Statistics): Summarizing Numbers

– Single number summaries: mean, median and mode

– Two numbers: variance and standard deviation

– Dot plots and histograms

– Distributions
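
As a rough illustration of the Lecture 1 summaries, the sketch below computes them with Python's standard `statistics` module. It is not taken from the course materials, and the data values are made up for the example.

```python
import statistics

# Hypothetical measurements; the values are invented for illustration.
values = [2, 3, 3, 5, 7, 10]

# Single-number summaries
mean = statistics.mean(values)      # arithmetic average
median = statistics.median(values)  # middle of the sorted data
mode = statistics.mode(values)      # most frequent value

# Two-number summaries of spread
variance = statistics.variance(values)  # sample variance (divides by n - 1)
std_dev = statistics.stdev(values)      # square root of the sample variance

print("mean:", mean, "median:", median, "mode:", mode)
print("variance:", variance, "standard deviation:", std_dev)
```

Note that `statistics.variance` and `statistics.stdev` are the sample versions; `statistics.pvariance` and `statistics.pstdev` divide by n instead.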

2016-10-11

Lecture 2 (Computation): Basic Python with Genomic Data

– Obtaining genomic data from online databases

– Methods to import data into IPython Notebook

– Basic Python Syntax

– Python data types and structures

2016-10-14: PS1 assignment due (Thursday)

2016-10-18

Lecture 3 (Statistics): Basic Probability

– Intuitive probability estimation from histograms

– Basic theory and notation

– How probabilities combine: “and” and “or”

– Independence and conditional probability

– Counting successes and failures
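
The "and"/"or" rules and the counting of successes can be sketched with exact fractions. This is an illustrative example under assumed conditions (two independent rolls of a fair six-sided die), not course material; `binomial_pmf` is a helper name chosen for this sketch.

```python
from fractions import Fraction
from math import comb

# Assumed setup: two independent rolls of a fair six-sided die.
p_a = Fraction(1, 6)  # event A: first roll is a six
p_b = Fraction(1, 6)  # event B: second roll is a six

# "and" for independent events: multiply the probabilities
p_a_and_b = p_a * p_b

# "or": add the probabilities, then subtract the overlap
p_a_or_b = p_a + p_b - p_a_and_b


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Counting successes: exactly two sixes in four rolls
p_two_sixes_in_four = binomial_pmf(2, 4, Fraction(1, 6))

print(p_a_and_b, p_a_or_b, p_two_sixes_in_four)
```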

2016-10-20: PS2 assignment due (Thursday)

2016-10-25

Lecture 4 (Computation): Python for Data Analysis

– Python Syntax

– Python data types and structures

– Using Python to examine datasets

2016-10-27: PS3 assignment due (Thursday)

2016-11-01

Lecture 5 (Computation): Using Python Libraries: Plotting

– Python libraries and functions

– Plot types and Python plotting functions

– Writing Python functions

2016-11-03: PS4 assignment due (Thursday)

2016-11-08

Lecture 6 (Statistics): Simulation and Hypothesis Testing (I)

– Why simulate?

– Hypothesis testing and the null distribution

– What p-values are and are not

– Recent controversies in the use of p-values

2016-11-10: PS5 assignment due (Thursday)

2016-11-15

Lecture 7 (Computation): Simulation

– Advanced Python functions

– Random sampling and simulation
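
One way to picture how random sampling connects to the null distribution from Lecture 6 is to simulate it. The sketch below assumes a hypothetical experiment (16 heads observed in 20 coin flips) and a fair-coin null hypothesis; the function name and numbers are illustrative only.

```python
import random


def simulate_null_heads(n_flips, n_simulations, seed=0):
    """Simulate head counts under the null hypothesis of a fair coin."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    return [
        sum(rng.random() < 0.5 for _ in range(n_flips))
        for _ in range(n_simulations)
    ]


observed_heads = 16  # hypothetical observation, not real data
null_counts = simulate_null_heads(n_flips=20, n_simulations=10_000)

# One-sided p-value: fraction of simulated experiments with at least
# as many heads as the observed experiment.
p_value = sum(c >= observed_heads for c in null_counts) / len(null_counts)
print("estimated p-value:", p_value)
```

The estimate varies with the seed and the number of simulations; more simulations give a more stable value.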

2016-11-19: PS6 assignment due (Saturday, 11:59 PM)

2016-11-29

Lecture 8 (Statistics): Simulation and Hypothesis Testing (II)

– Permutation testing

– Sampling from a population

– Bootstrap confidence intervals

– Bootstrap hypothesis testing
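
Permutation testing and bootstrap confidence intervals can both be sketched with the standard library alone. The example below is a minimal sketch, not the course's implementation: the two groups are made-up data, the test statistic is assumed to be the difference in means, and the interval uses the simple percentile method.

```python
import random
import statistics


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test on the difference in means."""
    rng = random.Random(seed)
    observed = statistics.mean(group_a) - statistics.mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign group labels at random
        diff = statistics.mean(pooled[:n_a]) - statistics.mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1
    return observed, extreme / n_permutations


def bootstrap_ci(sample, n_resamples=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean."""
    rng = random.Random(seed)
    means = sorted(
        statistics.mean(rng.choices(sample, k=len(sample)))  # resample with replacement
        for _ in range(n_resamples)
    )
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Hypothetical measurements, invented for illustration.
control = [4.1, 3.9, 4.3, 4.0, 4.2, 3.8]
treated = [5.0, 5.2, 4.9, 5.3, 5.1, 4.8]

observed_diff, p_value = permutation_test(treated, control)
ci_lower, ci_upper = bootstrap_ci(treated)
print("difference in means:", observed_diff, "p-value:", p_value)
print("95% bootstrap CI for treated mean:", ci_lower, ci_upper)
```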

2016-12-06

Lecture 9 (Computation): Statistics in Python

– Bootstrap testing

– Python statistics libraries

2016-12-11: PS8 assignment due (Sunday, 11:59 PM)

2016-12-13

Lecture 10 (Statistics): Power Analysis, Experimental Design, and Parametric Statistics

– Statistical Power

– Paired tests

– The standard error and the t-test

– ANOVA
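
To show how the standard error feeds into a paired t-test, here is a small sketch that computes the t statistic by hand. The before/after values are hypothetical, and `paired_t_statistic` is a name chosen for this example; turning the statistic into a p-value requires the t distribution, which libraries such as SciPy provide (for example `scipy.stats.ttest_rel`).

```python
import math
import statistics


def paired_t_statistic(before, after):
    """Return the paired t statistic and its degrees of freedom."""
    diffs = [a - b for a, b in zip(after, before)]  # per-subject change
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference: sample SD divided by sqrt(n)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return mean_diff / standard_error, n - 1


# Hypothetical paired measurements, invented for illustration.
before = [10, 12, 9, 11, 13]
after = [12, 15, 10, 14, 14]

t_stat, dof = paired_t_statistic(before, after)
print("t =", t_stat, "with", dof, "degrees of freedom")
```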
