Hypothesis testing
Exploring Data and setting up for an analysis
Objectives
-Question formulation
-Summarize: Weighing the Pig
-Variables and graphing
-“Analysis” versus “EDA”
-Statistical Analysis Plan: the concept
Question formulation and hypothesis testing
-“population of interest”
-samples and sampling
-test statistics
-null hypothesis
-Let’s talk about the P-value
Hypotheses: Null vs Alternative
Null Hypothesis (H₀):
The assumption that there is no effect or no difference.
Example: “There is no difference in test scores between two groups.”
Alternative Hypothesis (H₁ or Ha):
The assumption that there is an effect or a difference.
Example: “There is a difference in test scores between two groups.”
P-value: Definition and Interpretation
P-value:
The probability of obtaining results at least as extreme as the ones observed, under the assumption that the null hypothesis is true.
Interpretation:
A small p-value (conventionally < 0.05) means that results at least as extreme as those observed would be unlikely if the null hypothesis were true; this is taken as grounds for rejecting H₀.
A large p-value means that the data are consistent with the null hypothesis; it does not show that H₀ is true.
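The definition above can be made concrete with a permutation test: if H₀ is true, the group labels are interchangeable, so shuffling them shows how often a difference at least as large as the observed one arises by chance. The following is a minimal Python sketch under that assumption, not a method prescribed by these slides. The function name `permutation_p_value` and the test scores are hypothetical, invented purely for illustration.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Estimate a two-sided p-value for a difference in group means.

    Under H0 the group labels carry no information, so we shuffle the
    pooled values and count how often the shuffled difference is at
    least as extreme as the observed one.
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1

    # Add one to numerator and denominator so the estimate is never exactly 0.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical test scores for two groups (illustrative values only).
scores_a = [72, 85, 78, 90, 66, 81, 77, 88]
scores_b = [70, 65, 74, 68, 79, 62, 71, 69]

p = permutation_p_value(scores_a, scores_b)
print(f"Estimated p-value: {p:.4f}")
```

The test statistic here is the absolute difference in means; a different statistic (for example a difference in medians) could be substituted without changing the logic.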
Benefits of NHST
-Familiar and acceptable to researchers
-Typically robust to violations of assumptions
-Strong framework for evidence
-The basic idea is simple
Criticism of NHST
-Results are often misinterpreted
-Validation of analysis often neglected
-Education often deficient
-Practitioners ignorant of subtleties
Summarize: Weighing the Pig
Chick weight dataset
The hypothesis voices “how you think the world works” or “what you predict to be true”
coding
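Summarizing before analyzing might look like the sketch below: group the measurements and report the count, mean, and standard deviation for each group. It is written in Python with the standard library only and assumes records shaped like two columns of R's ChickWeight dataset (diet, weight in grams). The values shown are hypothetical stand-ins, not rows from the real dataset, and the helper name `summarize_by_group` is also an assumption.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical (diet, weight in grams) records; NOT actual ChickWeight rows.
records = [
    (1, 160), (1, 170), (1, 180),
    (2, 200), (2, 210), (2, 220),
]


def summarize_by_group(rows):
    """Return {group: {"n", "mean", "sd"}} for (group, value) pairs."""
    groups = defaultdict(list)
    for group, value in rows:
        groups[group].append(value)
    return {
        group: {"n": len(values), "mean": mean(values), "sd": stdev(values)}
        for group, values in sorted(groups.items())
    }


summary = summarize_by_group(records)
for diet, stats in summary.items():
    print(f"diet {diet}: n={stats['n']}, "
          f"mean={stats['mean']:.1f} g, sd={stats['sd']:.1f} g")
```

Note that `stdev` needs at least two values per group, so very small groups would have to be handled separately.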
Variables and graphing

-Must convey relevant information
-Consistent in aesthetics
-Self-contained
-Reflect hypothesis (unless descriptive)
-Appropriate to data
Concept of “layering” in building graphs
coding
“Analysis” versus “EDA”
EDA:
-Informal, haphazard
-Gain data understanding
-Test assumptions
-Usually not for “others”
-Usually occurs before analysis
Analysis:
-Designed to fit hypothesis
-For presentation to others
-Creation of EVIDENCE to support CLAIMS
-Reproducible
Statistical Analysis Plan: the concept
Prior to data collection
-formally state hypothesis
-State specific statistical model(s)
-Specify data and data collection
-State and justify sample size
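One common way to justify a sample size in such a plan is a power calculation. The sketch below assumes a two-sided comparison of two group means and uses the normal approximation n = 2 * ((z_(1-alpha/2) + z_power) / d)^2 per group, where d is the standardized effect size (difference in means divided by the common standard deviation). The function name `sample_size_per_group` and the default alpha and power values are illustrative assumptions, not requirements stated in these slides.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided, two-sample mean comparison.

    Uses the normal approximation; an exact t-based calculation gives a
    slightly larger n for small samples.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = standard_normal.inv_cdf(power)          # quantile for the target power
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# Example: a medium standardized effect (d = 0.5) at the default alpha and power.
n_medium = sample_size_per_group(0.5)
print(f"Participants needed per group: {n_medium}")
```

Smaller expected effects raise the required sample size quickly, which is one reason the plan is meant to state and defend the assumed effect before data collection.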
Taught to children

Best practice

Practice Exercises