Sampling in Python is the cornerstone of inferential statistics and hypothesis testing. It's a powerful skill used in survey analysis and experimental design to draw conclusions without surveying an entire population. In this Sampling in Python course, you’ll discover when to use sampling and how to perform common types of sampling, from simple random sampling to more complex methods like stratified and cluster sampling. Using real-world datasets, including coffee ratings, Spotify songs, and employee attrition, you’ll learn to estimate population statistics and quantify uncertainty in your estimates by generating sampling distributions and bootstrap distributions.

import pandas as pd
import numpy as np
import warnings
import matplotlib.pyplot as plt
plt.rcParams['figure.figsize'] = [8, 6]

pd.set_option('display.expand_frame_repr', False)

warnings.filterwarnings("ignore", category=DeprecationWarning)
warnings.filterwarnings("ignore", category=FutureWarning)

Bias Any Stretch of the Imagination

Learn what sampling is and why it is so powerful. You’ll also learn about the problems caused by convenience sampling and the differences between true randomness and pseudo-randomness.

Living the sample life

Reasons for sampling; Simple sampling with pandas; Simple sampling and calculating with NumPy
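As a rough illustration of simple sampling with pandas and NumPy, the sketch below draws a sample and compares its mean with the population mean. The `population` DataFrame and its `rating` column are synthetic stand-ins invented for this example; they are not one of the course datasets.

```python
import numpy as np
import pandas as pd

# Hypothetical population: 1,000 synthetic "ratings" (not a real dataset).
rng = np.random.default_rng(42)
population = pd.DataFrame({"rating": rng.normal(loc=80, scale=5, size=1000)})

# Simple sampling with pandas: draw 50 rows without replacement.
sample_df = population.sample(n=50, random_state=42)

# Simple sampling with NumPy: draw 50 values from the underlying array.
sample_arr = rng.choice(population["rating"].to_numpy(), size=50, replace=False)

# Compare the population mean with the two sample estimates.
pop_mean = population["rating"].mean()
print(pop_mean, sample_df["rating"].mean(), sample_arr.mean())
```

Both sample means should land close to the population mean, which is the basic reason a sample can stand in for the full population.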

A little too convenient

Are the findings from this sample generalizable? Are these findings generalizable?
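One way to see why convenience samples may not generalize is to take "whatever is at the top" of a dataset that happens to be ordered. The sketch below assumes a synthetic population sorted by the value of interest; the names and numbers are made up for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical population, stored in ascending order of the value of interest.
values = np.sort(rng.normal(loc=50, scale=10, size=2000))
population = pd.DataFrame({"value": values})

# Convenience sample: the first 100 rows, simply because they are easy to grab.
convenience = population.head(100)

# Simple random sample of the same size.
random_sample = population.sample(n=100, random_state=0)

pop_mean = population["value"].mean()
conv_mean = convenience["value"].mean()
rand_mean = random_sample["value"].mean()
print(pop_mean, conv_mean, rand_mean)
```

Because the rows are sorted, the convenience sample contains only the smallest values and underestimates the mean, while the random sample stays close to it.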

How does Sue do sampling?

Generating random numbers; Understanding random seeds
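A minimal sketch of how a seed controls pseudo-random numbers: resetting the same seed reproduces the same values, while a different seed gives a different sequence. The seed values and distribution parameters here are arbitrary choices for the example.

```python
import numpy as np

# Same seed, same draws.
np.random.seed(123)
first = np.random.normal(loc=2, scale=1.5, size=3)

np.random.seed(123)
second = np.random.normal(loc=2, scale=1.5, size=3)

# Different seed, different draws.
np.random.seed(456)
third = np.random.normal(loc=2, scale=1.5, size=3)

print(first, second, third)
```

This reproducibility is what makes the numbers pseudo-random rather than truly random: the sequence is fully determined by the seed.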

Don't get theory eyed

It’s time to get hands-on and perform the four random sampling methods in Python: simple, systematic, stratified, and cluster.

Simple is as simple does

Simple random sampling; Systematic sampling; Is systematic sampling OK?
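The two methods named above can be sketched as follows, using a hypothetical 1,000-row population. Systematic sampling takes every k-th row, so it is generally only safe when row order carries no pattern; shuffling first is one common workaround, at which point it behaves like simple random sampling.

```python
import numpy as np
import pandas as pd

# Hypothetical population of 1,000 rows with a sequential id.
population = pd.DataFrame({"id": np.arange(1000)})
sample_size = 50

# Simple random sampling: every row has the same chance of selection.
simple = population.sample(n=sample_size, random_state=1)

# Systematic sampling: take every `interval`-th row, starting at row 0.
interval = len(population) // sample_size  # 1000 // 50 = 20
systematic = population.iloc[::interval]

# If row order might hide a pattern, shuffle the rows before sampling.
shuffled = population.sample(frac=1, random_state=1).reset_index(drop=True)
systematic_shuffled = shuffled.iloc[::interval]
```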

Can't get no stratisfaction

Which sampling method? Proportional stratified sampling; Equal counts stratified sampling; Weighted sampling
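A sketch of the three variants listed above, assuming a hypothetical population with a `group` column that defines the strata. The group sizes and the weighting factor are invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical population with three strata of unequal size.
population = pd.DataFrame({
    "group": ["a"] * 600 + ["b"] * 300 + ["c"] * 100,
    "value": np.arange(1000),
})

# Proportional stratified sampling: 10% of each group,
# so group proportions in the sample match the population.
proportional = population.groupby("group").sample(frac=0.1, random_state=7)

# Equal counts stratified sampling: the same number of rows from each group.
equal_counts = population.groupby("group").sample(n=20, random_state=7)

# Weighted sampling: rows in group "c" get 5 times the selection weight.
weights = pd.Series(np.where(population["group"] == "c", 5, 1),
                    index=population.index)
weighted = population.sample(frac=0.1, weights=weights, random_state=7)

print(proportional["group"].value_counts())
print(equal_counts["group"].value_counts())
print(weighted["group"].value_counts())
```

Proportional sampling preserves the population mix, equal counts give small groups the same representation as large ones, and weighted sampling lets you tilt the mix by an arbitrary amount.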

What a cluster...

Benefits of clustering; Cluster sampling
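Cluster sampling can be sketched as a two-stage process: randomly pick a few subgroups, then sample only within those. The `cluster` column, the six cluster labels, and the sizes below are hypothetical.

```python
import random
import pandas as pd

# Hypothetical population: six clusters of 50 rows each.
population = pd.DataFrame({
    "cluster": [c for c in ["a", "b", "c", "d", "e", "f"] for _ in range(50)],
    "value": range(300),
})

# Stage 1: randomly choose which clusters to include.
random.seed(19)
all_clusters = list(population["cluster"].unique())
chosen = random.sample(all_clusters, k=3)

# Stage 2: simple random sampling within the chosen clusters only.
subset = population[population["cluster"].isin(chosen)]
cluster_sample = subset.groupby("cluster").sample(n=5, random_state=19)
```

The practical benefit is cost: data only needs to be collected from the chosen clusters, at the price of a potentially less representative sample than stratified sampling.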

Straight to the point (estimate)

3 kinds of sampling; Comparing point estimates

The n's justify the means

Let’s test your sampling. In this chapter, you’ll discover how to quantify the accuracy of sample statistics using relative errors, and measure variation in your estimates by generating sampling distributions.

An ample sample

Calculating relative errors; Relative error vs. sample size
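Relative error is commonly computed as the absolute difference between the population statistic and the sample statistic, divided by the population statistic, as a percentage. The sketch below applies that to the mean for several sample sizes, using a synthetic population and a hypothetical helper named `relative_error_pct`.

```python
import numpy as np
import pandas as pd

# Hypothetical population of positive values.
rng = np.random.default_rng(2024)
population = pd.Series(rng.gamma(shape=9, scale=10, size=5000))
pop_mean = population.mean()

def relative_error_pct(sample_size, seed):
    """Relative error (in percent) of the sample mean for one random sample."""
    sample_mean = population.sample(n=sample_size, random_state=seed).mean()
    return 100 * abs(pop_mean - sample_mean) / pop_mean

# Relative error for increasing sample sizes, up to the whole population.
errors = {n: relative_error_pct(n, seed=2024) for n in (10, 100, 1000, 5000)}
print(errors)
```

The error tends to shrink as the sample grows, though not strictly monotonically for any single run, and it reaches zero when the sample is the entire population.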

Baby back dist-rib-ution

Replicating samples; Replication parameters
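A sampling distribution can be approximated by repeating the same sampling procedure many times and recording the statistic each time. The sketch below assumes a synthetic population and a hypothetical helper, `sampling_distribution`, with the two replication parameters exposed as arguments.

```python
import numpy as np

# Hypothetical, right-skewed population.
rng = np.random.default_rng(5)
population = rng.exponential(scale=10, size=10_000)

def sampling_distribution(sample_size, replicates):
    """Return the sample mean from `replicates` independent random samples."""
    return np.array([
        rng.choice(population, size=sample_size, replace=False).mean()
        for _ in range(replicates)
    ])

small = sampling_distribution(sample_size=10, replicates=500)
large = sampling_distribution(sample_size=200, replicates=500)

print(small.std(), large.std())
```

Increasing the sample size narrows the distribution of sample means. Increasing the number of replicates does not change its width much, but gives a smoother picture of its shape.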

Be our guess, put our samples to the test

Exact sampling distribution; Approximate sampling distribution; Exact vs. approximate
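To illustrate the difference between exact and approximate sampling distributions, the sketch below uses the mean of three six-sided dice, an example chosen here because every outcome can be enumerated. The exact distribution lists all 216 outcomes; the approximate one simulates 2,000 rolls.

```python
import itertools
import numpy as np
import pandas as pd

faces = range(1, 7)

# Exact: enumerate all 6**3 = 216 equally likely outcomes of three dice.
rolls = pd.DataFrame(list(itertools.product(faces, repeat=3)),
                     columns=["d1", "d2", "d3"])
exact_means = rolls.mean(axis=1)
exact_dist = exact_means.value_counts(normalize=True).sort_index()

# Approximate: simulate 2,000 rolls of three dice and take each mean.
rng = np.random.default_rng(11)
approx_means = rng.integers(1, 7, size=(2000, 3)).mean(axis=1)

print(exact_means.mean(), approx_means.mean())
```

Enumeration becomes infeasible quickly (6 to the power of the number of dice), which is why simulation is the usual route for anything but tiny problems.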

Err on the side of Gaussian

Population & sampling distribution means; Population & sampling distribution variation
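Two standard results link a population to the sampling distribution of its mean: the sampling distribution is centered on the population mean, and its standard deviation (the standard error) is approximately the population standard deviation divided by the square root of the sample size. The sketch below checks both on a synthetic uniform population; sampling is done with replacement so the formula applies directly.

```python
import numpy as np

# Hypothetical, non-normal population.
rng = np.random.default_rng(3)
population = rng.uniform(0, 100, size=20_000)

# Sampling distribution of the mean: 2,000 samples of size 50.
n = 50
sample_means = np.array([
    rng.choice(population, size=n).mean() for _ in range(2000)
])

pop_mean = population.mean()
pop_std = population.std(ddof=0)

predicted_se = pop_std / np.sqrt(n)      # standard error from the formula
observed_se = sample_means.std(ddof=1)   # standard error from simulation

print(pop_mean, sample_means.mean())
print(predicted_se, observed_se)
```

Even though the population is uniform, a histogram of `sample_means` would look roughly bell-shaped, which is the central limit theorem at work.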

Pull Your Data Up By Its Bootstraps

You’ll get to grips with resampling to perform bootstrapping and estimate variation in an unknown population. You’ll learn the difference between sampling distributions and bootstrap distributions using resampling.

This bears a striking resample-lance

Principles of bootstrapping; With or without replacement? Generating a bootstrap distribution
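A minimal sketch of generating a bootstrap distribution: resample the one sample you have, with replacement and at the same size, compute the statistic, and repeat. The `sample` Series is synthetic and stands in for a real dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: the only data we are assumed to have.
rng = np.random.default_rng(8)
sample = pd.Series(rng.normal(loc=60, scale=12, size=300))

# Bootstrap distribution of the mean: 1,000 resamples with replacement.
boot_means = np.array([
    sample.sample(frac=1, replace=True, random_state=i).mean()
    for i in range(1000)
])

# Resampling WITHOUT replacement at full size only shuffles the rows,
# so the statistic never changes.
no_replacement = sample.sample(frac=1, replace=False, random_state=0)

print(sample.mean(), boot_means.mean(), no_replacement.mean())
```

Replacement is what creates variation between resamples: some rows appear several times and others not at all.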

A breath of fresh error

Bootstrap statistics and population statistics; Sampling distribution vs. bootstrap distribution; Compare sampling and bootstrap means; Compare sampling and bootstrap standard deviations
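The sketch below, on synthetic data, illustrates two commonly cited properties of bootstrap distributions. The bootstrap mean is centered on the sample mean, so it cannot correct a sample that is off from the population. The bootstrap standard error multiplied by the square root of the sample size gives a usable estimate of the population standard deviation.

```python
import numpy as np

# Hypothetical population and a single sample drawn from it.
rng = np.random.default_rng(21)
population = rng.normal(loc=100, scale=15, size=50_000)
n = 400
sample = rng.choice(population, size=n, replace=False)

# Bootstrap distribution of the mean, built from the sample only.
boot_means = np.array([
    rng.choice(sample, size=n, replace=True).mean() for _ in range(2000)
])

boot_center = boot_means.mean()          # tracks the sample mean
std_error = boot_means.std(ddof=1)       # bootstrap standard error
est_pop_std = std_error * np.sqrt(n)     # estimate of population std dev

print(sample.mean(), boot_center, population.mean())
print(est_pop_std, population.std())
```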

Venus infers

Confidence interval interpretation; Calculating confidence intervals
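Two common ways to turn a bootstrap distribution into a 95% confidence interval are the quantile method and the standard error method. The sketch below shows both on synthetic data. It uses `statistics.NormalDist` from the standard library for the inverse normal CDF, which is a choice made for this example rather than something the course prescribes.

```python
from statistics import NormalDist
import numpy as np

# Hypothetical sample and its bootstrap distribution of the mean.
rng = np.random.default_rng(99)
sample = rng.normal(loc=5.0, scale=2.0, size=500)
boot_means = np.array([
    rng.choice(sample, size=len(sample), replace=True).mean()
    for _ in range(2000)
])

# Quantile method: take the middle 95% of the bootstrap distribution.
lower_q, upper_q = np.quantile(boot_means, [0.025, 0.975])

# Standard error method: assume the bootstrap distribution is roughly normal
# and invert the normal CDF around the point estimate.
point_estimate = float(boot_means.mean())
std_error = float(boot_means.std(ddof=1))
dist = NormalDist(mu=point_estimate, sigma=std_error)
lower_se, upper_se = dist.inv_cdf(0.025), dist.inv_cdf(0.975)

print((lower_q, upper_q), (lower_se, upper_se))
```

When the bootstrap distribution is close to normal, the two intervals nearly coincide. A reasonable reading of either is that intervals built this way would be expected to contain the true population mean in roughly 95% of repeated studies.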

Congratulations