Sample Size Determination
This tutorial shows how to compute the minimum sample size needed for an A/B test with two variants, the so-called control and treatment groups. This problem is usually referred to as Sample Size Determination (SSD).
Let’s first import the tools needed.
[1]:
from abexp.core.design import SampleSize
Formulate hypothesis #1
Which kind of A/B experiment do you intend to run?
Compare means: the experiment aims to compare the mean of a certain metric in the control group versus the treatment group. This metric is a continuous variable and represents the KPI of the experiment, e.g. revenue.
Compare proportions: the experiment aims to compare the proportion/probability of a certain metric in the control group versus the treatment group. This metric represents the KPI of the experiment, e.g. % of churners, probability of having premium users.
Compare means
Formulate hypothesis #2
Here you need to define the minimum desirable delta between the control and treatment groups:
What is the mean of the control group?
What is the standard deviation of the control group?
What is the desirable/expected mean of the treatment group?
Define these according to your domain expertise. Please choose reasonable values that you expect to see at the end of the experiment (after the treatment has been applied to the treatment group).
Compute sample size
[2]:
sample_size = SampleSize.ssd_mean(mean_contr=790, mean_treat=800, std_contr=200, alpha=0.05, power=0.8)
print('Minimum sample size per each group = {}'.format(sample_size))
Minimum sample size per each group = 6280
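To make the number above less opaque, here is a minimal sketch of the textbook normal-approximation formula for comparing two means. It is not abexp's implementation: the helper name `approx_ssd_mean` is hypothetical, and it assumes a two-sided test, equal group sizes and the same standard deviation in both groups. Because it uses the normal rather than the t distribution, its result should only be expected to land close to a t-test based value.

```python
from math import ceil
from statistics import NormalDist


def approx_ssd_mean(mean_contr, mean_treat, std_contr, alpha=0.05, power=0.8):
    """Approximate per-group sample size for a two-sided test on two means.

    Hypothetical helper (not part of abexp). Assumes equal group sizes and
    the same standard deviation in control and treatment.
    """
    # Critical value for the two-sided significance level
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    # Quantile corresponding to the requested power
    z_power = NormalDist().inv_cdf(power)
    # Standardized effect size: difference in means in units of std
    effect_size = abs(mean_treat - mean_contr) / std_contr
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


n_mean = approx_ssd_mean(mean_contr=790, mean_treat=800, std_contr=200)
print('Approximate sample size per group = {}'.format(n_mean))
```

With the inputs used in this tutorial the approximation gives 6280 per group. Note how the required size grows with the square of std / delta: halving the expected difference roughly quadruples the sample size.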
Compare proportions
Formulate hypothesis #2
Here you need to define the minimum desirable delta between the control and treatment groups:
What is the proportion in the control group?
What is the desirable/expected proportion in the treatment group?
Define these according to your domain expertise. Please choose reasonable values that you expect to see at the end of the experiment (after the treatment has been applied to the treatment group).
Compute sample size
[3]:
sample_size = SampleSize.ssd_prop(prop_contr=0.33, prop_treat=0.31, alpha=0.05, power=0.8)
print('Minimum sample size per each group = {}'.format(sample_size))
Minimum sample size per each group = 8538
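For proportions, one common normal-approximation approach expresses the difference as Cohen's h (the difference of arcsine-transformed proportions) and plugs it into the same kind of formula used for means. The sketch below illustrates that approach under stated assumptions (two-sided test, equal group sizes); the helper name `approx_ssd_prop` is hypothetical, and whether abexp uses this exact effect-size definition is an assumption, not something this tutorial states.

```python
from math import asin, ceil, sqrt
from statistics import NormalDist


def approx_ssd_prop(prop_contr, prop_treat, alpha=0.05, power=0.8):
    """Approximate per-group sample size for a two-sided test on two proportions.

    Hypothetical helper (not part of abexp). Uses Cohen's h as effect size
    and assumes equal group sizes.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    # Cohen's h: difference between arcsine-transformed proportions
    h = abs(2 * asin(sqrt(prop_treat)) - 2 * asin(sqrt(prop_contr)))
    return ceil(2 * ((z_alpha + z_power) / h) ** 2)


n_prop = approx_ssd_prop(prop_contr=0.33, prop_treat=0.31)
print('Approximate sample size per group = {}'.format(n_prop))
```

For the proportions used in this tutorial (0.33 versus 0.31), the approximation lands very close to the value reported above.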
Statistics behind
abexp masks the statistical techniques applied in the background. Sample Size Determination is achieved via power analysis. Given the values of the three parameters below, it estimates the minimum sample size required:
significance level, default 0.05
power, default 0.80
estimate of the minimum desirable effect size, specific to the experiment
The statistical tests used in this context are the t-test for comparing means and the z-test for comparing proportions.
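Power analysis can also be read in the other direction: given a sample size, what power does the test achieve? The sketch below shows this for the means example, using a normal approximation rather than the t-test and ignoring the negligible rejection probability in the tail opposite to the effect. The helper name `approx_power` is hypothetical and not part of abexp.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample test (normal approximation).

    Hypothetical helper (not part of abexp). Assumes equal group sizes.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic under the alternative hypothesis
    shift = abs(effect_size) * sqrt(n_per_group / 2)
    # Probability that the statistic exceeds the critical value
    return NormalDist().cdf(shift - z_alpha)


achieved_power = approx_power(effect_size=(800 - 790) / 200, n_per_group=6280)
print('Approximate power = {:.3f}'.format(achieved_power))
```

With 6280 observations per group and a standardized effect size of 0.05, the achieved power comes out at about 0.8, which is consistent with the sample size computed for the means example.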
Notes
alpha and power are set to 0.05 and 0.8 respectively, which are the suggested default values. Be careful if you want to change them.
Power analysis is valid on the assumption that sample data are normally distributed.
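To see why changing alpha or power deserves care, the sketch below recomputes an approximate sample size for the means example under a few alternative settings. It relies on a normal approximation with equal group sizes; the helper name `approx_n_per_group` is hypothetical and not part of abexp.

```python
from math import ceil
from statistics import NormalDist


def approx_n_per_group(effect_size, alpha, power):
    """Approximate per-group sample size (normal approximation, two-sided test).

    Hypothetical helper (not part of abexp).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# Standardized effect size of the means example: (800 - 790) / 200
effect_size = (800 - 790) / 200

# Sample size for each combination of significance level and power
sizes = {
    (alpha, power): approx_n_per_group(effect_size, alpha, power)
    for alpha in (0.05, 0.01)
    for power in (0.8, 0.9)
}
for (alpha, power), n in sizes.items():
    print('alpha={}, power={} -> {} per group'.format(alpha, power, n))
```

A stricter significance level or a higher power both increase the required sample size, so tightening either setting makes the experiment more expensive to run.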