Analysis Frequentist Approach

This tutorial shows how to perform the post-test analysis of an A/B test experiment with two variants, the so-called control and treatment groups, using frequentist statistics. It handles both the comparison of means and the comparison of conversions with closed-form solutions. It assumes that the sample data are normally distributed.

Let’s first import the tools needed.

[1]:
import numpy as np
from abexp.core.analysis_frequentist import FrequentistAnalyzer
from abexp.visualization.analysis_plots import AnalysisPlot

Compare means

Here we want to compare the mean of the control group versus the mean of the treatment group given the sample observations.

[2]:
# Define the analyzer
analyzer = FrequentistAnalyzer()

We will compare the average revenue per user of the control group versus the treatment group, with a separate analysis for standard and premium users.

[3]:
# Revenue for standard users
np.random.seed(42)
revenueS_contr = np.random.normal(270, 200, 1000)
revenueS_treat = np.random.normal(300, 200, 1000)

# Revenue for premium users
revenueP_contr = np.random.normal(300, 200, 1000)
revenueP_treat = np.random.normal(310, 200, 1000)
[4]:
pval_S, ciS_contr, ciS_treat = analyzer.compare_mean_obs(obs_contr=revenueS_contr,
                                                         obs_treat=revenueS_treat,
                                                         alpha=0.05)

pval_P, ciP_contr, ciP_treat = analyzer.compare_mean_obs(obs_contr=revenueP_contr,
                                                         obs_treat=revenueP_treat,
                                                         alpha=0.05)
[5]:
print('Standard users: p-value = {:.6f}'.format(pval_S))
print('Premium  users: p-value = {:.6f}'.format(pval_P))
Standard users: p-value = 0.000005
Premium  users: p-value = 0.571544

If the p-value is \(\leq\) 0.05, the test result is statistically significant: there is a significant difference between the control and treatment groups.

Otherwise, if the p-value is \(>\) 0.05, the test result is not statistically significant: there is no statistically significant difference between the control and treatment groups. Here, the difference is significant for standard users but not for premium users.
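To make the decision rule above concrete, here is a minimal sketch of a closed-form comparison of two means. It uses a large-sample normal approximation (a two-sample z-test with unpooled variances), which is an assumption made for illustration and not necessarily the test that `compare_mean_obs` runs internally. The helper names `two_sample_z_pvalue` and `is_significant` are hypothetical, and the data are generated with the standard library rather than with the seeded NumPy arrays used above, so the p-value will not match the output above.

```python
import math
import random
import statistics


def two_sample_z_pvalue(obs_contr, obs_treat):
    """Two-sided p-value for a difference in means (normal approximation)."""
    n_contr, n_treat = len(obs_contr), len(obs_treat)
    diff = statistics.fmean(obs_treat) - statistics.fmean(obs_contr)
    # Standard error of the difference, using unpooled sample variances
    se = math.sqrt(statistics.variance(obs_contr) / n_contr
                   + statistics.variance(obs_treat) / n_treat)
    z = diff / se
    # Two-sided tail probability of a standard normal: P(|Z| >= |z|)
    return math.erfc(abs(z) / math.sqrt(2))


def is_significant(p_value, alpha=0.05):
    """Apply the decision rule: significant if p-value <= alpha."""
    return p_value <= alpha


# Illustrative data only (hypothetical, not the tutorial's arrays)
rng = random.Random(0)
contr = [rng.gauss(270, 200) for _ in range(1000)]
treat = [rng.gauss(300, 200) for _ in range(1000)]

p = two_sample_z_pvalue(contr, treat)
print('p-value = {:.6f}, significant: {}'.format(p, is_significant(p)))
```

With around 1000 observations per group the normal approximation is very close to a t-test, which is why it is a reasonable stand-in for a quick sanity check.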

[6]:
# Compute the mean of each group
meanS_contr = np.mean(revenueS_contr)
meanS_treat = np.mean(revenueS_treat)
meanP_contr = np.mean(revenueP_contr)
meanP_treat = np.mean(revenueP_treat)

Display the test results in a barplot.

[42]:
# Define height of the control group bars
bars_contr = [meanS_contr, meanP_contr]

# Define height of the treatment group bars
bars_treat = [meanS_treat, meanP_treat]

# Define lower and upper limits of the error bars for the control group
ci_contr = [[ciS_contr[0], ciP_contr[0]],  #  2.5th percentiles (lower bounds)
            [ciS_contr[1], ciP_contr[1]]]  # 97.5th percentiles (upper bounds)

# Define lower and upper limits of the error bars for the treatment group
ci_treat = [[ciS_treat[0], ciP_treat[0]],  #  2.5th percentiles (lower bounds)
            [ciS_treat[1], ciP_treat[1]]]  # 97.5th percentiles (upper bounds)

bars = [bars_contr, bars_treat]
ci = [ci_contr, ci_treat]

fig = AnalysisPlot.barplot(bars, ci, title='Barplot',
                           ylabel='average revenue per user',
                           xlabel=['standard', 'premium'],
                           groupslabel=['control', 'treatment'])
[Figure: barplot of the average revenue per user for standard and premium users, control versus treatment, with confidence-interval error bars]
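The error bars in the barplot are the confidence intervals returned by `compare_mean_obs`. As a rough sketch of how such an interval can be obtained in closed form, the snippet below computes a normal-approximation interval for a sample mean. The function name `mean_confidence_interval` is hypothetical, the data are illustrative, and the library may compute its intervals differently (for example with a t-distribution), so treat this as an assumption rather than a description of the library.

```python
import math
import random
import statistics


def mean_confidence_interval(obs, alpha=0.05):
    """Return (lower, upper) bounds of a (1 - alpha) interval for the mean."""
    mean = statistics.fmean(obs)
    # Standard error of the mean from the sample standard deviation
    se = statistics.stdev(obs) / math.sqrt(len(obs))
    # Critical value of the standard normal, about 1.96 for alpha = 0.05
    z = statistics.NormalDist().inv_cdf(1 - alpha / 2)
    return mean - z * se, mean + z * se


# Illustrative data only (hypothetical, not the tutorial's arrays)
rng = random.Random(1)
revenue = [rng.gauss(300, 200) for _ in range(1000)]

lower, upper = mean_confidence_interval(revenue, alpha=0.05)
print('95% CI for the mean: [{:.1f}, {:.1f}]'.format(lower, upper))
```

A smaller `alpha` gives a wider interval, and a larger sample gives a narrower one, since the standard error shrinks with the square root of the sample size.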

Compare conversions

Here we want to compare the number of users who made a purchase in the control group versus the treatment group.

[8]:
# Number of users that made a purchase
purchase_contr = 400
purchase_treat = 470

# Total number of users
total_usr_treat = 5000
total_usr_contr = 5000
[9]:
p_val, ci_contr, ci_treat = analyzer.compare_conv_stats(conv_contr=purchase_contr,
                                                        conv_treat=purchase_treat,
                                                        nobs_contr=total_usr_contr,
                                                        nobs_treat=total_usr_treat)
[10]:
print('p-value = {:.6f}'.format(p_val))
p-value = 0.013002

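In this case the p-value is \(\leq\) 0.05, so the test result is statistically significant: there is a significant difference between the control and treatment groups. The conversion rate is higher in the treatment group (470/5000 = 9.4%) than in the control group (400/5000 = 8.0%), so the treatment was successful.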
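For readers who want to see what a closed-form conversion comparison can look like, here is a minimal sketch of a pooled two-proportion z-test applied to the counts above. The function name `two_proportion_z_pvalue` is hypothetical, and the choice of a pooled z-test is an assumption for illustration; the source does not state which test `compare_conv_stats` uses.

```python
import math


def two_proportion_z_pvalue(conv_contr, nobs_contr, conv_treat, nobs_treat):
    """Two-sided p-value of a pooled two-proportion z-test."""
    rate_contr = conv_contr / nobs_contr
    rate_treat = conv_treat / nobs_treat
    # Pooled conversion rate under the null hypothesis of equal rates
    pooled = (conv_contr + conv_treat) / (nobs_contr + nobs_treat)
    se = math.sqrt(pooled * (1 - pooled) * (1 / nobs_contr + 1 / nobs_treat))
    z = (rate_treat - rate_contr) / se
    # Two-sided tail probability of a standard normal: P(|Z| >= |z|)
    return math.erfc(abs(z) / math.sqrt(2))


# Same counts as in the tutorial: 400/5000 (control) vs 470/5000 (treatment)
p = two_proportion_z_pvalue(conv_contr=400, nobs_contr=5000,
                            conv_treat=470, nobs_treat=5000)
print('p-value = {:.4f}'.format(p))
```

For these counts the sketch yields a p-value of roughly 0.013, which is in line with the value printed above.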