Analysis Bootstrap

This tutorial shows how to analyze the results of an A/B test using bootstrapping. The technique quantifies the uncertainty of a sample statistic (e.g. the sample mean), used as an estimate of a population parameter (e.g. the population mean), by repeatedly resampling with replacement from the observed dataset. It does not assume any particular distribution for the samples.

Let’s first import the tools we need.

[1]:
import numpy as np
import pandas as pd
from abexp.core.analysis_frequentist import FrequentistAnalyzer
from abexp.visualization.analysis_plots import AnalysisPlot

Simple bootstrap

Here we want to compare a specific metric of the control group against the same metric of the treatment group (e.g. average revenue per user). We will perform bootstrapping on the KPI metric (revenue) of each group.

[2]:
# Generate random data for revenue control group
revenue_contr = np.random.randint(low=50, high=500, size=100)

# Generate random data for revenue treatment group
revenue_treat = np.random.randint(low=50, high=600, size=100)
[3]:
# Define the analyzer
analyzer = FrequentistAnalyzer()
[4]:
# Define the aggregation function that will be applied on the sample
aggregation_func = np.mean

# Other possible aggregation functions:
#  - standard deviation = np.std,
#  - sum                = np.sum
#  - median             = lambda x: np.median(x, axis=0)

Bootstrapping will generate a sequence of N values (where N is the number of repetitions). The bootstrap function returns a table with the median, 2.5 percentile and 97.5 percentile of this sequence.

[5]:
# Perform bootstrapping on the control group
stats_contr = analyzer.bootstrap(revenue_contr, func=aggregation_func, rep=500)
stats_contr
[5]:
 median  2.5 percentile  97.5 percentile
282.475         255.092         307.1285
[6]:
# Perform bootstrapping on the treatment group
stats_treat = analyzer.bootstrap(revenue_treat, func=aggregation_func, rep=500)
stats_treat
[6]:
median  2.5 percentile  97.5 percentile
 305.5         276.192         339.1205
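
To make explicit what such a summary contains, here is a minimal plain-NumPy sketch of a percentile bootstrap. It is not the implementation behind analyzer.bootstrap; the helper name bootstrap_summary, its seed parameter and the synthetic data are assumptions made for illustration only.

```python
import numpy as np

def bootstrap_summary(sample, func=np.mean, rep=500, seed=None):
    """Hypothetical helper: percentile bootstrap summary of func(sample)."""
    rng = np.random.default_rng(seed)
    sample = np.asarray(sample)
    n = len(sample)
    # Each repetition resamples n observations with replacement
    # and applies the aggregation function to the resample.
    estimates = np.array([
        func(rng.choice(sample, size=n, replace=True))
        for _ in range(rep)
    ])
    lower, median, upper = np.percentile(estimates, [2.5, 50, 97.5])
    return {'median': median,
            '2.5 percentile': lower,
            '97.5 percentile': upper}

# Synthetic revenue data, illustrative only
rng = np.random.default_rng(0)
revenue = rng.integers(low=50, high=500, size=100)

summary = bootstrap_summary(revenue, func=np.mean, rep=500, seed=1)
print(summary)
```

The median of the bootstrap estimates serves as the point estimate, and the two percentiles bound a 95% percentile interval around it.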
[7]:
# Define heights of the bars
bars = [stats_contr['median'], stats_treat['median']]

# Collect the 2.5 and 97.5 percentile bounds of each group
ci_contr = [stats_contr['2.5 percentile'],
            stats_contr['97.5 percentile']]

ci_treat = [stats_treat['2.5 percentile'],
            stats_treat['97.5 percentile']]
[8]:
# Plot results with confidence interval
fig = AnalysisPlot.barplot(bars, [ci_contr, ci_treat],
                           groupslabel=['control group', 'treatment group'],
                           ylabel='average revenue per user', xlabel='')
[Figure: bar plot of average revenue per user for the control group and the treatment group, with confidence intervals]

In the bar plot above, the confidence intervals of the two groups overlap, so we cannot conclude that the empirical means of the control and treatment groups differ significantly.
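
Comparing two separate intervals by eye is a conservative check. A more direct alternative is to bootstrap the difference in means itself and see whether its interval contains zero. The sketch below uses plain NumPy rather than abexp; the synthetic data, the seed and the number of repetitions are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic data with the same ranges as in this tutorial (illustrative only)
control = rng.integers(low=50, high=500, size=100)
treatment = rng.integers(low=50, high=600, size=100)

rep = 2000
diffs = np.empty(rep)
for i in range(rep):
    # Resample each group independently, with replacement
    c = rng.choice(control, size=control.size, replace=True)
    t = rng.choice(treatment, size=treatment.size, replace=True)
    diffs[i] = t.mean() - c.mean()

# 95% percentile interval for the difference in means
lower, upper = np.percentile(diffs, [2.5, 97.5])
print(f"95% interval for the difference in means: [{lower:.1f}, {upper:.1f}]")

# An interval that excludes 0 suggests the two means differ
excludes_zero = lower > 0 or upper < 0
print("interval excludes zero:", excludes_zero)
```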

Time series bootstrap

Here we want to compare a specific metric of the control group against the same metric of the treatment group (e.g. average revenue per user) over time. We will perform bootstrapping on the KPI metric (revenue) of each group for each day. Note that the bootstrap function maintains the correlation across days.

[9]:
# Generate random data for revenue control group (one column per day)
revenue_contr_ts = pd.DataFrame({f'day{i}': np.random.randint(low=1, high=500, size=1000)
                                 for i in range(1, 8)})

# Generate random data for revenue treatment group (one column per day)
revenue_treat_ts = pd.DataFrame({f'day{i}': np.random.randint(low=1, high=600, size=1000)
                                 for i in range(1, 8)})
[10]:
# Perform bootstrapping on the control group
stats_contr_ts = analyzer.bootstrap(revenue_contr_ts, func=aggregation_func, rep=500)
stats_contr_ts
[10]:
        median  2.5 percentile  97.5 percentile
day1  246.6630      237.597475       255.912150
day2  248.0410      239.164575       256.451925
day3  250.4535      241.068275       259.431525
day4  252.0625      244.145850       261.326350
day5  246.4465      237.647800       255.376150
day6  252.0445      243.933075       261.501475
day7  249.0605      240.451950       257.918600
[11]:
# Perform bootstrapping on the treatment group
stats_treat_ts = analyzer.bootstrap(revenue_treat_ts, func=aggregation_func, rep=500)
stats_treat_ts
[11]:
        median  2.5 percentile  97.5 percentile
day1  305.8540      295.303525       315.689025
day2  297.1785      287.122225       308.437950
day3  311.1690      300.329400       322.258075
day4  297.0245      286.180500       307.432525
day5  302.7850      292.530300       313.642875
day6  300.4425      289.563675       311.556775
day7  299.9155      288.822375       310.642600
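
One common way for a bootstrap to maintain the correlation across days is to resample whole rows (users), so that every resampled user keeps their full history. Whether analyzer.bootstrap works exactly this way is an assumption; the NumPy sketch below only illustrates the idea on synthetic data in which a shared per-user component makes the days correlated.

```python
import numpy as np

rng = np.random.default_rng(7)

# Synthetic data: 1000 users (rows) x 7 days (columns).
# A shared per-user component makes the days correlated.
user_level = rng.normal(loc=250, scale=50, size=(1000, 1))
revenue = user_level + rng.normal(scale=20, size=(1000, 7))

rep = 500
n_users = revenue.shape[0]
daily_means = np.empty((rep, revenue.shape[1]))
for i in range(rep):
    # Draw user indices once and reuse them for every day, so each
    # resampled user keeps their full 7-day history.
    idx = rng.integers(low=0, high=n_users, size=n_users)
    daily_means[i] = revenue[idx].mean(axis=0)

# Per-day summary of the bootstrap estimates, each of shape (7,)
lower, median, upper = np.percentile(daily_means, [2.5, 50, 97.5], axis=0)
print(np.round(median, 2))
```

Resampling each day's column independently would instead break the link between a user's values on different days.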

Plot results with confidence intervals

[12]:
# Define heights of the bars
y = [stats_contr_ts['median'], stats_treat_ts['median']]

# Compute the error between median and percentiles
ci_treat_ts = [stats_treat_ts['median'] - stats_treat_ts['2.5 percentile'],
               stats_treat_ts['97.5 percentile'] - stats_treat_ts['median']]
ci_contr_ts = [stats_contr_ts['median'] - stats_contr_ts['2.5 percentile'],
               stats_contr_ts['97.5 percentile'] - stats_contr_ts['median']]
[13]:
fig = AnalysisPlot.timeseries_plot(y, [ci_contr_ts, ci_treat_ts])
[Figure: time series plot of the median revenue per day for the control group and the treatment group, with confidence intervals]
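
The error computation used for the time series plot turns percentile bounds into distances below and above the median. Stacked into an array of shape (2, N), these distances follow the asymmetric error-bar layout that Matplotlib's errorbar accepts for yerr; that AnalysisPlot.timeseries_plot relies on this layout internally is an assumption. The sketch reuses the first three rows of the control-group table above.

```python
import numpy as np

# First three rows (day1 to day3) of the control-group table above
median = np.array([246.6630, 248.0410, 250.4535])
lower = np.array([237.597475, 239.164575, 241.068275])
upper = np.array([255.912150, 256.451925, 259.431525])

# Row 0: distance below the median, row 1: distance above it
yerr = np.vstack([median - lower, upper - median])
print(yerr.shape)
print(np.round(yerr, 3))
```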