Pyabtest

A simple tool to calculate P-value after conducting an A/B experiment

A/B experiment & Hypothesis testing

We typically run an A/B experiment to see whether a new model improves production metrics. After running the experiment for a fixed time period, we use hypothesis testing to decide whether to accept the new model. A hypothesis test has the following components:

Null hypothesis: New model does not bring any improvement
Alternative hypothesis: New model does bring some improvement

This tool calculates the P-value, which is compared against the significance level (alpha) to decide whether the null hypothesis can be rejected.
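The decision step can be sketched in a few lines. This is only an illustration: the helper name `decide` and the wording of the "reject" message are assumptions, not part of pyabtest.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Compare a P-value with the significance level alpha (hypothetical helper)."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie in [0, 1]")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    # A P-value below alpha means the observed data would be unlikely
    # if the null hypothesis were true, so the null is rejected.
    if p_value < alpha:
        return "Reject null hypothesis"  # assumed wording
    return "Do not reject null hypothesis"


print(decide(0.58718, alpha=0.05))
```

A smaller alpha makes the test stricter: fewer false positives, at the cost of missing smaller real improvements.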

Installation

Use the package manager pip to install pyabtest:

pip install pyabtest

Usage

The package exposes the following functions:

1. Test for Sample Ratio Mismatch (SRM)

This test checks whether the audience was split between control and variant in a truly random manner. If there is an SRM, we should discard the A/B test results, because control and variant contain different types of audience. For example, we can pass the following counts for control and variant to check for SRM:

Number of male users vs Number of female users
Number of users of age < 40 vs Number of users of age >= 40
Number of active users vs Number of inactive users
Number of English-speaking users vs Number of non-English-speaking users
Number of mobile users vs Number of desktop users

Input: Control group 1 size, Control group 2 size, Variant group 1 size, Variant group 2 size

Output: P-value, Alpha, Decision

>>> import pyabtest
>>> pyabtest.test_for_sample_ratio_mismatch(control_group1_size=1000, control_group2_size=2000, variant_group1_size=1010, variant_group2_size=1990, alpha=0.05)
{'P-value': 0.78445, 'Alpha value (significance level)': 0.05, 'Decision': "Don't discard A/B test results"}

Test used: Chi-squared Test
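To show what a chi-squared test on these four counts involves, here is a minimal standard-library sketch of Pearson's chi-squared test for a 2x2 table. It assumes no continuity correction and is not pyabtest's actual implementation; the function name `chi_squared_2x2` is hypothetical.

```python
import math


def chi_squared_2x2(a: int, b: int, c: int, d: int) -> float:
    """P-value of Pearson's chi-squared test for the table [[a, b], [c, d]].

    Assumption: no continuity correction is applied.
    """
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs at least one observation")

    observed = ((a, b), (c, d))
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if group membership is independent of the split
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed[i][j] - expected) ** 2 / expected

    # A 2x2 table has 1 degree of freedom, where P(X > x) = erfc(sqrt(x / 2))
    return math.erfc(math.sqrt(statistic / 2))


# Rows: control and variant; columns: group 1 and group 2
p_value = chi_squared_2x2(1000, 2000, 1010, 1990)
print(round(p_value, 5))
```

A large P-value here means the group proportions in control and variant are consistent with a random split, so there is no evidence of SRM.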

2. Test for Binary Metric

This test can be used when the result/action/feedback is binary and we want to see whether the variant observations come from a different population than the control observations. For example, this test can be used in the following situations:

Clicks vs No clicks
Cart vs No cart
Order vs No order
Number of zero search results vs Number of non-zero search results
Number of successful sessions vs Number of non-successful sessions
Number of positive reviews vs Number of negative reviews
Number of converted users vs Number of non-converted users

Input: No. of successes in control, No. of failures in control, No. of successes in variant, No. of failures in variant

Output: P-value, Alpha, Decision

>>> import pyabtest
>>> pyabtest.test_for_binary_metric(control_success=50, control_failures=1000, variant_success=40, variant_failures=900, alpha=0.05)
{'P-value': 0.58718, 'Alpha value (significance level)': 0.05, 'Decision': 'Do not reject null hypothesis'}

Test used: Chi-squared Test
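For binary outcomes it often helps to look at the success rates alongside the P-value. The sketch below uses a pooled two-proportion z-test rather than the chi-squared test itself; for a 2x2 table its two-sided P-value coincides with Pearson's chi-squared test without continuity correction. The function name and return shape are assumptions for illustration, not pyabtest's API.

```python
import math


def two_proportion_z_test(control_success, control_failures,
                          variant_success, variant_failures):
    """Return (control rate, variant rate, two-sided P-value). Hypothetical helper."""
    n_control = control_success + control_failures
    n_variant = variant_success + variant_failures
    if n_control == 0 or n_variant == 0:
        raise ValueError("both groups need at least one observation")

    rate_control = control_success / n_control
    rate_variant = variant_success / n_variant

    # Under the null hypothesis both groups share one success rate
    pooled = (control_success + variant_success) / (n_control + n_variant)
    if pooled in (0.0, 1.0):
        raise ValueError("need at least one success and one failure overall")
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / n_control + 1 / n_variant))

    z = (rate_variant - rate_control) / std_error
    # Two-sided tail probability of the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_control, rate_variant, p_value


rate_c, rate_v, p = two_proportion_z_test(50, 1000, 40, 900)
print(f"control rate={rate_c:.4f}, variant rate={rate_v:.4f}, p={p:.5f}")
```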

3. Test for Numeric Metric

This test can be used for any generic numeric metric (count or fraction). It can be used even if the observations do not follow a normal distribution: Welch's t-test only needs the sample means to be approximately normal, which holds for reasonably large samples, while the bootstrap test is non-parametric and assumes nothing about the distribution. Example metrics include:

Number of clicks per unique user
Number of carts per unique user
Number of orders per unique user
Clicks/Views per unique user
Orders/Views per unique user
Orders/Session per unique user
Revenue per unique user
Session time per unique user
Order value per unique user
Successful sessions per unique user

Input: Control array (Ex: Array containing no. of clicks for each user in control, order does not matter), Variant array (Ex: Array containing no. of clicks for each user in variant, order does not matter)

Output: P-value, Alpha, Decision

>>> import pyabtest
>>> from numpy import random
>>> random.seed(12)
>>> pyabtest.test_for_numeric_metric(control_observations=random.randint(100, size=(20)), variant_observations=random.randint(100, size=(20)), test_type="ttest", alpha=0.05)
{'P-value': 0.18443, 'Alpha value (significance level)': 0.05, 'Decision': 'Do not reject null hypothesis'}

Test used by default: Welch's t-test. Pass test_type="bootstrap" to use a bootstrap test instead.
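To give a feel for how a bootstrap test arrives at a P-value, here is a minimal standard-library sketch of one common variant: a two-sided bootstrap test for a difference in means, with both samples shifted to a common mean to impose the null hypothesis. The function name, its parameters, and the choice of variant are assumptions; pyabtest's bootstrap test may work differently.

```python
import random


def _mean(values):
    return sum(values) / len(values)


def bootstrap_mean_difference_test(control, variant, n_resamples=10_000, seed=0):
    """Two-sided bootstrap P-value for a difference in means (hypothetical helper)."""
    control, variant = list(control), list(variant)
    if len(control) < 2 or len(variant) < 2:
        raise ValueError("each group needs at least two observations")

    rng = random.Random(seed)  # fixed seed keeps the result reproducible
    observed = _mean(variant) - _mean(control)

    # Impose the null hypothesis: shift both samples to the pooled mean
    pooled_mean = _mean(control + variant)
    control_mean, variant_mean = _mean(control), _mean(variant)
    control_null = [x - control_mean + pooled_mean for x in control]
    variant_null = [x - variant_mean + pooled_mean for x in variant]

    extreme = 0
    for _ in range(n_resamples):
        # Resample each group with replacement, keeping its original size
        c = rng.choices(control_null, k=len(control_null))
        v = rng.choices(variant_null, k=len(variant_null))
        if abs(_mean(v) - _mean(c)) >= abs(observed):
            extreme += 1

    # The +1 terms keep the estimate away from an impossible P-value of 0
    return (extreme + 1) / (n_resamples + 1)


# Illustrative, made-up clicks per user
control_clicks = [3, 0, 1, 4, 2, 0, 5, 1, 2, 3]
variant_clicks = [4, 1, 2, 6, 3, 1, 5, 2, 4, 3]
print(bootstrap_mean_difference_test(control_clicks, variant_clicks))
```

Because no distribution is assumed, this approach suits skewed metrics such as revenue per user, at the cost of more computation than a t-test.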

License

MIT

References

Hypothesis testing
Chi-squared test
Welch's t-test
Bootstrap test1, Bootstrap test2

Author

Rama Badrinath

Email: ramab1988@gmail.com
