How to run an experiment via the limit distribution (ABn test)

[1]:
import numpy as np
from scipy.stats import bernoulli

from lightautoml.addons.hypex.abn_test import min_sample_size
from lightautoml.addons.hypex.abn_test import test_on_marginal_distribution

Initialize random state

[2]:
seed = 42  # You can choose any number as the seed
random_state = np.random.RandomState(seed)
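
Seeding matters because the samples in the first experiment are drawn from this generator: the same seed reproduces the same draws. The sketch below illustrates the idea with the standard library's `random.Random` instead of NumPy, so that it runs without extra dependencies. `bernoulli_draws` is a hypothetical helper, not part of scipy or HypEx.

```python
import random


def bernoulli_draws(p, size, rng):
    """Draw `size` Bernoulli(p) values (0 or 1) from the given generator."""
    return [1 if rng.random() < p else 0 for _ in range(size)]


# Two generators created with the same seed yield identical sequences.
first = bernoulli_draws(0.3, 20, random.Random(42))
second = bernoulli_draws(0.3, 20, random.Random(42))
print(first == second)  # True
```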

Multiple testing for best sample selection

Number of samples and parameters

[3]:
num_samples = 10  # Number of samples
minimum_detectable_effect = 0.05  # MDE
assumed_conversion = 0.3  # Assumed conversion rate
significance_level = 0.05  # Significance level
power_level = 0.2  # power_level argument; 0.2 is presumably beta (type II error rate), i.e. power 1 - beta = 0.8

Calculate the minimum sample size

[4]:
sample_size = min_sample_size(
    num_samples,
    minimum_detectable_effect,
    variances=assumed_conversion * (1 - assumed_conversion),
    significance_level=significance_level,
    power_level=power_level,
    equal_variance=True,
)
print(f"Sample size = {sample_size}")
Sample size = 1313
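
As a rough sanity check on the order of magnitude, the classical two-sample formula n = 2 * sigma^2 * (z_(1 - alpha/2) + z_(1 - beta))^2 / MDE^2 can be evaluated with the standard library. This is a textbook normal approximation for comparing two groups, not the limit-distribution computation that `min_sample_size` performs, and it assumes that the value 0.2 plays the role of beta (power 0.8). `two_sample_size` is a hypothetical helper.

```python
import math
from statistics import NormalDist


def two_sample_size(mde, variance, alpha=0.05, beta=0.2):
    """Per-group size for a two-sided two-sample z-test (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(1 - beta)
    return math.ceil(2 * variance * (z_alpha + z_beta) ** 2 / mde ** 2)


p = 0.3  # assumed conversion rate, as in the notebook
n = two_sample_size(mde=0.05, variance=p * (1 - p))
print(f"Two-sample reference size = {n}")
```

The reference value comes out on the same order as the size reported by `min_sample_size`; the two need not coincide, since the ABn procedure compares all samples at once.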

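Testing samples with equal conversion rate

Judging by the outputs in this notebook, H(0) means that no sample is better than the others, while H(k) means that sample k is the best. Here H(0) is the expected result.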

[5]:
print("\nSamples with equal conversion rate")
for _ in range(5):
    samples = bernoulli.rvs(
        assumed_conversion, size=[num_samples, sample_size], random_state=random_state
    )
    hypothesis = test_on_marginal_distribution(
        samples, significance_level=significance_level
    )
    print(f"\tAccepted hypothesis H({hypothesis})")

Samples with equal conversion rate
        Accepted hypothesis H(0)
        Accepted hypothesis H(0)
        Accepted hypothesis H(0)
        Accepted hypothesis H(0)
        Accepted hypothesis H(0)

Testing the case where the last sample's conversion rate is higher by the MDE

[6]:
print("\nLast sample has higher conversion by MDE")
for _ in range(5):
    samples = [
        bernoulli.rvs(assumed_conversion, size=sample_size, random_state=random_state)
        for _ in range(num_samples - 1)
    ]
    samples.append(
        bernoulli.rvs(
            assumed_conversion + minimum_detectable_effect,
            size=sample_size,
            random_state=random_state,
        )
    )
    hypothesis = test_on_marginal_distribution(
        samples, significance_level=significance_level
    )
    print(f"\tAccepted hypothesis H({hypothesis})")

Last sample has higher conversion by MDE
        Accepted hypothesis H(10)
        Accepted hypothesis H(10)
        Accepted hypothesis H(10)
        Accepted hypothesis H(10)
        Accepted hypothesis H(10)

Multiple testing for the sample with the best revenue per client (conversion * price)

Parameters for different samples

[7]:
num_samples = 5  # Number of samples
minimum_detectable_effect = 2.5  # MDE in ARPU units (same currency as prices)
prices = [100, 150, 150, 200, 250]  # Tariff prices
conversions = [0.15, 0.1, 0.1, 0.075, 0.06]  # Tariff conversions
significance_level = 0.05  # Significance level
power_level = 0.2
# Variance of revenue per user: price^2 * p * (1 - p) for a Bernoulli conversion
variances = [
    price ** 2 * conversion * (1 - conversion)
    for price, conversion in zip(prices, conversions)
]
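
The parameters above can be checked by hand: revenue per user is price * X with X ~ Bernoulli(conversion), so ARPU = price * conversion and the variance is price^2 * conversion * (1 - conversion). The short sketch below recomputes both with plain Python; the variable names mirror the notebook.

```python
import math

prices = [100, 150, 150, 200, 250]
conversions = [0.15, 0.1, 0.1, 0.075, 0.06]

# ARPU = price * conversion
arpu = [price * conversion for price, conversion in zip(prices, conversions)]

# Var(price * X) = price^2 * p * (1 - p) for X ~ Bernoulli(p)
variances = [
    price ** 2 * conversion * (1 - conversion)
    for price, conversion in zip(prices, conversions)
]

for price, a, v in zip(prices, arpu, variances):
    print(f"price={price:>3}  ARPU={a:.2f}  variance={v:.1f}")
```

Every tariff has the same ARPU of 15 while the variances differ, which is why the sample size calculation for this experiment uses `equal_variance=False`.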

Calculate minimum sample size for unequal variances

[8]:
sample_size = min_sample_size(
    num_samples,
    minimum_detectable_effect,
    variances=variances,
    significance_level=significance_level,
    power_level=power_level,
    equal_variance=False,
)
print(f"Sample size = {sample_size}")
Sample size = 7200

Testing samples with equal ARPU (Average Revenue Per User)

[9]:
print("\nSamples with equal ARPU")
for _ in range(5):
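    # Revenue per user: the price if the user converts, otherwise 0.
    # No random_state is passed here, so these draws are not tied to the seed.
    samples = [
        price * bernoulli.rvs(conversion, size=sample_size)
        for price, conversion in zip(prices, conversions)
    ]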
    hypothesis = test_on_marginal_distribution(
        samples, significance_level=significance_level
    )
    print(f"\tAccepted hypothesis H({hypothesis})")

Samples with equal ARPU
        Accepted hypothesis H(0)
        Accepted hypothesis H(0)
        Accepted hypothesis H(4)
        Accepted hypothesis H(0)
        Accepted hypothesis H(0)
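
The single H(4) above is not necessarily a sign of a problem. If each run rejects a true H(0) with probability equal to the significance level and the runs are independent (both are simplifying assumptions), the chance of at least one false rejection across several runs follows from the complement rule. `prob_at_least_one` is a hypothetical helper.

```python
def prob_at_least_one(p_event, runs):
    """Probability that an event with per-run probability p_event
    occurs at least once in `runs` independent runs."""
    return 1 - (1 - p_event) ** runs


false_alarm = prob_at_least_one(0.05, 5)
print(f"P(at least one false rejection in 5 runs) = {false_alarm:.3f}")  # 0.226
```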

Testing the case where the last sample's ARPU is higher by the MDE

[10]:
print("\nLast sample has higher ARPU by MDE")
for _ in range(5):
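    # All tariffs except the last keep their base conversion.
    samples = [
        price * bernoulli.rvs(conversion, size=sample_size)
        for price, conversion in zip(prices[:-1], conversions[:-1])
    ]
    # Raise the last tariff's conversion so that its ARPU grows by exactly the MDE.
    samples.append(
        prices[-1]
        * bernoulli.rvs(
            conversions[-1] + minimum_detectable_effect / prices[-1], size=sample_size
        )
    )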
    hypothesis = test_on_marginal_distribution(
        samples, significance_level=significance_level
    )
    print(f"\tAccepted hypothesis H({hypothesis})")

Last sample has higher ARPU by MDE
        Accepted hypothesis H(5)
        Accepted hypothesis H(5)
        Accepted hypothesis H(5)
        Accepted hypothesis H(0)
        Accepted hypothesis H(5)
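
The uplift in the last cell is expressed in conversion terms: raising ARPU by the MDE at a fixed price means raising conversion by MDE / price. The sketch below checks that arithmetic. Its last lines also estimate how often a miss such as the H(0) above should be expected, under the assumptions that the design targets a power of 0.8 and that the runs are independent.

```python
import math

price = 250
base_conversion = 0.06
mde = 2.5

# Conversion uplift that corresponds to an ARPU uplift of `mde`.
boosted_conversion = base_conversion + mde / price
base_arpu = price * base_conversion
boosted_arpu = price * boosted_conversion

print(f"boosted conversion = {boosted_conversion:.3f}")
print(f"ARPU: {base_arpu:.2f} -> {boosted_arpu:.2f}")

# Assumption: each run detects the effect with probability 0.8, independently.
assumed_power = 0.8
p_some_miss = 1 - assumed_power ** 5
print(f"P(at least one miss in 5 runs) = {p_some_miss:.3f}")
```

Under these assumptions a miss in at least one of five runs is more likely than not, so one H(0) among four H(5) results is consistent with the design.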