Applied Science

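We compare two sample sizes in a binomial A/B experiment. The test is a two-sided Z-test on the difference of proportions with null hypothesis \(H_0:\; p_A = p_B\); power is evaluated at the specific alternative \(H_1:\; p_B - p_A = \Delta > 0\). The test statistic uses the pooled proportion \(\hat p\): \[ Z = \frac{\hat{p}_B - \hat{p}_A} {\sqrt{\hat p\bigl(1-\hat p\bigr) \bigl(\tfrac{1}{n_A} + \tfrac{1}{n_B}\bigr)}}, \qquad \hat p = \frac{k_A + k_B}{n_A + n_B}, \] where \(k_A, k_B\) are the observed success counts, \(n_A, n_B\) the arm sizes, and \(\hat p_A = k_A/n_A\), \(\hat p_B = k_B/n_B\).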
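As a minimal sketch of the statistic above, the following Python function computes \(Z\) and a two-sided p-value from the success counts. The function name `pooled_z_test`, its argument names, and the handling of the degenerate zero-variance case are illustrative assumptions, not part of the original tool.

```python
import math
from statistics import NormalDist


def pooled_z_test(k_a, n_a, k_b, n_b, alpha=0.05):
    """Two-sided pooled Z-test on the difference of two proportions.

    Returns (z, p_value, reject_h0). Names are hypothetical.
    """
    p_a_hat = k_a / n_a
    p_b_hat = k_b / n_b
    # Pooled proportion: the estimate of the common rate under H0.
    p_pool = (k_a + k_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    if se == 0.0:
        # Assumption: with zero pooled variance (all successes or all
        # failures) we report no evidence against H0.
        return 0.0, 1.0, False
    z = (p_b_hat - p_a_hat) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value, p_value < alpha
```

For example, `pooled_z_test(100, 1000, 130, 1000)` compares a 10% rate against a 13% rate with 1000 observations per arm.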

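The theoretical sample size \(n\) required to achieve a target power \(1-\beta\) at significance level \(\alpha\) is \[ n = 2\, \Bigl[ z_{1-\alpha/2}\,\sqrt{2\bar p\,(1-\bar p)} + z_{1-\beta}\, \sqrt{p_A(1-p_A) + p_B(1-p_B)} \Bigr]^2 \; \big/ \; (p_B - p_A)^2, \] where \(\bar p = \tfrac{p_A + p_B}{2}\) and \(z_q\) denotes the \(q\)-quantile of the standard normal distribution. The bracketed term squared and divided by \((p_B - p_A)^2\) is the standard per-arm sample size under equal allocation, so with the leading factor of 2, \(n\) counts both arms together (\(n_A = n_B = n/2\)).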
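The formula can be evaluated directly with the standard library. The sketch below assumes equal allocation and rounds the per-arm size up to a whole number before doubling; the function name `theoretical_sample_size` and the defaults \(\alpha = 0.05\), power \(= 0.80\) are illustrative assumptions.

```python
import math
from statistics import NormalDist


def theoretical_sample_size(p_a, p_b, alpha=0.05, power=0.80):
    """Return (per_arm, total) sample sizes for a two-sided Z-test.

    Hypothetical helper: evaluates the closed-form expression above,
    assuming both arms receive the same number of observations.
    """
    if p_a == p_b:
        raise ValueError("p_a and p_b must differ (Delta != 0)")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # z_{1 - alpha/2}
    z_beta = NormalDist().inv_cdf(power)           # z_{1 - beta}
    p_bar = (p_a + p_b) / 2
    bracket = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p_a * (1 - p_a) + p_b * (1 - p_b))
    )
    # Round the per-arm size up so the target power is not undershot.
    per_arm = math.ceil(bracket ** 2 / (p_b - p_a) ** 2)
    return per_arm, 2 * per_arm


per_arm, total = theoretical_sample_size(0.10, 0.12)
print(f"per arm: {per_arm}, total: {total}")
```

Because the required size scales with \(1/\Delta^2\), halving the detectable difference roughly quadruples the number of observations needed.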

With Monte Carlo simulations we repeat the experiment thousands of times and approximate:
• the empirical type-I error \(\hat\alpha\);
• the empirical type-II error \(\hat\beta\);
• the distribution of \(\hat\Delta\).
We then compare the performance of the theoretical sample size \(n\) with that of a custom sample size specified by the user.
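A minimal sketch of such a simulation is shown below, using only the Python standard library. It estimates \(\hat\alpha\) by simulating under \(H_0\) (both arms share \(p_A\)) and \(\hat\beta\) by simulating under \(H_1\) (arm B has \(p_A + \Delta\)). The function names, the equal-allocation assumption, the seed, and the default repetition count are all illustrative choices rather than the original tool's implementation.

```python
import math
import random
from statistics import NormalDist


def rejects(k_a, k_b, n, z_crit):
    """Two-sided pooled Z-test with n observations per arm."""
    p_pool = (k_a + k_b) / (2 * n)
    se = math.sqrt(p_pool * (1 - p_pool) * 2 / n)
    if se == 0.0:
        return False  # assumption: no rejection when variance is zero
    return abs((k_b - k_a) / n / se) > z_crit


def draw_successes(n, p, rng):
    """Binomial(n, p) draw as a sum of n Bernoulli trials."""
    return sum(rng.random() < p for _ in range(n))


def simulate(p_a, delta, n_per_arm, alpha=0.05, reps=2000, seed=0):
    """Estimate empirical type-I and type-II error rates by simulation."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    false_positives = 0
    misses = 0
    delta_hats = []
    for _ in range(reps):
        # Under H0: both arms share the same success probability.
        k_a = draw_successes(n_per_arm, p_a, rng)
        k_b = draw_successes(n_per_arm, p_a, rng)
        false_positives += rejects(k_a, k_b, n_per_arm, z_crit)
        # Under H1: arm B is better by delta.
        k_a = draw_successes(n_per_arm, p_a, rng)
        k_b = draw_successes(n_per_arm, p_a + delta, rng)
        misses += not rejects(k_a, k_b, n_per_arm, z_crit)
        delta_hats.append((k_b - k_a) / n_per_arm)
    return {
        "alpha_hat": false_positives / reps,
        "beta_hat": misses / reps,
        "delta_hats": delta_hats,
    }
```

Running `simulate` once with the theoretical per-arm size and once with a custom size gives two sets of estimates that can be compared side by side; empirical power is `1 - beta_hat`.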

Results Table

Scenario	Sample Size	Empirical Alpha	Empirical Beta	Power	Duration

Figures (not reproduced here): Power Curve; Histograms of \(\hat\Delta\); Empirical \(\alpha\) vs \(\beta\).