
Briefly explain A/B testing and its applications. What are some common pitfalls encountered in A/B testing?

1 Answer


A/B testing helps us determine whether a change causes a statistically significant change in performance. In other words, you aim to statistically estimate the impact of a given change within your digital product (for example). You measure success and counter metrics on at least one treatment group versus one control group (there can be more than one experiment group for multivariate tests).

Applications:

Consider the example of a general store that has sold bread, but not butter, for a year. To check whether bread sales depend on butter, suppose the store starts selling butter and sales over the next year are observed. We can then determine whether selling butter significantly increases, decreases, or does not affect the sale of bread.

While developing the landing page of a website, you create two different versions of the page. You define a criterion for success, e.g. conversion rate. Then you define your hypotheses. Null hypothesis (H0): there is no difference between the performance of the two versions. Alternative hypothesis (H1): version A will perform better than version B.
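To make the setup concrete, here is a minimal sketch of testing two conversion rates with a one-sided proportions z-test. The visitor and conversion counts below are made up for illustration, not taken from a real experiment:

```python
# Hypothetical conversion counts for two landing-page versions
# (the numbers are illustrative assumptions, not real data).
from statsmodels.stats.proportion import proportions_ztest

conversions = [230, 198]   # successes observed in versions A and B
visitors = [2000, 2000]    # visitors assigned to each version

# H0: equal conversion rates
# H1 (one-sided): version A's rate is larger than version B's
stat, p_value = proportions_ztest(conversions, visitors,
                                  alternative='larger')
print(f"z = {stat:.3f}, one-sided p = {p_value:.4f}")
```

With these counts the test sits close to the conventional 0.05 threshold, which is exactly the situation where a pre-registered significance level matters.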

NOTE: You will have to split your traffic randomly (to avoid sampling bias) between the two versions. The split doesn't have to be symmetric; you just need to meet the minimum sample size for each version so the comparison isn't underpowered.
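That minimum sample size is usually derived from a power analysis before the experiment starts. Here is a hedged sketch using statsmodels; the effect size, significance level, and power below are common defaults chosen for illustration:

```python
# Minimum sample size per group for a two-sample t-test,
# via statsmodels' power analysis. The parameter values are
# illustrative assumptions, not prescriptions.
from statsmodels.stats.power import tt_ind_solve_power

n_per_group = tt_ind_solve_power(
    effect_size=0.2,          # small standardized effect (Cohen's d)
    alpha=0.05,               # significance level
    power=0.8,                # desired power (1 - beta)
    ratio=1.0,                # equal group sizes
    alternative='two-sided',
)
print(f"minimum sample per group: {n_per_group:.0f}")
```

For a small effect (d = 0.2) this lands around 394 users per group, which is why tests on low-traffic pages often need to run for weeks.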

Now if version A gives better results than version B, we still have to show statistically that the results derived from our sample generalize to the entire population. One of the most common tests used to do so is the two-sample t-test, where we compare the p-value against a chosen significance level (alpha). If p-value < alpha, the null hypothesis is rejected.
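A minimal sketch of that two-sample t-test, using simulated metric values (e.g. time on page); the means, spread, and sample sizes are made-up assumptions for illustration:

```python
# Two-sample t-test on simulated per-user metric values.
# All numbers here are illustrative assumptions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
version_a = rng.normal(loc=5.3, scale=1.0, size=500)  # treatment
version_b = rng.normal(loc=5.0, scale=1.0, size=500)  # control

# Welch's variant (equal_var=False) avoids assuming equal variances,
# one of the sample-mismatch pitfalls listed below.
t_stat, p_value = stats.ttest_ind(version_a, version_b, equal_var=False)

alpha = 0.05
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

Note that rejecting the null only says the difference is unlikely under chance; the size of the difference (effect size) still has to be large enough to matter for the business.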

Common pitfalls:

  1. Success metrics that are wrong or inadequate for the business problem
  2. Lack of a counter metric: a change may add friction to the product alongside its positive impact
  3. Sample mismatch: heterogeneous control and treatment groups, unequal variances
  4. Underpowered test: a sample that is too small, or an experiment run for too short a time
  5. Not accounting for network effects (these introduce bias into the measurement)
