Overview
Imagine running a bake sale with two new cookie recipes: chocolate chip and oatmeal raisin. You want to see which sells better, but how long should you bake cookies? And what if you sneak a peek at the sales halfway through?
When you're running A/B tests, you might need to check the results frequently to make timely decisions. Checking often can lead to confusing results and false winners. This article explains two tools VWO provides, Sequential Testing and Bonferroni Correction, which help you make accurate decisions even if you check results early or have multiple variations in your test.
The Peeking Problem
Imagine you're baking cookies, and you want to stop them as soon as they are perfectly baked. However, if you keep opening the oven door to check them too often, you disturb the baking process and let out the heat, which can affect the outcome. Similarly, in A/B testing, frequently checking the results before the test is complete can disrupt the statistical process and increase the chances of a false winner.
The Problem:
Just like repeatedly opening the oven door, recalculating the statistics many times during a campaign increases the number of false winners. However, as experimentation scales up, early decisions become important because they save visitor quota. The peeking problem therefore needs statistics designed to handle it.
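To see the effect for yourself, here is a minimal simulation sketch. It assumes a plain two-proportion z-test at a 5% significance level, which is a textbook stand-in and not necessarily the statistic VWO computes. Both "recipes" have the same true conversion rate, so every declared winner is a false one. The function names and default values are hypothetical, chosen only for illustration.

```python
import math
import random


def z_statistic(conv_a, n_a, conv_b, n_b):
    """Two-proportion z statistic using a pooled standard error."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0  # no conversions yet (or all converted): nothing to compare
    return (conv_b / n_b - conv_a / n_a) / se


def simulate_peeking(n_sims=500, n_per_arm=1000, n_peeks=10,
                     rate=0.10, z_crit=1.96, seed=7):
    """Run A/A tests and return (false-winner rate with peeking,
    false-winner rate when looking only once at the end)."""
    rng = random.Random(seed)
    step = n_per_arm // n_peeks
    peek_false_winners = 0
    final_false_winners = 0
    for _ in range(n_sims):
        conv_a = conv_b = 0
        flagged_at_any_peek = False
        z = 0.0
        for visitors in range(1, n_per_arm + 1):
            # Both arms share the same true rate, so any "winner" is luck.
            conv_a += rng.random() < rate
            conv_b += rng.random() < rate
            if visitors % step == 0:  # open the oven door
                z = z_statistic(conv_a, visitors, conv_b, visitors)
                if abs(z) > z_crit:
                    flagged_at_any_peek = True
        peek_false_winners += flagged_at_any_peek
        final_false_winners += abs(z) > z_crit  # z from the last look only
    return peek_false_winners / n_sims, final_false_winners / n_sims


if __name__ == "__main__":
    with_peeking, end_only = simulate_peeking()
    print(f"false winners with 10 peeks: {with_peeking:.1%}")
    print(f"false winners with 1 look:   {end_only:.1%}")
```

Because the final look is also one of the peeks, the peeking rate can never be lower than the single-look rate, and in practice it comes out noticeably higher.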
Sequential Testing Correction: Your Peeking Partner
Sequential Testing mode comes in handy when you need to check your results frequently and make decisions appropriately. Think of it like a precise temperature control system in your oven that ensures the cookies will bake perfectly even if you open the door to check multiple times. This mode adjusts the statistical calculations to account for the multiple peeks, ensuring that your test results remain accurate and reliable.
With Sequential Testing, you can open the oven door (check the results) as often as needed without worrying about disrupting the baking process (statistical accuracy). It adjusts the significance levels appropriately, so you can make decisions based on interim results without increasing the risk of a false winner.
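This article does not describe the exact adjustment VWO applies, so the sketch below uses a deliberately simple and conservative stand-in: it splits the overall significance level evenly across the number of looks you plan to take. It only illustrates the general idea that each individual peek must clear a stricter bar. The function name and the choice of an even split are assumptions made for illustration.

```python
from statistics import NormalDist


def per_look_threshold(alpha=0.05, planned_looks=1):
    """Two-sided critical z value when alpha is split evenly across looks.

    Illustrative assumption: an even split, which is conservative and is
    not claimed to be the procedure VWO uses.
    """
    alpha_per_look = alpha / planned_looks
    return NormalDist().inv_cdf(1 - alpha_per_look / 2)


if __name__ == "__main__":
    for looks in (1, 5, 10):
        print(looks, "looks -> |z| must exceed",
              round(per_look_threshold(planned_looks=looks), 3))
```

The more often you plan to open the oven door, the higher the threshold each look has to pass before a winner is declared.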
Fixed Horizon Testing
Alternatively, you can select Fixed Horizon Testing mode, which is more robust to weekly fluctuations in visitor behavior. However, in this mode you do not get early winner or disable recommendations, which can help save visitors in the campaign.
In Fixed Horizon Mode, the visitor requirements are calculated once 500 visitors and 1 conversion have been collected on the baseline. All statistical calculations are done once, after the required number of visitors has been collected in the test. Note that Experiment Vital checks continue to run during the campaign in Fixed Horizon Mode.
Multiple Recipes, Multiple Chances of Luck
What if you have more than two recipes (variations) in your A/B test? The more recipes you test, the higher the chance of a lucky winner appearing by accident (increased false positive rate).
Bonferroni Correction: Ensuring Fairness
Bonferroni Correction helps when you are running a test with several variations. With multiple recipes, it might be easier for one to seem like a winner by chance. This correction adjusts the results to make sure the chance of a lucky winner stays low, even with many recipes. Imagine giving each recipe a slightly smaller "slice of the pie" (probability of winning) to account for the extra competition.
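The arithmetic behind this can be sketched in a few lines. The example assumes a 5% significance level per comparison and, for the first function only, that the comparisons are independent. Variations that share a baseline are not strictly independent, but the Bonferroni guarantee holds either way. The function names are hypothetical and the numbers are illustrative, not VWO defaults.

```python
def chance_of_any_lucky_winner(alpha, n_variations):
    """Probability that at least one variation wins by luck,
    assuming independent comparisons against the baseline."""
    return 1 - (1 - alpha) ** n_variations


def bonferroni_alpha(alpha, n_variations):
    """Stricter per-variation threshold: the overall alpha shared equally."""
    return alpha / n_variations


if __name__ == "__main__":
    alpha = 0.05
    for n in (1, 3, 5):
        uncorrected = chance_of_any_lucky_winner(alpha, n)
        corrected = chance_of_any_lucky_winner(bonferroni_alpha(alpha, n), n)
        print(f"{n} variation(s): uncorrected {uncorrected:.3f}, "
              f"with Bonferroni {corrected:.3f}")
```

With three variations, the uncorrected chance of at least one lucky winner is roughly 14%, while the corrected threshold keeps it at or below the original 5%.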
Where to Access the Statistical Corrections?
1. From the main menu, go to the campaign whose reports you want to access.
2. Go to the Reports tab.
3. On the report header, click on Statistical Configuration.
4. The CAMPAIGN SPECIFIC section features the statistical corrections you can use for your campaign. By default, Sequential Testing mode is selected in your campaign. To modify, click on the pencil icon.
5. You can either switch to Bonferroni Correction or apply it in parallel. Select the relevant checkboxes and click on Save Changes.
When to Apply the Statistical Corrections?
The two corrections provided are for different purposes and should be applied as follows:
- Sequential Testing mode should be applied whenever you plan to act on disable and winner recommendations in a running campaign before the maximum visitor count is collected.
- Bonferroni Correction should be applied whenever the campaign includes more than one variation (apart from the baseline).
The Trade-off
Both corrections help you increase the accuracy of the campaign winners, but there is a trade-off. Applying the corrections increases the maximum number of visitors required for the campaign. However, the benefits of early stopping and increased accuracy outweigh this cost, and we strongly recommend you apply the relevant corrections whenever required.
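To get a feel for the size of this trade-off, the sketch below uses the textbook sample-size formula for comparing two conversion rates. This is an assumption for illustration: the way VWO calculates visitor requirements is not described in this article and may differ. The conversion rates, the 80% power, and the function name are all hypothetical.

```python
import math
from statistics import NormalDist


def visitors_per_variation(base_rate, target_rate, alpha=0.05, power=0.80):
    """Approximate visitors needed per variation for a two-sided test
    comparing two conversion rates (standard normal approximation)."""
    normal = NormalDist()
    z_alpha = normal.inv_cdf(1 - alpha / 2)  # stricter alpha -> larger value
    z_power = normal.inv_cdf(power)
    variance = base_rate * (1 - base_rate) + target_rate * (1 - target_rate)
    effect = target_rate - base_rate
    return math.ceil((z_alpha + z_power) ** 2 * variance / effect ** 2)


if __name__ == "__main__":
    # Baseline converts at 10%; we want to detect a lift to 12%.
    plain = visitors_per_variation(0.10, 0.12)
    # Same test with the threshold shared across 3 variations.
    corrected = visitors_per_variation(0.10, 0.12, alpha=0.05 / 3)
    print("without correction:", plain)
    print("with correction for 3 variations:", corrected)
```

The stricter threshold raises the visitor requirement, which is the cost you pay for keeping lucky winners rare.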