Overview
The expected average is a key concept that helps us understand the typical behavior of visitors on a website. When we collect data, such as the number of visitors and the number of conversions (like purchases), we calculate the average conversion rate by dividing the total conversions by the total number of visitors. This gives us an idea of how the website is performing overall.
However, this average is based on a sample of visitors, not the entire population of all possible visitors. Therefore, it’s an estimate that might be close to the true average but not exactly the same.
Understanding Uncertainty in Averages
Since our average is based on a sample, there is some uncertainty about how close it is to the true average. Statistics helps us estimate this uncertainty. At VWO, we use Bayesian statistics to calculate statistical significance.
What is Bayesian Statistics?
Bayesian statistics is a way of updating our understanding of something as we get more information. Think of it as starting with an initial guess and then refining that guess as we gather more data.
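The idea of refining an initial guess can be illustrated with a small sketch. It uses a textbook Beta prior on a conversion rate, which is an assumption made purely for illustration and is not the normal posterior model this article describes; the function names and the batches of visitors are hypothetical.

```python
def update_beta(alpha, beta, conversions, visitors):
    """Fold one batch of data into a Beta(alpha, beta) belief about a rate."""
    return alpha + conversions, beta + (visitors - conversions)


def beta_mean(alpha, beta):
    """Current best guess for the conversion rate."""
    return alpha / (alpha + beta)


# Initial guess: Beta(1, 1), i.e. every conversion rate is equally plausible.
alpha, beta = 1.0, 1.0

# Hypothetical batches of (conversions, visitors) arriving over time.
batches = [(3, 50), (12, 100), (85, 850)]

estimates = [beta_mean(alpha, beta)]
for conversions, visitors in batches:
    alpha, beta = update_beta(alpha, beta, conversions, visitors)
    estimates.append(beta_mean(alpha, beta))

for step, estimate in enumerate(estimates):
    print(f"after batch {step}: estimated rate {estimate:.2%}")
```

The estimate starts at 50% (no information) and moves toward the observed rate as each batch is added.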
Introduction to Normal Posterior
In Bayesian statistics, a posterior is an updated understanding of a parameter (like an average) after considering the data we have. The "Normal Posterior" is a specific type of posterior distribution that follows a bell-shaped curve, known as the normal distribution. This distribution helps us quantify the uncertainty around the expected average.
Put simply, a normal posterior says: "Based on the data we have, the true average is most likely around this value, but it could reasonably fall within this range."
Connecting Expected Average and Normal Posterior
- Start with Empirical Data: We begin with raw data from your website, such as the number of visitors and conversions.
- Calculate Sample Average: Compute the average conversion rate (conversions divided by visitors).
- Model Uncertainty: Use Bayesian statistics to create a probability distribution (Normal Posterior) that shows a range of possible true averages.
For example, if the normal posterior for the control group shows an expected average of 10%, it might also show that there's a 99% chance the true average is between 7.56% and 12.44%. This range represents the uncertainty in our estimate.
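The example above can be reproduced with a short sketch. The article does not state the sample size; 1,000 visitors and 100 conversions are assumed here because they yield the quoted numbers. The sketch also assumes the posterior's spread is the binomial standard error, and the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def conversion_posterior(visitors, conversions):
    """Normal posterior for a conversion rate (assumes binomial standard error)."""
    rate = conversions / visitors
    std_error = sqrt(rate * (1 - rate) / visitors)
    return NormalDist(mu=rate, sigma=std_error)


def credible_interval(posterior, level=0.99):
    """Central interval that contains `level` of the posterior probability."""
    tail = (1 - level) / 2
    return posterior.inv_cdf(tail), posterior.inv_cdf(1 - tail)


# Assumed sample: 1,000 visitors, 100 conversions (not stated in the article).
posterior = conversion_posterior(visitors=1000, conversions=100)
low, high = credible_interval(posterior, level=0.99)

print(f"expected average: {posterior.mean:.2%}")  # 10.00%
print(f"99% interval: {low:.2%} to {high:.2%}")   # 7.56% to 12.44%
```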
Why Use Normal Posterior?
When the sample is large enough (a common rule of thumb is more than 30 observations), the sample average tends to follow an approximately normal distribution, whatever the distribution of the individual observations. This is the central limit theorem. Since most A/B tests are run on samples much larger than 30, we find normal distributions to be a robust option for analysis: they fit well in most cases and make calculations easier.
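A quick simulation can make this tendency concrete. The sketch below draws many hypothetical samples of visitors with an assumed true conversion rate of 10% and checks that the sample averages cluster the way a normal distribution predicts. All constants are illustrative assumptions.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(42)  # fixed seed so the run is repeatable

TRUE_RATE = 0.10     # assumed true conversion rate
SAMPLE_SIZE = 500    # visitors per simulated test
NUM_SAMPLES = 2000   # number of simulated tests


def sample_conversion_rate(n, rate):
    """Simulate n visitors and return the observed conversion rate."""
    return sum(random.random() < rate for _ in range(n)) / n


sample_rates = [
    sample_conversion_rate(SAMPLE_SIZE, TRUE_RATE) for _ in range(NUM_SAMPLES)
]

observed_spread = stdev(sample_rates)
predicted_spread = sqrt(TRUE_RATE * (1 - TRUE_RATE) / SAMPLE_SIZE)

# For a normal distribution, roughly 95% of values lie within 2 standard deviations.
within_two = sum(
    abs(r - TRUE_RATE) <= 2 * predicted_spread for r in sample_rates
) / NUM_SAMPLES

print(f"mean of sample averages: {mean(sample_rates):.4f}")
print(f"observed spread: {observed_spread:.4f}, predicted: {predicted_spread:.4f}")
print(f"share within 2 standard deviations: {within_two:.1%}")
```

Each individual visitor is a yes/no outcome, far from bell-shaped, yet the averages across samples behave approximately normally.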
At VWO, we use normal posteriors to represent the true population averages for all metrics. This helps us predict the true average behavior of the entire population based on the sample data collected during a test.
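Here’s the simplified formula we use, written as $\mathcal{N}(\text{mean}, \text{standard deviation})$:
- For the control group: $\mathcal{N}\left(\mu_c,\ \sigma_c / \sqrt{n_c}\right)$
- For the variation group: $\mathcal{N}\left(\mu_v,\ \sigma_v / \sqrt{n_v}\right)$
In these formulae, the subscripts $c$ and $v$ denote the control and the variation, and:
- μ represents the average or the mean value.
- σ represents the variability, that is, how spread out the values are.
- n represents the number of samples (observations).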
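As a minimal sketch of how the mean, the variability, and the number of samples could combine into a normal posterior, the code below assumes the posterior is centred on the sample mean with a standard deviation equal to the standard error (variability divided by the square root of n). The revenue-per-visitor values and the function name are hypothetical, and the lists are repeated only to get a sample well above 30.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def normal_posterior(values):
    """Build a normal posterior for the true average of `values`.

    Assumes: centre = sample mean, spread = sample std dev / sqrt(n).
    """
    n = len(values)
    mu = mean(values)
    sigma = stdev(values)
    return NormalDist(mu=mu, sigma=sigma / sqrt(n))


# Hypothetical revenue per visitor; repeated to simulate 200 observations each.
control_revenue = [0, 0, 12.5, 0, 30.0, 0, 0, 8.0, 0, 19.5] * 20
variation_revenue = [0, 14.0, 0, 0, 35.0, 0, 9.5, 0, 0, 21.0] * 20

control_posterior = normal_posterior(control_revenue)
variation_posterior = normal_posterior(variation_revenue)

print(f"control:   mean {control_posterior.mean:.2f}, "
      f"spread {control_posterior.stdev:.3f}")
print(f"variation: mean {variation_posterior.mean:.2f}, "
      f"spread {variation_posterior.stdev:.3f}")
```

The same function works for any numeric metric, which matches the idea of using one posterior family for all metrics.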
Conclusion
Each metric modeled using a probability distribution in A/B testing includes an expected value, which is the most probable outcome after full-scale deployment, and an expected interval, indicating the range within which the true average likely falls. As more data accumulates, this expected interval narrows, refining the accuracy of our predictions.
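The narrowing of the expected interval can be sketched numerically. The code assumes a conversion rate that stays at 10% and a spread given by the binomial standard error; the visitor counts and the function name are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def interval_width(rate, visitors, level=0.99):
    """Width of the central `level` interval of a normal posterior."""
    std_error = sqrt(rate * (1 - rate) / visitors)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return 2 * z * std_error


# Assumed: the observed rate stays at 10% while traffic grows fourfold each step.
widths = {n: interval_width(0.10, n) for n in (1_000, 4_000, 16_000)}

for visitors, width in widths.items():
    print(f"{visitors:>6} visitors: interval width {width:.2%}")
```

Under these assumptions the width shrinks with the square root of the sample size, so four times the traffic halves the interval.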
To determine a statistically significant difference between the baseline and the variation, we calculate the expected improvement. This is a direct comparison between the normal posteriors of the baseline and the variation, providing a clearer statistical perspective on whether the observed improvements are genuinely significant or merely the result of sample variability.
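One way such a comparison could be carried out is sketched below. It is not necessarily the exact computation VWO performs: it assumes the two posteriors are independent, so their difference is also normal, and it reports the relative improvement at the posterior means along with the probability that the variation's true average exceeds the baseline's. The visitor and conversion counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def conversion_posterior(visitors, conversions):
    """Normal posterior for a conversion rate (assumes binomial standard error)."""
    rate = conversions / visitors
    return NormalDist(mu=rate, sigma=sqrt(rate * (1 - rate) / visitors))


# Hypothetical test data.
control = conversion_posterior(visitors=5000, conversions=500)    # 10.0%
variation = conversion_posterior(visitors=5000, conversions=575)  # 11.5%

# Difference of two independent normals: means subtract, variances add.
difference = NormalDist(
    mu=variation.mean - control.mean,
    sigma=sqrt(control.variance + variation.variance),
)

relative_improvement = difference.mean / control.mean
prob_variation_better = 1 - difference.cdf(0)

print(f"relative improvement at the means: {relative_improvement:.1%}")
print(f"probability variation beats baseline: {prob_variation_better:.1%}")
```

A probability close to 100% suggests the observed improvement is unlikely to be a product of sample variability alone.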
At VWO, the expected averages are represented in the second column of the table. By adjusting the table settings, you can also see the expected intervals at all times in a campaign.