Overview
Imagine you're running a test on your website, where Version A (baseline) has a 10% conversion rate, and Version B (variation) has an 11% conversion rate. To understand which version performs better, you need to measure the improvement. Improvement in testing involves comparing the conversion rates of the baseline and the variation.
Each variation has its own improvement measure, calculated relative to the baseline. The baseline itself doesn’t have an improvement statistic because comparing something to itself doesn’t provide any new insights.
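As a quick illustration of the example above, the observed improvement can be computed directly from the two conversion rates. This is a minimal sketch; the function name `observed_improvement` is hypothetical, and it assumes improvement is reported relative to the baseline rate.

```python
def observed_improvement(baseline_rate, variation_rate):
    """Return (absolute, relative) observed improvement over the baseline.

    Hypothetical helper for illustration; assumes improvement is
    expressed relative to the baseline conversion rate.
    """
    if baseline_rate <= 0:
        raise ValueError("baseline_rate must be positive")
    absolute = variation_rate - baseline_rate
    relative = absolute / baseline_rate
    return absolute, relative


# Version A converts at 10%, Version B at 11%.
absolute, relative = observed_improvement(0.10, 0.11)
print(f"absolute: {absolute:.2%}, relative: {relative:.1%}")
```

In this example the variation is one percentage point higher in absolute terms, which is a 10% relative improvement over the baseline.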
Understanding Improvement and Bayesian Inference
Just like sample averages, you can calculate improvement from the empirically observed data in the campaign. However, this number represents only the observed uplift for the data collected in the test so far. With Bayesian statistics, VWO calculates the projected uplift that can be expected if a variation is deployed. This is represented by an expected improvement and a confidence interval.
Introduction to Improvement Distribution
An improvement distribution is a statistical way of summarizing the potential improvement of a variation over the baseline. It helps in understanding how the variation is likely to perform if applied to the entire population of visitors.
Calculation of Improvement Distribution
The improvement distribution is calculated as the difference between the posterior distribution of the variation and the posterior distribution of the baseline.
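As a sketch of that difference, assume the baseline and variation posteriors are independent normal distributions. The symbols below (mean and variance of each posterior, with subscript A for the baseline and B for the variation) are assumptions introduced for illustration, not notation taken from VWO:

```latex
\begin{aligned}
\theta_A &\sim \mathcal{N}(\mu_A, \sigma_A^2), \qquad \theta_B \sim \mathcal{N}(\mu_B, \sigma_B^2) \\
\Delta &= \theta_B - \theta_A \\
\Delta &\sim \mathcal{N}\left(\mu_B - \mu_A,\; \sigma_A^2 + \sigma_B^2\right)
\end{aligned}
```

The means subtract while the variances add, so the improvement is always more uncertain than either posterior on its own.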
The difference is not scaled by the baseline average at this stage; that scaling is applied only when the distribution is displayed in reports. Because these metrics are modeled as normal distributions, the resulting improvement distribution is also normal. Once scaled, it gives a percentage representing how much better or worse the variation performs compared to the baseline.
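The calculation and the display-time scaling can be sketched with Python's standard `statistics.NormalDist`. This is an illustrative sketch, not VWO's implementation: the posterior means and standard deviations are hypothetical values (roughly what 10,000 visitors per version would give), and the baseline mean is treated as a fixed constant when scaling.

```python
from statistics import NormalDist

# Hypothetical posterior summaries for the conversion rate of each version.
baseline = NormalDist(mu=0.10, sigma=0.0030)
variation = NormalDist(mu=0.11, sigma=0.0031)

# Unscaled improvement: difference of two independent normals.
# NormalDist subtraction subtracts the means and adds the variances.
improvement = variation - baseline

# Display-time scaling: divide by the baseline mean to get a relative
# improvement. Treating the baseline mean as a constant is an assumption.
relative_improvement = improvement / baseline.mean

print(f"unscaled mean: {improvement.mean:.4f}, sd: {improvement.stdev:.4f}")
print(f"relative mean: {relative_improvement.mean:.1%}, "
      f"sd: {relative_improvement.stdev:.1%}")
```

Dividing a normal distribution by a constant yields another normal distribution, which is why the scaled improvement can still be summarized by a mean and a standard deviation.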
The Power of Improvement Distribution
The real strength of the improvement distribution lies in its ability to provide a probabilistic understanding of potential improvements. ROPE, which stands for Region of Practical Equivalence, is a crucial concept in this context. By setting appropriate ROPE values, businesses can define a range within which changes are considered practically insignificant. The share of the distribution that falls outside this range then indicates how likely the variation is to deliver a meaningful change, offering a robust basis for decision-making. The improvement distribution is divided into three regions:
- Worse: (-infinity to -ROPE)
- Equivalent: (-ROPE to +ROPE)
- Better: (+ROPE to infinity)
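The probability of each region can be read off the normal improvement distribution with its cumulative distribution function. The sketch below is illustrative only: the function name `rope_probabilities`, the example distribution, and the ROPE of 1% are hypothetical, and the ROPE is assumed to be expressed in the same units as the improvement.

```python
from statistics import NormalDist


def rope_probabilities(improvement, rope):
    """Split a normal improvement distribution into the three ROPE regions.

    Hypothetical helper; `rope` is assumed to use the same units as the
    improvement distribution (here, relative improvement).
    """
    if rope < 0:
        raise ValueError("rope must be non-negative")
    worse = improvement.cdf(-rope)           # mass in (-infinity, -ROPE)
    better = 1.0 - improvement.cdf(rope)     # mass in (+ROPE, infinity)
    equivalent = 1.0 - worse - better        # mass in (-ROPE, +ROPE)
    return {"worse": worse, "equivalent": equivalent, "better": better}


# Hypothetical relative improvement: mean 10%, standard deviation 4.3%.
probs = rope_probabilities(NormalDist(mu=0.10, sigma=0.043), rope=0.01)
for region, p in probs.items():
    print(f"{region}: {p:.1%}")
```

A wider ROPE moves probability from the Worse and Better regions into Equivalent, so the chosen value directly affects how readily a variation is judged better.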
Conclusion
Improvement distributions encapsulate both the expected uplift and the chance of the uplift being significant. Both of these metrics are crucial for businesses aiming to make data-driven decisions about website optimizations.
At VWO, we display the improvement distribution, providing an expected value (the most probable outcome if the variation is implemented) and an expected interval (the range within which the actual outcome is likely to fall). As more data is collected, these intervals become narrower, increasing the reliability of predictions.
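The narrowing of the interval with more data can be sketched as follows. Everything here is an assumption made for illustration: each posterior is approximated as a normal with standard deviation sqrt(p(1 - p) / n), the interval covers the central 95% of the distribution, and the helper names are hypothetical. VWO's actual priors, corrections, and interval width may differ.

```python
from math import sqrt
from statistics import NormalDist


def relative_improvement(p_baseline, p_variation, visitors):
    """Normal approximation of the relative improvement (hypothetical helper)."""
    a = NormalDist(p_baseline, sqrt(p_baseline * (1 - p_baseline) / visitors))
    b = NormalDist(p_variation, sqrt(p_variation * (1 - p_variation) / visitors))
    return (b - a) / p_baseline


def central_interval(dist, mass=0.95):
    """Central interval containing `mass` of the distribution (assumed 95%)."""
    tail = (1.0 - mass) / 2.0
    return dist.inv_cdf(tail), dist.inv_cdf(1.0 - tail)


# Same observed rates (10% vs 11%), increasing visitors per version.
widths = []
for visitors in (1_000, 10_000, 100_000):
    low, high = central_interval(relative_improvement(0.10, 0.11, visitors))
    widths.append(high - low)
    print(f"{visitors:>7} visitors: {low:+.1%} to {high:+.1%}")
```

Under these assumptions the interval width shrinks roughly with the square root of the number of visitors, so ten times more traffic narrows the interval by a factor of about three.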
All statistical corrections are applied to the improvement distribution, and the final adjusted improvement represents the variation’s performance. In VWO reports, the improvement distribution graph, the expected value, and the expected intervals are shown. If the In-depth data review mode is enabled, the improvement intervals are displayed in graphical form for quick inference of variation performance.
This understanding allows for a more comprehensive analysis and aids in making informed decisions about which website variations to implement.