Overview
A holdout group is a small, consistent set of users who are intentionally excluded from specific feature flags or all new features for a predefined period. By comparing the performance of this group against the rest of your audience, you gain an unbiased look at your true Return on Investment (ROI) and long-term conversion lift.
Holdout groups enable you to measure the cumulative impact of multiple features over time. While individual features might be successful on their own, a holdout group provides a statistically valid way to quantify whether the collective changes to your product are genuinely improving key metrics or simply adding noise.
For example, imagine an e-commerce team launching a series of major updates over a quarter: a multi-step guest checkout, a Frequently Bought Together recommendation widget, and a dynamic search-as-you-type bar. You might A/B test each feature individually and see small wins, but it is difficult to see the big-picture impact on your bottom line. By maintaining a holdout group, you keep a small segment of users on the original baseline version of your store.
At the end of the quarter, you can compare the holdout group (who did not see any new features) against the exposed group (who experienced all three new features). If the exposed group shows a significant increase in Average Order Value (AOV) compared to the holdout group, it indicates that the users exposed to your product changes are generating more revenue. This difference provides evidence that your collective product roadmap, rather than random fluctuations in user behavior, is driving the revenue growth.
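The end-of-quarter comparison can be sketched as a simple relative-lift calculation. This is only an illustration of the arithmetic: the order totals are made up, the function names are hypothetical, and the result is a raw point estimate rather than the Bayesian analysis that VWO reports.

```python
from statistics import mean


def average_order_value(order_totals):
    """Mean revenue per order for one cohort."""
    return mean(order_totals)


def relative_lift(exposed_value, holdout_value):
    """Relative improvement of the exposed cohort over the holdout baseline."""
    return (exposed_value - holdout_value) / holdout_value


# Made-up order totals, for illustration only.
holdout_orders = [40.0, 50.0, 60.0]   # users kept on the baseline store
exposed_orders = [44.0, 55.0, 66.0]   # users who saw all three features

holdout_aov = average_order_value(holdout_orders)
exposed_aov = average_order_value(exposed_orders)
lift = relative_lift(exposed_aov, holdout_aov)

print(f"AOV lift vs. holdout: {lift:.1%}")
```

A point estimate like this does not say whether the difference is statistically meaningful, so treat it as a reading aid for the report rather than a replacement for it.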
Key Benefits
- Quantify Cumulative Impact: Measure how multiple features work together to move your True North metrics over months, not just individual sprints.
- Scientific Precision: Separate the impact of your feature releases from random fluctuations in user behavior. Even with a small holdout (as low as 1% to 5% of traffic), VWO’s Bayesian statistical engine can accurately predict incremental lift and long-term trends.
- Simplified Implementation: The VWO SDK automatically handles user bucketing and serves default values to the holdout group, so developers do not have to write custom code.
- Unbiased Reporting: Access side-by-side comparisons of metric conversions, calculated uplift, and expected improvement directly within the VWO interface.
Prerequisites
You must have:
- A VWO account with Feature Experimentation enabled.
- At least one feature flag created.
- The associated feature flag must be active: For a holdout to start collecting data and function correctly, the feature flag linked to the holdout must be enabled and have at least one active rule (rollout, A/B, MVT, or Personalize).
- VWO FE SDKs must be updated to the latest version to support holdout configuration and automatic default variable serving.
Configure Holdout Groups
To configure a holdout group, you must set it up and define how it is evaluated. This includes creating the group and then specifying the metrics to measure its performance, along with the traffic allocated to it. The following sections provide step-by-step instructions for each part of this process.
Create a Holdout Group
Creating a holdout group involves defining the environment and selecting relevant feature flags. Create the holdout group by following these steps:
- Log in to your VWO account.
- Go to Feature Experimentation > Holdouts.
- On the Holdout groups listing page, click Create a holdout group. If no holdouts exist, you can also click Click here to create your first holdout.
- From the Environment dropdown menu, select the environment (such as Prod or Staging) for the holdout.
- In the Add Flags section, either select the specific feature flags you want to include in the holdout group, or switch on the Make this holdout global toggle to automatically include all feature flags created after the holdout is enabled.
- Click Add description, and enter a brief summary of the holdout group’s purpose or scope for future reference.
- Click Save Now.
After clicking Save Now, you are returned to the Holdout groups listing page. At this stage, your holdout group is configured but not yet collecting data. Next, you must establish the success criteria to measure long-term impact. Proceed to Define Metrics and Traffic for the Holdout Group to configure your tracking and allocation settings.
Define Metrics and Traffic for the Holdout Group
After configuring the flags, define how success is measured and how much traffic is allocated to the holdout.
Complete this setup by following these steps:
- In the Metrics tab, click Add to define your Primary metric. This metric represents the core business outcome you want to evaluate, such as activation, revenue, or retention. This analysis measures the overall impact of your feature releases over time. By comparing this metric between holdout users and those exposed to your product changes, you can determine whether your roadmap is driving meaningful improvements.
- Add Secondary metrics to gain additional insights into the performance of the holdout.
- Go to the Audience and Traffic section.
- Under Traffic Allocation, use the slider to set the percentage of targeted users to hold back.
  Note: While holdout groups are scientifically valuable, VWO limits the allocation to 10% to ensure the vast majority of your users still receive your latest product improvements. Often, 1-2% is sufficient for long-term data collection.
- Click Save Now to finalize the setup.
- After you have defined your metrics and traffic allocation, click Start Holdout, then confirm your selection in the pop-up to launch the group and begin data collection.
Once the group is active, the VWO SDK manages user assignment automatically through a deterministic bucketing process.
How Does VWO Assign Users to Holdout Groups
In VWO Feature Experimentation, users are assigned to a holdout group through a process called bucketing, which occurs at the SDK level. This ensures that a user’s experience remains consistent across different sessions and devices.
The assignment follows these technical principles:
- Unique User Identifier (UUID)
  The SDK uses a unique identifier (such as userId) that you provide in your code to determine whether the user should be included in the holdout group.
- Deterministic Hashing
  VWO applies a deterministic hashing algorithm (MurmurHash) to the user ID to generate a numeric value. If that value falls within your defined traffic allocation, for example, the lowest 5% for a 5% holdout, the user is assigned to the holdout group. Because the hashing is deterministic, as long as the userId remains the same, the user is consistently assigned to the same cohort, ensuring a stable and predictable experience across sessions.
- Traffic Allocation Logic
  Assignment happens at the point of the SDK's getFlag call:
  - Evaluation: The SDK first checks if a holdout group is defined for the environment.
  - Exclusion: If the user's ID falls within the holdout bucket, the SDK automatically returns the default value for all variables associated with that holdout.
  - Exposure: If the user is not in the holdout bucket, they proceed to the standard evaluation rules (Targeting and Traffic Split) for the specific feature flag.
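The principles above can be sketched in a few lines of Python. This is a minimal illustration, not the VWO SDK's implementation: SHA-256 from the standard library stands in for MurmurHash, the 0 to 99.99 bucket scale is an assumption, and all function names (bucket_value, in_holdout, get_variable) are hypothetical.

```python
import hashlib


def bucket_value(user_id: str) -> float:
    """Map a user ID to a stable number between 0.00 and 99.99."""
    # SHA-256 is a stand-in here; the article says VWO uses MurmurHash.
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    as_int = int.from_bytes(digest[:8], "big")
    return (as_int % 10_000) / 100


def in_holdout(user_id: str, holdout_percent: float) -> bool:
    """A user is held out when their bucket falls in the lowest N percent."""
    return bucket_value(user_id) < holdout_percent


def get_variable(user_id, holdout_percent, default_value, evaluate_rules):
    """Return the default for held-out users, otherwise evaluate the flag rules."""
    if in_holdout(user_id, holdout_percent):
        return default_value          # Exclusion: the holdout sees the baseline
    return evaluate_rules(user_id)    # Exposure: normal targeting and traffic split


# The same ID always gives the same decision, in any session or process.
print(in_holdout("user-42", 5) == in_holdout("user-42", 5))
```

Because the decision depends only on the ID and the allocation percentage, no lookup table is needed to keep a user in the same cohort, which is the property the article relies on for consistency across sessions.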
Note:
- If you change the percentage of a holdout group while it is live, some users may be reshuffled into or out of the group. It is recommended to determine your allocation percentage before the measurement period begins.
- If the holdout configuration must be modified mid-way and a consistent user experience needs to be preserved, consider using a storage service to persist user assignments. This ensures users continue to see the same experience even if allocation settings change.
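The idea of persisting assignments can be sketched as follows. This is a generic illustration under stated assumptions, not the storage interface of any VWO SDK: the class InMemoryAssignmentStore, the function sticky_holdout_decision, and the compute_decision callback are all hypothetical names, and a real integration would back the store with a database or cache.

```python
class InMemoryAssignmentStore:
    """Hypothetical store that remembers each user's first holdout decision."""

    def __init__(self):
        self._decisions = {}

    def get(self, user_id):
        return self._decisions.get(user_id)

    def set(self, user_id, in_holdout):
        self._decisions[user_id] = in_holdout


def sticky_holdout_decision(user_id, store, compute_decision):
    """Reuse a stored decision if one exists; otherwise compute and save it."""
    stored = store.get(user_id)
    if stored is not None:
        return stored
    decision = compute_decision(user_id)
    store.set(user_id, decision)
    return decision


store = InMemoryAssignmentStore()
# First evaluation: the current configuration places the user in the holdout.
first = sticky_holdout_decision("user-42", store, lambda uid: True)
# Later evaluation: a changed allocation would now exclude the user,
# but the stored decision is returned instead.
second = sticky_holdout_decision("user-42", store, lambda uid: False)
print(first, second)
```

The trade-off is that stored decisions no longer track the configured percentage exactly, so the effective holdout size can drift from the slider value after a change.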
Verify the Holdout Setup
To ensure your holdout group is functioning correctly:
- Navigate to Feature Experimentation > Holdouts.
- Select your holdout group and open the Reports tab.
- Confirm that traffic is being split between Holdout (Baseline) and Not in Holdout.
- Verify that metrics are being recorded for both cohorts.
Verify the Impact of Holdout Groups
Once your holdout is live and collecting data, you can analyze the results to determine if your product is actually better or just different.
- Go to the Feature Experimentation > Holdouts listing page.
- Select your active holdout and navigate to the Reports tab.
- Review the data split between Holdout (Baseline) and Not in Holdout.
- Check the Unique Conversions, Expected Conversion Rate, and Expected Improvement for your primary metric.
Troubleshooting
| Issue | Possible Cause | Recommended Solution |
| Users in the holdout group are seeing new features. | The SDK is outdated, or the environment is mismatched. | Ensure you select the correct environment during setup, and verify your VWO FE SDK is updated to the latest version. |
| Cannot allocate more than 10% traffic to a holdout. | This is a system-defined guardrail. | VWO limits holdouts to 10% to prioritize user experience. Use a smaller percentage (1-5%) for long-term tracking. |
| New feature flags are not being respected by the holdout. | The holdout was not set to Global. | If the holdout is selective, you must manually add new flags to it. Use the Global setting to automatically include all future flags. |
| Two global holdouts show different metric results. | Creation timing difference. | Global holdouts apply only to feature flags created after the holdout began. If you have two global holdouts started at different times, their reports will not match because they track different sets of flags. |
| User experience changed after a holdout configuration update. | Dynamic re-bucketing. | If you change holdout settings, such as traffic percentage, the bucket decision for a user may change. To maintain a permanent decision for a user regardless of config changes, implement a User Storage Service in your SDK integration. For example, to implement a Node SDK Storage Service, refer to FE Node Storage. |
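The timing behavior of global holdouts described in the table can be illustrated with a small sketch. The flag names and dates are invented for the example, and flags_covered_by_global_holdout is a hypothetical helper that only mirrors the rule stated above (a global holdout covers flags created after it began); it does not call any VWO API.

```python
from datetime import datetime


def flags_covered_by_global_holdout(holdout_started_at, flag_created_at):
    """Return the names of flags created after the holdout started."""
    return sorted(
        name
        for name, created_at in flag_created_at.items()
        if created_at > holdout_started_at
    )


# Hypothetical flag creation dates.
flags = {
    "guest_checkout": datetime(2024, 1, 10),
    "bought_together": datetime(2024, 2, 15),
    "search_as_you_type": datetime(2024, 3, 20),
}

# Two global holdouts started at different times cover different flag sets,
# which is why their reports are not expected to match.
holdout_a = flags_covered_by_global_holdout(datetime(2024, 1, 1), flags)
holdout_b = flags_covered_by_global_holdout(datetime(2024, 3, 1), flags)
print(holdout_a)
print(holdout_b)
```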
FAQs
- Does a holdout group affect my A/B tests?
  Yes. Users in a holdout group consistently see the default experience for any feature flag associated with that holdout, regardless of active experiments (A/B tests, rollouts, or multivariate tests) linked to that flag. Holdouts are applied at the feature flag level, not at the individual rule level. This means if a feature flag is part of a holdout group, all rules under that flag, including testing, rollout, or MVT rules, respect the holdout exclusion.
- Can I run multiple holdouts at once?
  Yes, you can have multiple holdouts for different environments or for specific features.
- What happens to a user added to a holdout group if I delete the holdout group?
  The user is released from the holdout group and becomes eligible to see personalized experiences or new features based on the active feature flag rules.
- How does bucketing work if a user is part of a holdout that covers multiple different feature flags?
  VWO uses deterministic bucketing at the SDK level. Once a user's unique identifier, for example, userId, falls into a holdout group’s bucket, that assignment persists for that user across feature flags.
- What happens if a single feature flag is associated with multiple holdout groups?
  VWO evaluates each holdout independently for the user. A user may be bucketed into one holdout and excluded from another based on the configurations for each holdout. Each holdout report independently represents its respective data.
Need more help?
For further assistance and more information, contact VWO Support.