Power Analysis
1. Conduct power analysis
This step is about calculating the required sample size for an A/B test.
The goal is to make sure the experiment is:
Large enough to detect a real effect,
But not unnecessarily long or expensive.
2. The four key inputs of power analysis
2.1. Effect size
The minimum change you want to be able to detect, for example:
conversion rate increasing from 5% to 5.5% (a 0.5 percentage point absolute lift, or 10% relative).
It should be practically meaningful for the business, not just statistically significant.
Usually defined based on input from product managers and stakeholders.
2.2. Power
The probability of detecting a true effect if it actually exists.
Typically set to 0.8 (80%).
This means: if there is a real effect, you have an 80% chance of detecting it.
2.3. Significance level (alpha)
The probability of a false positive, i.e. concluding there is an effect when there is none.
Typically set to 0.05 (5%).
2.4. Variance
The variability of the metric being measured.
Usually estimated from historical data.
The higher the variance, the larger the required sample size.
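To show how the four inputs combine, here is a minimal sketch for a conversion-rate metric. It assumes a two-sided two-proportion z-test with equal group sizes and uses the common normal-approximation formula; the function name sample_size_per_group is hypothetical, and for a conversion rate the variance is taken as p(1 - p) rather than estimated from historical data.

```python
import math
from statistics import NormalDist


def sample_size_per_group(p_baseline, p_target, alpha=0.05, power=0.8):
    """Approximate users needed per group (normal approximation).

    Assumes a two-sided two-proportion z-test with equal group sizes.
    """
    # Critical value for the significance level (two-sided).
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    # Quantile corresponding to the desired power.
    z_power = NormalDist().inv_cdf(power)
    # Variance of a binary metric is p * (1 - p), summed over both groups.
    variance_sum = p_baseline * (1 - p_baseline) + p_target * (1 - p_target)
    # Effect size: the absolute difference we want to detect.
    effect = p_target - p_baseline
    return math.ceil((z_alpha + z_power) ** 2 * variance_sum / effect ** 2)


# The example from 2.1: detect a lift from 5% to 5.5%.
n = sample_size_per_group(0.05, 0.055)
print(n)
```

With the defaults (alpha 0.05, power 0.8) this comes out at roughly 31,000 users per group. Doubling the detectable lift to a full percentage point cuts the requirement to about a quarter, because the effect size enters the formula squared.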
3. Output of power analysis
The main output is the required sample size (usually stated per variant) to detect the Minimum Detectable Effect (MDE), i.e. the effect size chosen in 2.1.
From the sample size and the current user traffic, you can estimate how long the experiment needs to run.
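The conversion from sample size to duration can be sketched as below. The function name and the traffic figures are hypothetical, and the sketch assumes every day brings the same number of new, previously unseen users, which real traffic rarely does.

```python
import math


def experiment_duration_days(n_per_group, daily_users, allocation=1.0, n_groups=2):
    """Days needed to collect n_per_group users in each of n_groups groups.

    allocation is the fraction of daily traffic entered into the experiment.
    Assumes constant traffic and no returning users (a simplification).
    """
    if daily_users <= 0 or not 0 < allocation <= 1:
        raise ValueError("daily_users must be positive and allocation in (0, 1]")
    users_per_day = daily_users * allocation
    return math.ceil(n_per_group * n_groups / users_per_day)


# Illustrative numbers: 30,000 users per group, 10,000 new users per day,
# half of the traffic allocated to the experiment.
days = experiment_duration_days(30_000, 10_000, allocation=0.5)
print(days)
```

Treat the result as a lower bound: returning users and traffic dips both stretch the real duration.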
4. Other factors affecting experiment duration
Seasonality: user behavior may differ between weekdays and weekends.
Metric variability over time: daily fluctuations can affect stability.
App adoption: if users must update the app, the update rate affects how fast you collect samples.
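A rough way to fold two of these factors into a duration estimate is sketched below. The function name is hypothetical, and it makes two loud simplifications: the app adoption rate is treated as a constant fraction of users (in practice it grows over time), and seasonality is handled only by rounding up to whole weeks so that every weekday is covered equally.

```python
import math


def adjusted_duration_days(raw_days, adoption_rate=1.0, full_weeks=True):
    """Stretch a raw duration estimate for app adoption and weekly cycles.

    raw_days: duration estimated from sample size and traffic alone.
    adoption_rate: assumed constant fraction of users on the required app version.
    full_weeks: round up to a multiple of 7 days to cover weekday/weekend cycles.
    """
    if not 0 < adoption_rate <= 1:
        raise ValueError("adoption_rate must be in (0, 1]")
    # Fewer eligible users per day means proportionally more days.
    days = math.ceil(raw_days / adoption_rate)
    if full_weeks:
        days = math.ceil(days / 7) * 7
    return days


print(adjusted_duration_days(13))                     # 14
print(adjusted_duration_days(13, adoption_rate=0.5))  # 28
```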
5. What is a ramp-up strategy?
A ramp-up strategy means gradually increasing the percentage of users exposed to the experiment.
Purpose:
Reduce risk if there are bugs or negative impacts,
Detect issues early before affecting a large population.
6. Example ramp-up strategy
Day 1: expose 1% of users
Day 2: if metrics look normal, increase to 10%
Day 3: if still stable, increase to 50% to reach the required sample size
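Ramping up slows data collection in the first days, which the following sketch makes concrete. It is a simplified simulation with hypothetical names and numbers: it assumes every health check passes, that each day brings the same number of new users, and that users exposed during the ramp count toward the required sample.

```python
def days_to_reach_sample(total_needed, daily_users, ramp=(0.01, 0.10, 0.50)):
    """Count days until cumulative exposed users reach total_needed.

    ramp gives the exposure fraction for each of the first days;
    the last fraction is held for all remaining days.
    """
    if daily_users <= 0 or not ramp or ramp[-1] <= 0:
        raise ValueError("daily_users and the final ramp fraction must be positive")
    exposed = 0.0
    day = 0
    while exposed < total_needed:
        # Use the scheduled fraction, or the final one once the ramp is done.
        fraction = ramp[min(day, len(ramp) - 1)]
        exposed += daily_users * fraction
        day += 1
    return day


# Illustrative numbers: 60,000 users needed in total, 10,000 new users per day.
with_ramp = days_to_reach_sample(60_000, 10_000)
without_ramp = days_to_reach_sample(60_000, 10_000, ramp=(0.50,))
print(with_ramp, without_ramp)  # 14 12
```

In this example the 1% and 10% days cost two extra days compared with starting at 50% immediately.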
7. When is ramp-up not necessary?
For small, low-risk changes, it may be acceptable to start directly at the full traffic allocation needed to reach the required sample size.
Ramp-up is recommended, but not mandatory.