Free CRO Tool

A/B Test Duration Calculator

Estimate how long to run your A/B test before you launch it. Adjust traffic, conversion rate, MDE, variant count, significance, and power to see how long a reliable result is likely to take.

Calculator Inputs

Results update instantly. MDE is treated as a relative lift over your current conversion rate.

MDE: 10% (slider range 1% to 100%)

Baseline rate: 5.00%

Target variant rate: 5.50%

Absolute lift to detect: 0.50 percentage points

Estimated Duration

13 days

Calendar time needed to hit the required overall visitor target at your current traffic allocation.

Visitors Per Variant

31,243

Required sample size for each arm of the test using the two-proportion formula.

Visitors Overall

62,486

Total visitors needed across all variants (2 in this example).

Daily Visitors In Test

5,000

This is your effective daily volume after traffic allocation is applied.

Projected End Date

--

Calculated from today using your local date.

Timeline

Today → Projected end date
Duration: 13 days (scale capped at 120 days)

Formula Used

n = (Z_alpha/2 · √(2 · p_avg · (1 − p_avg)) + Z_beta · √(p1 · (1 − p1) + p2 · (1 − p2)))² / (p2 − p1)²

where p_avg = (p1 + p2) / 2, p1 is the baseline conversion rate, and p2 is the target variant rate.

Z values: 90% = 1.645, 95% = 1.96, 99% = 2.576. Power values: 80% = 0.842, 90% = 1.282.
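To make the formula concrete, here is a minimal Python sketch of the same calculation. The function name and structure are illustrative rather than the calculator's actual source; the z-value tables mirror the constants listed above, and the final rounding may shift the result by a visitor.

```python
import math

# Two-sided z for the significance level and one-sided z for power,
# mirroring the constants listed above.
Z_ALPHA = {0.90: 1.645, 0.95: 1.96, 0.99: 2.576}
Z_BETA = {0.80: 0.842, 0.90: 1.282}

def sample_size_per_variant(p1, relative_mde, confidence=0.95, power=0.80):
    """Visitors needed per arm, using the two-proportion formula above."""
    p2 = p1 * (1 + relative_mde)      # target variant rate
    p_avg = (p1 + p2) / 2
    z_a, z_b = Z_ALPHA[confidence], Z_BETA[power]
    numerator = (z_a * math.sqrt(2 * p_avg * (1 - p_avg))
                 + z_b * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(numerator / (p2 - p1) ** 2)

# The example above: 5.00% baseline, 10% relative MDE, 95% / 80% defaults.
print(sample_size_per_variant(0.05, 0.10))  # 31,243 per variant
```

Doubling the per-variant result reproduces the 62,486 overall total shown in the results above.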

How to plan A/B test duration without guessing

A/B test duration planning is simple in concept and easy to get wrong in practice. Most teams can estimate conversion rate and traffic, but they still launch tests with no realistic sense of how long the experiment needs to run. That creates a familiar pattern: someone checks the dashboard after three days, sees a noisy lift, and starts talking about winners before the test has earned enough data. A duration calculator fixes that problem at the planning stage. Instead of reacting to the early chart, you begin with a traffic target, a minimum detectable effect, and a realistic finish date.

This calculator works backward from the question that matters most: how much evidence do you need before making a decision? The answer depends on your current conversion rate, the size of the lift worth detecting, the confidence threshold you want to use, and the amount of traffic you can actually send into the test. Those inputs define the sample size per variant. Once you know that sample size, estimating duration is just a traffic problem. More eligible visitors means faster decisions. Lower traffic or stricter settings mean longer runtimes.
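In code, that traffic step is nothing more than a division and a round-up. A minimal sketch, assuming the example figures shown in the results above:

```python
import math

visitors_per_variant = 31_243    # example figure from the calculator above
variants = 2
daily_visitors_in_test = 5_000   # eligible visitors actually entering the test

total_needed = visitors_per_variant * variants           # 62,486 overall
days = math.ceil(total_needed / daily_visitors_in_test)
print(days)  # 13 days, matching the estimate shown above
```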

Why rushing tests leads to bad decisions

Rushing an A/B test is one of the fastest ways to ship the wrong change with false confidence. Early test results are usually volatile because each conversion has an outsized impact when the sample is still small. A variant can look brilliant after 40 visitors and ordinary after 4,000. That swing is not unusual. It is what randomness looks like before enough observations accumulate. If you stop the test the moment one line jumps ahead, you are not being data-driven. You are simply choosing the moment that made the dashboard look most exciting.

Ending a test too soon also creates organizational problems. Teams lose trust in experimentation when the so-called winners fail to hold up after launch. Stakeholders start to think testing is unreliable when the real issue is that the process lacked statistical discipline. A planned duration gives your team a shared expectation up front. It tells everyone that this test needs two weeks, not two days, and that the finish line is tied to evidence rather than impatience.

Good test planning also helps with seasonality. If you know a test needs 24 days to finish, you can decide whether it belongs in the current campaign window or whether it should wait until traffic patterns are more stable. That is much better than launching something right before a promotion, a holiday, or a pricing change and then trying to interpret the result afterward.

How traffic, baseline rate, and variant count affect duration

Traffic volume is the clearest lever. If your site can send 5,000 eligible visitors into a test each day, you can finish much faster than a page that only gets 300. But raw traffic is not the whole story. The percentage of traffic allocated to the experiment matters too. If you only send half your visitors into the test because the rest are excluded, your effective daily test traffic is cut in half and the runtime grows immediately.
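A short sketch makes the allocation penalty concrete. The site-wide figure of 10,000 daily visitors is hypothetical, and the visitor target is the example total from above:

```python
import math

total_needed = 62_486          # overall visitor target from the example above
site_daily_visitors = 10_000   # hypothetical site-wide daily traffic

for allocation in (1.0, 0.5, 0.25):   # share of traffic routed into the test
    effective_daily = site_daily_visitors * allocation
    days = math.ceil(total_needed / effective_daily)
    print(f"{allocation:.0%} allocation -> {days} days")
# 100% -> 7 days, 50% -> 13 days, 25% -> 25 days
```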

Baseline conversion rate changes the math as well. A page converting at 12% often needs fewer visitors than a page converting at 1.2% when you are looking for the same relative lift. That is because rare events are harder to measure with confidence. Minimum detectable effect matters for the same reason. Detecting a bold 20% relative lift is much easier than proving a subtle 3% gain. If the smallest difference you care about is tiny, expect the required sample size to increase fast.
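Both effects are easy to verify with the same two-proportion formula. The sketch below assumes the default 95% significance and 80% power; the exact outputs depend on rounding, so treat the comments as approximate.

```python
import math

def n_per_arm(p1, rel_mde, z_a=1.96, z_b=0.842):
    """Per-arm sample size at 95% significance / 80% power."""
    p2 = p1 * (1 + rel_mde)
    p_avg = (p1 + p2) / 2
    return math.ceil((z_a * math.sqrt(2 * p_avg * (1 - p_avg))
                      + z_b * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
                     / (p2 - p1) ** 2)

print(n_per_arm(0.12,  0.10))  # 12% baseline, 10% lift: ~12,000 per arm
print(n_per_arm(0.012, 0.10))  # 1.2% baseline, same lift: ~136,000 per arm
print(n_per_arm(0.05,  0.20))  # bold 20% lift at 5%: ~8,200 per arm
print(n_per_arm(0.05,  0.03))  # subtle 3% lift at 5%: ~336,000 per arm
```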

Variant count adds another practical constraint. When you move from a standard two-variant test to three or four variants, each arm still needs enough visitors to stand on its own. That means total visitor demand rises with every extra variant. More ideas in one experiment might feel efficient, but the calendar cost can be significant if your traffic is limited. Many low-traffic teams are better off running fewer, stronger variants rather than stretching traffic across too many arms at once.
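Under the simple model this page describes, where each arm needs the same per-variant sample, total demand and calendar time scale linearly with variant count. An illustrative sketch using the example figures:

```python
import math

visitors_per_variant = 31_243   # per-arm requirement from the example above
daily_visitors_in_test = 5_000

for variants in (2, 3, 4):
    total = visitors_per_variant * variants
    days = math.ceil(total / daily_visitors_in_test)
    print(f"{variants} variants: {total:,} visitors, {days} days")
# 2 variants: 62,486 visitors, 13 days
# 3 variants: 93,729 visitors, 19 days
# 4 variants: 124,972 visitors, 25 days
```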

Practical ways to reduce test time

The first way to reduce test duration is to test higher-impact ideas. Bigger expected lifts need fewer visitors to detect. That does not mean making random dramatic changes. It means focusing on changes that plausibly affect user intent, clarity, trust, or friction: headline rewrites, offer positioning, pricing layout, form length, or CTA hierarchy. Tiny visual tweaks can still matter, but they are usually slower and more expensive to validate.

The second lever is page selection. If one landing page gets ten times the qualified traffic of another, start there. Experimentation velocity compounds. Faster tests mean faster learning, which means more wins per quarter. You can always bring the same hypothesis to a lower-traffic page later once you have evidence from the page that can reach significance quickly.

You can also reduce duration by narrowing eligibility wisely. That sounds backward, but it matters if your current traffic includes large volumes of users who never had a chance to convert. Cleaner test traffic can improve the signal. Finally, be realistic with significance and power. Higher confidence standards are often worth it for important decisions, but every stricter setting increases the sample size requirement. Match the rigor to the business risk instead of applying maximum strictness to every homepage button test.
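The price of extra rigor is easy to quantify with the same formula. An illustrative comparison at the example's 5% baseline and 10% MDE, using the z values listed in the Formula Used section:

```python
import math

def n_per_arm(p1, rel_mde, z_a, z_b):
    p2 = p1 * (1 + rel_mde)
    p_avg = (p1 + p2) / 2
    return math.ceil((z_a * math.sqrt(2 * p_avg * (1 - p_avg))
                      + z_b * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
                     / (p2 - p1) ** 2)

# (label, z_alpha/2, z_beta) for three rigor levels at 5% baseline, 10% MDE
for label, z_a, z_b in [("90% conf / 80% power", 1.645, 0.842),
                        ("95% conf / 80% power", 1.96,  0.842),
                        ("99% conf / 90% power", 2.576, 1.282)]:
    print(label, n_per_arm(0.05, 0.10, z_a, z_b))
# ~24,600 -> ~31,200 -> ~59,200 per arm: maximum rigor roughly doubles the bill
```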

Use duration planning before every launch

The best time to estimate test duration is before you build the variant, not after the test starts. A quick planning pass forces better choices about scope, traffic allocation, and expectations. It tells you whether the experiment fits the next sprint, whether it is realistic for the traffic you have, and whether the lift you want to detect is ambitious enough to justify the effort. That is why disciplined growth teams treat duration planning as part of the experiment brief, not as an optional analytics step.

If the estimate is too long, that is still useful information. It may mean you should test a bolder change, move to a higher-traffic page, simplify the number of variants, or wait until you can route more qualified traffic into the experiment. What matters is making that call early, before people start reading too much into noisy early results.

Move from planning to launch

Once you know how long a test should run, the next step is getting it live without turning the setup into a project of its own.