Measured Incrementality Testing: Are Your Ads Working?
Measured incrementality testing is the practice of isolating the causal effect of a marketing activity by comparing outcomes between an exposed group and a holdout group that did not see the activity. It answers one question that most marketing dashboards cannot: would those sales have happened anyway, without the ad?
Most attribution models tell you who converted after seeing an ad. Incrementality testing tells you whether the ad caused the conversion. That is a different question entirely, and the gap between the two numbers is often where marketing budgets go to die quietly.
Key Takeaways
- Incrementality testing measures causal lift, not correlation. Last-click attribution and even multi-touch models routinely take credit for sales that would have happened regardless of ad exposure.
- A holdout group is non-negotiable. Without a clean control, you are measuring coincidence, not impact.
- Ghost ads and geo-based holdouts are the two most practical test designs for most marketing teams. Each has real trade-offs worth understanding before you commit.
- Incrementality results rarely flatter every channel equally. Branded search and retargeting consistently over-claim. Upper-funnel channels are often under-credited.
- The goal is not perfect measurement. It is honest approximation that gives you enough signal to make better budget decisions than you would make without it.
In This Article
- Why Attribution Models Are Not Enough
- How Does an Incrementality Test Actually Work?
- What Incrementality Testing Reveals That Attribution Hides
- Designing a Test That Gives You Usable Results
- How to Interpret Incrementality Results Without Overreacting
- When Incrementality Testing Makes Sense and When It Does Not
- Building Incrementality Into Your Measurement Routine
Why Attribution Models Are Not Enough
When I was managing paid search at scale, the reporting always looked good. Click-through rates were healthy, conversion tracking was firing, ROAS was above target. The dashboards told a clean story. The problem was that the dashboards were measuring activity, not causality. They were counting the number of people who saw an ad and then bought something. They were not measuring whether the ad had anything to do with the purchase.
This is the foundational flaw in most performance marketing measurement. Conversion tracking tells you a conversion happened. It does not tell you whether your marketing caused it. A customer who had already decided to buy and then clicked a branded search ad before checkout is not evidence that the ad worked. It is evidence that the ad was present at the moment of purchase. Those are not the same thing.
Multi-touch attribution models try to distribute credit more fairly across the customer journey. Some do a reasonable job. But they are still working from observed behaviour, not from a controlled experiment. They can tell you which touchpoints were present in converting journeys. They cannot tell you which touchpoints changed the outcome.
Incrementality testing is the only method that gets close to answering the causal question. It is not perfect. No measurement approach in marketing is. But it is structurally more honest than any attribution model, because it requires you to deliberately withhold your marketing from a group of people and measure what happens to them. That discipline is what makes it valuable.
If you want to go deeper on how measurement fits into a broader analytics framework, the Marketing Analytics hub covers the full picture, from GA4 setup to attribution modelling to reporting that actually informs decisions.
How Does an Incrementality Test Actually Work?
The mechanics are straightforward in principle and genuinely difficult in practice.
You split your audience into two groups: a test group that receives the marketing activity as normal, and a holdout group that does not. After a defined test period, you compare the conversion rate, revenue, or whatever outcome metric you care about between the two groups. The difference between them is your incremental lift. That lift, expressed as a percentage of the test group’s performance, tells you how much of your marketing effect is genuinely causal rather than coincidental.
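To make the arithmetic concrete, here is a rough sketch in Python with made-up numbers. The group sizes, conversion counts, and the use of a simple two-proportion z-test are illustrative assumptions rather than a prescribed analysis; a real read-out often needs adjustments for things like unequal exposure or clustered audiences.

```python
# Illustrative incrementality read-out with hypothetical numbers.
from statsmodels.stats.proportion import proportions_ztest

test_users, test_conversions = 90_000, 1_980        # exposed group
holdout_users, holdout_conversions = 10_000, 190    # holdout group

test_rate = test_conversions / test_users            # 2.20%
holdout_rate = holdout_conversions / holdout_users   # 1.90%

absolute_lift = test_rate - holdout_rate             # extra conversions per user
relative_lift = absolute_lift / holdout_rate         # lift over the no-ad baseline
incremental_share = absolute_lift / test_rate        # share of test-group conversions that are causal

# Two-proportion z-test: is the gap plausibly more than noise?
stat, p_value = proportions_ztest(
    [test_conversions, holdout_conversions],
    [test_users, holdout_users],
)

print(f"Relative lift: {relative_lift:.1%}")
print(f"Incremental share of test-group conversions: {incremental_share:.1%}")
print(f"p-value: {p_value:.3f}")
```

In this hypothetical, roughly 14 percent of the test group's conversions are incremental; the rest would likely have happened anyway, even though an attribution report would claim all of them.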
The hard part is maintaining a clean holdout. In digital advertising, this typically means one of two approaches.
The first is a ghost ad or PSA holdout. In a PSA test, you serve the holdout group a public service announcement or a placeholder ad in the same inventory where your real ad would have appeared. In a ghost ad test, the holdout sees nothing at all, but the platform records which users would have been served your ad. Either way, the holdout group had the same opportunity to be reached but was shown something neutral, or nothing, instead, which controls for the fact that some inventory is inherently higher-converting than others. Both variants require platform support and are most practical on Meta, where the feature exists natively.
The second is a geo-based holdout. You identify matched geographic markets, assign some to test and some to control, run your activity in the test markets, and suppress it in the control markets. You then compare performance across markets, adjusting for any baseline differences. This approach works across channels and does not require platform-level holdout features, but it introduces geographic confounds: local events, weather, competitor activity, and regional seasonality can all contaminate the results if your markets are not well matched.
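For the geo approach, the read-out usually compares growth between the test period and a matched pre-period, so each market acts as its own baseline. The sketch below is a minimal, hypothetical version of that calculation; the market names, sales figures, and the simple average across markets are all assumptions, and a production analysis would typically weight markets and use a proper matched-market or synthetic-control method.

```python
# Minimal geo holdout read-out: compare test-period growth over a pre-period
# in test markets versus control markets. All figures are hypothetical.
import pandas as pd

geo = pd.DataFrame({
    "market":     ["North", "Midlands", "South", "East"],
    "group":      ["test", "test", "control", "control"],
    "pre_sales":  [120_000, 95_000, 110_000, 98_000],   # matched pre-period
    "test_sales": [138_000, 104_000, 112_000, 99_500],  # test period
})

geo["growth"] = geo["test_sales"] / geo["pre_sales"] - 1

test_growth = geo.loc[geo["group"] == "test", "growth"].mean()
control_growth = geo.loc[geo["group"] == "control", "growth"].mean()

# Control markets absorb seasonality and baseline drift; the remaining gap
# in growth rates is the estimate of incremental lift from the activity.
print(f"Test-market growth:         {test_growth:.1%}")
print(f"Control-market growth:      {control_growth:.1%}")
print(f"Estimated incremental lift: {test_growth - control_growth:.1%}")
```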
Both approaches require statistical rigour. Your test needs to run long enough to accumulate sufficient conversions in both groups. The minimum detectable effect you are trying to measure needs to be realistic relative to your sample size. And you need to resist the temptation to call the test early when early results look good, because early results are almost always noisy.
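A quick power calculation before launch tells you whether the test is even capable of detecting the lift you care about. The sketch below uses statsmodels with assumed figures for the baseline conversion rate and the smallest lift worth detecting; swap in your own numbers, and treat the 80 percent power and 5 percent significance settings as conventions rather than rules.

```python
# Rough pre-test power check: how many users per group are needed to detect
# a given relative lift in conversion rate? All inputs are assumptions.
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

baseline_rate = 0.020              # conversion rate you expect in the holdout
minimum_lift = 0.15                # smallest relative lift worth detecting
treated_rate = baseline_rate * (1 + minimum_lift)

effect_size = proportion_effectsize(treated_rate, baseline_rate)  # Cohen's h

users_per_group = NormalIndPower().solve_power(
    effect_size=effect_size,
    alpha=0.05,       # false-positive tolerance
    power=0.80,       # chance of detecting the lift if it is real
    ratio=1.0,        # equal-sized groups
)
print(f"Roughly {users_per_group:,.0f} users needed in each group")
```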
What Incrementality Testing Reveals That Attribution Hides
I have seen incrementality results that were uncomfortable to present. That is, frankly, the point of running them.
Branded search is the most consistent offender. When you test whether branded paid search drives incremental conversions over and above organic, the results are frequently humbling. A significant proportion of the people clicking your branded paid ads would have found you through organic search anyway. You are paying to intercept your own customers. The attribution model shows a healthy ROAS. The incrementality test shows you are largely buying conversions you already owned.
Retargeting has the same problem, often worse. Retargeting audiences are, by definition, people who have already shown intent. They visited your site, put something in their basket, spent time on a product page. These are high-intent users who have a naturally elevated conversion rate. When you retarget them and they convert, the attribution model credits the retargeting ad. The incrementality test asks whether they would have converted without it. For a meaningful portion of retargeting audiences, the honest answer is yes.
Upper-funnel channels tend to show the reverse pattern. Display, video, and awareness-led social activity often look weak in last-click or even multi-touch attribution. But when you run incrementality tests that capture downstream conversion behaviour over a longer window, you frequently find genuine lift that the attribution model was not crediting. The channel was doing real work. It just was not the last touchpoint, so it was invisible in the standard report.
This is one of the reasons I have always been cautious about optimising too aggressively on attributed ROAS. It tends to concentrate spend in channels that are good at claiming credit, not necessarily in channels that are good at creating demand. Incrementality testing gives you a more honest signal to work from.
Designing a Test That Gives You Usable Results
The difference between an incrementality test that changes how you allocate budget and one that produces inconclusive results usually comes down to design decisions made before the test starts.
Start with a clear hypothesis. You are not running an incrementality test to find out whether marketing works in general. You are testing a specific channel, campaign type, or audience segment. “Does our Facebook prospecting campaign drive incremental purchases among users who have never previously visited our site?” is a testable hypothesis. “Does our marketing work?” is not.
Define your primary metric before the test begins. Revenue per user, conversion rate, and average order value will often tell different stories. Decide which one you care about most and commit to it. Switching metrics after you see the results is how you end up fooling yourself.
Size your holdout appropriately. A holdout that is too small will leave you with insufficient statistical power to detect a real effect. A holdout that is too large means you are deliberately suppressing revenue during the test period, which has a real cost. For most campaigns, a holdout of 10 to 20 percent of the target audience is a reasonable starting point, though the right number depends on your conversion volume and the effect size you are trying to detect.
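The trade-off can be made concrete with the same kind of power arithmetic. The sketch below asks, for a hypothetical audience and baseline conversion rate, what the smallest reliably detectable lift is at a 10 percent versus a 20 percent holdout; every input is an assumption to replace with your own figures.

```python
# Holdout-size trade-off: smallest relative lift detectable at 80% power for
# different holdout shares, using a normal-approximation formula. Audience
# size and baseline conversion rate are hypothetical.
from math import asin, sin, sqrt
from scipy.stats import norm

audience = 500_000
baseline_rate = 0.02
z_alpha = norm.ppf(0.975)   # two-sided 5% significance
z_power = norm.ppf(0.80)    # 80% power

for holdout_share in (0.10, 0.20):
    holdout_n = audience * holdout_share
    test_n = audience - holdout_n
    # Smallest detectable effect size (Cohen's h) for these group sizes,
    # converted back into a relative lift on the baseline rate.
    h = (z_alpha + z_power) * sqrt(1 / holdout_n + 1 / test_n)
    detectable_rate = sin(asin(sqrt(baseline_rate)) + h / 2) ** 2
    relative_lift = detectable_rate / baseline_rate - 1
    print(f"{holdout_share:.0%} holdout: detectable lift of roughly {relative_lift:.1%}")
```

On these assumed numbers, the larger holdout can detect a noticeably smaller lift, which is exactly the tension described above: more statistical sensitivity in exchange for more suppressed activity.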
Agree on a test duration before you start. Conversion cycles vary enormously by category. A subscription software product with a 30-day free trial needs a longer test window than an e-commerce brand with a two-day consideration cycle. As a rough guide, you want enough time to accumulate at least 100 to 200 conversions in each group, and enough time for the conversion cycle to complete fully for users exposed early in the test.
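A simple back-of-the-envelope check is often enough to set a sensible minimum duration. The figures below are hypothetical; the point is the shape of the calculation, not the specific numbers.

```python
# Back-of-the-envelope test duration: time for the smaller (holdout) group to
# accumulate enough conversions, plus time for the conversion cycle to play
# out for users exposed during the test. All figures are assumptions.
daily_audience_conversions = 80    # conversions per day across the whole audience
holdout_share = 0.15               # fraction of the audience held out
target_conversions = 150           # per group; the holdout is the limiting one
conversion_window_days = 14        # typical time from first visit to purchase

days_to_accumulate = target_conversions / (daily_audience_conversions * holdout_share)
recommended_minimum = days_to_accumulate + conversion_window_days

print(f"Accumulation period: about {days_to_accumulate:.0f} days")
print(f"Recommended minimum test length: about {recommended_minimum:.0f} days")
```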
Proper tracking setup is a prerequisite. If your conversion data is incomplete or inconsistently attributed, your incrementality results will be unreliable regardless of how well you design the test. It is worth making sure your GA4 setup is solid and that your UTM parameters are consistently applied before you start drawing conclusions from holdout comparisons.
How to Interpret Incrementality Results Without Overreacting
Incrementality results are a signal, not a verdict. The most common mistake I see is treating a single test as definitive proof that a channel either works or does not.
A test that shows low incremental lift for a channel does not necessarily mean you should cut the channel. It means you should understand why the lift is low. Is the audience too heavily weighted toward existing customers who would have converted anyway? Is the creative too focused on conversion messaging to people who are not yet ready to convert? Is the bidding strategy chasing the same high-intent signals that your organic traffic is already capturing? These are all fixable problems. The incrementality result tells you there is a problem. It does not automatically tell you what the problem is.
Equally, a test that shows strong incremental lift does not mean you should immediately scale spend. It means the channel is driving real outcomes at the current spend level and audience composition. As you scale, you will typically move into audiences with lower baseline intent, and your incremental lift rate will decline. The lift you measured at current scale is not the lift you will get at two times the budget.
The most useful thing you can do with incrementality results is triangulate them with your other data sources. If your incrementality test shows that retargeting is mostly capturing conversions that would have happened anyway, check your organic conversion rate for the same audience segments. Look at your direct traffic conversion behaviour. Look at what happens to conversion rates in the weeks after someone visits your site, regardless of whether they were retargeted. The incrementality result is one input into a broader picture, not the whole picture.
When I was at iProspect, we grew from a team of around 20 people to over 100, and managing that kind of growth meant making budget decisions across dozens of client accounts simultaneously. The clients who made the best decisions over time were not the ones who had the most sophisticated attribution models. They were the ones who were willing to run uncomfortable tests and act on the results, even when the results challenged their assumptions about which channels were working.
When Incrementality Testing Makes Sense and When It Does Not
Incrementality testing is not the right tool for every situation. It requires meaningful conversion volume, a degree of platform access or geographic flexibility, and the organisational willingness to suppress activity in a holdout group for the duration of the test. Not every business or campaign is set up for it.
It is most valuable when you are spending enough in a channel that a 20 or 30 percent reduction in attributed conversions would materially change your budget allocation decision. If you are spending a few hundred pounds a month on a channel, the cost of designing and running a rigorous incrementality test is probably not justified. If you are spending tens of thousands a month, the question of whether that spend is genuinely incremental is one of the most important questions you can ask.
It is also most valuable when you are making a strategic decision about channel mix rather than a tactical optimisation within a channel. If you want to know whether to invest more in prospecting versus retargeting, or whether your brand awareness activity is contributing to lower-funnel performance, incrementality testing gives you evidence that no attribution model can. If you want to know which ad creative performs better within a campaign, A/B testing is a more efficient tool.
For businesses with lower conversion volumes, there are lighter-touch approaches worth considering. Conversion lift studies offered natively by Meta and Google are not as rigorous as a fully designed incrementality test, but they are better than attribution alone. Marketing mix modelling, which uses statistical regression to estimate channel contributions from aggregate data rather than individual-level tracking, is another option, particularly for businesses where cookie-based measurement is unreliable or where the conversion cycle is long enough that user-level attribution breaks down. Understanding how these approaches connect to your broader analytics stack is something the Marketing Analytics hub covers in more depth, including how to build reporting frameworks that hold up under scrutiny.
Building Incrementality Into Your Measurement Routine
The businesses that get the most value from incrementality testing are not the ones that run a single test and declare the matter settled. They are the ones that build a testing cadence into their measurement practice, running tests regularly enough that they have a continuously updated picture of channel incrementality across different audience segments, creative approaches, and spend levels.
This does not require a dedicated measurement team. It requires a commitment to asking the causal question on a regular basis and building the discipline to design tests properly before you run them. Most marketing analytics platforms, including Google Analytics, can be configured to support holdout analysis if you are deliberate about how you structure your audience segments and conversion tracking. More sophisticated reporting setups, including tools like Tableau for visualising test results across cohorts, can make the analysis considerably easier to present to stakeholders who are not statisticians.
The most important thing is to treat incrementality results as a standing input into budget decisions rather than a one-time exercise. Channel performance shifts over time. An audience that showed strong incremental lift 18 months ago may be much more saturated now. A channel that looked weak when you first tested it may have improved as your creative matured. The measurement needs to be ongoing to stay useful.
Early in my career, I built a website from scratch because the business would not give me the budget for one. The lesson I took from that experience was not really about web development. It was about the value of doing the uncomfortable thing rather than accepting the easier story. Incrementality testing is the same discipline applied to measurement. It is harder than reading your attribution dashboard. It will sometimes tell you things you do not want to hear. But it is the only way to know whether your marketing is actually working, rather than just appearing to work in the data you have chosen to look at.
Marketing measurement does not need to be perfect. It needs to be honest. Incrementality testing, done with reasonable rigour and interpreted with appropriate humility, gets you closer to honest than almost anything else in the measurement toolkit.
About the Author
Keith Lacy is a marketing strategist and former agency CEO with 20+ years of experience across agency leadership, performance marketing, and commercial strategy. He writes The Marketing Juice to cut through the noise and share what works.
