Propensity Modeling: Predict Who Buys Before They Do

Propensity modeling is a statistical technique that estimates the likelihood of a specific individual taking a specific action, whether that’s making a purchase, churning, clicking, or converting. It takes historical behavioral data, runs it through a predictive model, and outputs a score for each person in your audience that tells you how likely they are to do the thing you care about.

Done well, it shifts your marketing from broadcasting to targeting. Instead of spending budget on everyone who looks vaguely similar to your customer, you spend it on the people most likely to actually convert. That distinction, between who looks like a customer and who is likely to become one, is where propensity modeling earns its place in a serious analytics stack.

Key Takeaways

  • Propensity modeling scores individuals by their likelihood to convert, churn, or take a specific action, using historical behavioral data as its foundation.
  • The quality of your model depends almost entirely on the quality and relevance of your input data, not the sophistication of the algorithm.
  • Propensity scores are most valuable when they inform budget allocation and audience segmentation, not just as a reporting metric.
  • Most teams underuse propensity modeling because they conflate it with lookalike audiences, which measure similarity rather than likelihood.
  • Propensity models decay. A model built on pre-pandemic behavior will mislead you if you run it in a different market context without retraining.

If you want to situate propensity modeling within a broader analytics framework, the Marketing Analytics and GA4 hub covers the wider measurement landscape, from attribution to data infrastructure, and is worth reading alongside this piece.

What Problem Does Propensity Modeling Actually Solve?

The honest answer is that it solves a budget allocation problem. You have a finite amount of money. You have an audience of varying quality. Propensity modeling helps you identify where in that audience your money will work hardest.

I spent time early in my career at lastminute.com, where we were running paid search campaigns against audiences with wildly different intent signals. One of the things that struck me then, and still does, is how much budget gets wasted on people who were never going to convert regardless of how good the ad was. We’d launch a campaign for a music festival and see six figures of revenue land within a day. But we also knew that a meaningful chunk of impressions and clicks were going to people who were browsing, not buying. Propensity modeling is essentially the systematic version of learning to tell those two groups apart before you spend the money.

The technique is particularly useful in three scenarios. First, when you have a large CRM database and want to prioritise outreach. Second, when you’re running paid media and want to suppress low-propensity audiences or increase bids on high-propensity ones. Third, when you’re trying to predict churn and want to intervene before a customer leaves, rather than after.

How Does a Propensity Model Actually Work?

At its core, a propensity model is a classification or regression model trained on historical data. You take a set of people for whom you know the outcome (they converted or they didn’t, they churned or they didn’t), you feed the model a set of features about those people (recency of last purchase, number of sessions, product categories browsed, email engagement, demographic signals), and the model learns which combinations of features are predictive of the outcome.

Once trained, you apply the model to your current audience, and it outputs a score, typically between 0 and 1, representing the probability of that person taking the target action. You then rank your audience by score and make decisions accordingly.

The most commonly used algorithms for propensity modeling include logistic regression, gradient boosting models like XGBoost, and random forests. Logistic regression is often underrated here. It’s interpretable, fast to train, and performs surprisingly well when your feature engineering is solid. More complex models can squeeze out marginal performance gains, but they’re harder to explain to a client or a CFO, and in my experience, explainability matters more than people admit when budgets are on the line.

The features you select matter more than the algorithm you choose. A well-engineered feature set fed into logistic regression will outperform a poorly specified gradient boosting model almost every time. This is where the real analytical work happens, and it’s where most teams underinvest.
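To make the training-and-scoring loop concrete, here is a minimal sketch in Python, assuming scikit-learn is available. The features, the synthetic outcome, and the audience rows are all invented for illustration; a real model would use your own behavioral signals.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)
n = 1_000

# Invented historical features: one row per person.
X_train = np.column_stack([
    rng.integers(1, 90, n),   # days since last visit (recency)
    rng.integers(0, 30, n),   # sessions in the last 30 days
    rng.integers(0, 10, n),   # email clicks in the last 30 days
])

# Synthetic outcome: recent, engaged people convert more often.
logit = 0.03 * X_train[:, 0] - 0.1 * X_train[:, 1] - 0.2 * X_train[:, 2]
y_train = rng.random(n) < 1 / (1 + np.exp(logit))

model = LogisticRegression(max_iter=1_000).fit(X_train, y_train)

# Score the current audience and rank by propensity, highest first.
X_audience = np.array([[5, 20, 4], [80, 1, 0], [30, 8, 1]])
scores = model.predict_proba(X_audience)[:, 1]   # P(convert) per person
ranking = np.argsort(scores)[::-1]
```

The interesting work sits in the feature columns, not in the two lines that fit and score the model, which is exactly the point about feature engineering above.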

What Data Do You Need to Build a Propensity Model?

You need historical data that contains both the features you want to use as predictors and the outcome you’re trying to predict. That sounds obvious, but it rules out a lot of organisations that either haven’t been collecting the right signals or haven’t been storing them in a way that’s usable.

The minimum viable dataset typically includes: a unique identifier per person, a set of behavioral features (session frequency, recency, depth of engagement, product interactions), and a clear binary or continuous outcome variable. For purchase propensity, that’s whether they bought within a defined window. For churn propensity, that’s whether they cancelled or went inactive.
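As a sketch of what "model-ready" means in practice, the structure below uses invented column names (`sessions_30d`, `bought_in_window`, and so on); the point is one row per person, predictors plus outcome, nothing more.

```python
# A minimal model-ready dataset: one row per person, behavioral
# features plus a binary outcome. Column names are invented.
rows = [
    {"user_id": "u001", "sessions_30d": 12, "days_since_last_visit": 2,
     "pages_per_session": 6.1, "bought_in_window": 1},
    {"user_id": "u002", "sessions_30d": 1, "days_since_last_visit": 45,
     "pages_per_session": 1.3, "bought_in_window": 0},
    {"user_id": "u003", "sessions_30d": 5, "days_since_last_visit": 9,
     "pages_per_session": 3.0, "bought_in_window": 0},
]

# Split into a feature matrix and an outcome vector for training.
feature_cols = ["sessions_30d", "days_since_last_visit", "pages_per_session"]
X = [[r[c] for c in feature_cols] for r in rows]
y = [r["bought_in_window"] for r in rows]
```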

One of the persistent gaps I’ve seen is that teams track plenty of activity data but haven’t connected it to individual-level outcomes in a way that’s model-ready. They know what happened on the website, but they can’t link it back to the specific person and what they did next. This is where getting your GA4 implementation right matters, and understanding what data GA4 goals are unable to track is a useful starting point for identifying where your data has blind spots before you try to build on top of it.

For richer models, you’d layer in first-party CRM data (purchase history, lifetime value, product categories, support interactions), email engagement data (open rates, click rates, time since last engagement), and where available, third-party demographic or firmographic data. The more complete your picture of individual behavior, the better the model will perform.

If you’re exporting GA4 data to BigQuery for model training, Moz’s Whiteboard Friday on GA4 to BigQuery exports is a practical reference for understanding what that pipeline looks like and what you gain from it.

Propensity Modeling vs Lookalike Audiences: Not the Same Thing

This distinction matters and gets blurred constantly. Lookalike audiences, the kind you build in Meta or Google, identify people who share characteristics with your existing customers. They measure similarity. Propensity models measure likelihood. Those are different questions.

A person can look very similar to your best customer and still have low propensity to convert right now, because they’re at the wrong stage of their decision process, because the category isn’t salient for them at this moment, or because the specific product doesn’t fit their situation. Conversely, someone who doesn’t look like your typical customer profile might have very high propensity based on their recent behavioral signals.

Lookalike audiences are a reasonable top-of-funnel tool for prospecting. Propensity models are more suited to mid-to-lower funnel decisions, where you have enough behavioral data on individuals to make a meaningful prediction. Using both in the right places is sensible. Conflating them is where teams go wrong.

This also connects to how you think about attribution. Propensity modeling tells you who is likely to convert. Attribution tells you which channels and touchpoints influenced that conversion. They’re complementary, and understanding how they interact is part of building a measurement system that actually supports decisions. If you haven’t thought carefully about your attribution framework, attribution theory in marketing is worth reading before you try to layer propensity scoring on top of an attribution model that isn’t working.

How Do You Use Propensity Scores in Practice?

The score itself is not the deliverable. What you do with it is. There are four practical applications that consistently generate commercial value.

The first is paid media bid adjustment. If you’re running programmatic or paid search campaigns and you can pass propensity scores into your bidding logic, you can increase bids for high-propensity individuals and suppress or reduce bids for low-propensity ones. This is one of the clearest ways to improve return on ad spend without changing creative or copy.
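A hedged sketch of what score-to-bid logic might look like. The multiplier scheme, floor, and cap here are invented for illustration, not a recommendation; real values should come from testing against your own economics.

```python
def bid_multiplier(score, floor=0.3, cap=2.0):
    """Map a 0-1 propensity score to a bid multiplier.

    Invented scheme: a 0.5 score bids at 1.0x; low scores are
    suppressed toward the floor, high scores boosted toward the cap.
    """
    return max(floor, min(cap, 2.0 * score))

base_bid = 1.50  # currency units; invented
audience_scores = {"u001": 0.85, "u002": 0.10, "u003": 0.50}
adjusted = {uid: round(base_bid * bid_multiplier(s), 2)
            for uid, s in audience_scores.items()}
# u001 is boosted, u002 is suppressed to the floor, u003 stays at base.
```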

The second is CRM prioritisation. If your sales or retention team has a finite number of outreach slots, propensity scores let you rank the list and work the highest-value opportunities first. I’ve seen this applied in B2B contexts where a sales team of ten people had a prospect list of several thousand. Propensity scoring cut the effective working list down to the 200 accounts worth prioritising that month, and conversion rates on those outreach efforts were meaningfully higher than the untargeted approach they’d been using before.

The third is email segmentation. Rather than sending the same message to your entire database, you segment by propensity band and tailor the message and offer accordingly. High-propensity contacts might get a direct call to action. Low-propensity contacts might get nurture content designed to build category salience. This is a more sophisticated version of what most email platforms support natively, and it’s one of the reasons tracking the right email metrics matters so much when you’re trying to validate whether your segmentation is working.
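As a sketch of banded segmentation, with invented thresholds and contacts; in practice, band boundaries should be set from your own score distribution (deciles are a common starting point).

```python
def propensity_band(score):
    """Map a 0-1 propensity score to a messaging band.

    Thresholds are invented for illustration; tune them to your
    own score distribution.
    """
    if score >= 0.7:
        return "high"    # direct call to action
    if score >= 0.3:
        return "medium"  # offer-led message
    return "low"         # nurture content to build category salience

# Invented scored contacts.
audience = {"a@example.com": 0.91, "b@example.com": 0.42, "c@example.com": 0.05}

segments = {}
for contact, score in audience.items():
    segments.setdefault(propensity_band(score), []).append(contact)
```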

The fourth is churn prevention. Churn propensity models identify customers showing early disengagement signals before they cancel. The intervention window is usually short, so having a model that flags high-risk customers two to four weeks before typical churn events gives your retention team time to act. This is where propensity modeling has some of its clearest ROI, because the cost of retaining a customer is almost always lower than the cost of acquiring a new one.

What Makes a Propensity Model Fail?

Most propensity model failures are data problems dressed up as model problems. The four I see most often are these.

Label leakage. This happens when your training data accidentally includes features that are only available after the outcome has occurred. If you include “sent a cancellation email” as a feature in a churn model, you’re training on information you wouldn’t have had at prediction time. The model looks brilliant in training and useless in production.
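One practical guard against leakage is a hard feature cutoff: only events timestamped before the moment you would actually score the person may become features. A sketch with an invented event log:

```python
from datetime import date

# Invented event log for one customer.
events = [
    {"name": "login",              "when": date(2024, 1, 5)},
    {"name": "support_ticket",     "when": date(2024, 1, 20)},
    {"name": "cancellation_email", "when": date(2024, 2, 10)},  # post-outcome signal
]

# The moment we would actually score this person in production.
prediction_date = date(2024, 2, 1)

# Only events strictly before the prediction date are legal features;
# anything later (like the cancellation email) would leak the outcome.
feature_events = [e["name"] for e in events if e["when"] < prediction_date]
```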

Class imbalance. If 2% of your audience converts and 98% doesn’t, a model that predicts “no conversion” for everyone will be 98% accurate and completely useless. You need to handle class imbalance deliberately: resampling techniques or adjusted class weights on the training side, and metrics like AUC-ROC or precision-recall rather than raw accuracy on the evaluation side.
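The accuracy trap is easy to demonstrate. The sketch below, assuming scikit-learn is available, builds an invented dataset with roughly 2% positives and scores an "always no" model two ways:

```python
import numpy as np
from sklearn.metrics import accuracy_score, roc_auc_score

rng = np.random.default_rng(0)

# Invented labels: roughly 2% positives, mirroring the 2% conversion example.
y_true = (rng.random(5_000) < 0.02).astype(int)

# A "model" that predicts no conversion for everyone.
always_no_labels = np.zeros_like(y_true)  # hard predictions
always_no_scores = np.zeros(len(y_true))  # propensity scores, all 0.0

acc = accuracy_score(y_true, always_no_labels)  # ~0.98: looks impressive
auc = roc_auc_score(y_true, always_no_scores)   # 0.5: no better than chance
```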

Model decay. A model trained on behavior from 18 months ago may not reflect how your audience behaves today. Markets shift, product mixes change, customer expectations evolve. I’ve seen teams build a propensity model once, declare it done, and then wonder why performance is degrading six months later. Models need to be retrained on a schedule that reflects how quickly your market changes.

Treating the score as a fact rather than a probability. A propensity score of 0.85 doesn’t mean that person will definitely convert. It means they’re in a cohort that historically converts at a high rate. There’s still uncertainty, and decisions made on propensity scores should account for that. When I was running agency teams and we were presenting propensity-informed audience strategies to clients, the discipline of explaining what the score means and what it doesn’t mean was as important as the model itself. Overselling predictive confidence is how you lose credibility when the model doesn’t perform perfectly.

Propensity Modeling and Incrementality

One nuance that often gets missed: propensity models predict who is likely to convert, but they don’t tell you who is likely to convert because of your marketing. Those are different questions, and conflating them leads to wasted spend.

Your highest-propensity individuals may be the people most likely to convert anyway, regardless of whether you advertise to them. If you concentrate your spend on the top propensity decile, you may be paying to influence people who were already going to buy. This is the incrementality problem, and it’s one of the reasons propensity modeling works best when it’s combined with incrementality testing rather than used in isolation.

The same logic applies when you’re evaluating channel performance. If your highest-propensity customers also happen to be heavy email users, your email channel will look like a conversion machine even if the email itself isn’t causing the conversion. Understanding how to measure incrementality in affiliate marketing gives you a practical framework for thinking about this, and the principles transfer directly to propensity-informed campaigns in other channels.

The more rigorous approach is to use propensity scores to identify your high-value segments, then run holdout tests within those segments to establish whether your marketing is actually causing incremental conversions among them. It’s more work, but it’s the difference between a model that looks good and a model that actually improves your economics.
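A sketch of that segment-then-holdout design, with an invented scored audience; the split proportions here are illustrative only, and the statistical comparison of conversion rates between the two groups is the part that comes afterwards.

```python
import random

# Invented scored audience: (user_id, propensity score).
scored = [(f"user_{i}", random.Random(i).random()) for i in range(1_000)]

# Take the top decile by propensity score.
scored.sort(key=lambda pair: pair[1], reverse=True)
top_decile = [uid for uid, _ in scored[: len(scored) // 10]]

# Hold out a random 20% of the segment from the campaign entirely.
# Comparing conversion rates between treated and holdout users later
# estimates the incremental effect of marketing to this segment.
rng = random.Random(7)
holdout = set(rng.sample(top_decile, k=len(top_decile) // 5))
treated = [uid for uid in top_decile if uid not in holdout]
```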

Propensity Modeling in the Context of Newer Measurement Challenges

The measurement environment has changed significantly over the past few years. Signal loss from cookie deprecation, privacy regulation, and platform walled gardens has made individual-level tracking harder. Propensity modeling, which relies on individual-level behavioral data, is affected by this.

The teams that are handling this most effectively are the ones investing in first-party data infrastructure. If you can’t rely on third-party cookies to track individual behavior across the web, your own CRM data, your email engagement data, your on-site behavioral data, and your transaction history become more valuable, not less. Propensity models built on clean, rich first-party data will outperform models built on degraded third-party signals.

There’s also an interesting question about how propensity modeling interacts with newer marketing channels. As AI-generated content and AI avatars become more prevalent in marketing, understanding which audience segments are most likely to engage with those formats is a legitimate modeling problem. The same framework applies: define the outcome, identify the predictive features, score the audience. If you’re thinking about how to evaluate performance in those newer formats, measuring the effectiveness of AI avatars in marketing is a useful reference for the measurement side of that question.

Similarly, as generative engine optimization becomes a real channel for some categories, understanding which audience segments are most likely to discover and engage through those channels has propensity modeling implications. The measurement frameworks are still developing, but the underlying logic is the same. For a sense of where GEO measurement is heading, measuring the success of generative engine optimization campaigns covers the current state of that question.

The Commercial Case for Propensity Modeling

I’ve spent a lot of time over the years making the case for analytical investment to boards and CFOs who want to see a number before they’ll sign off. The commercial case for propensity modeling is straightforward: you’re spending the same budget more efficiently by concentrating it on the people most likely to respond.

The gains compound. Better audience targeting reduces wasted impressions. Reduced wasted impressions improve your cost per acquisition. Improved CPA means the same budget generates more conversions, or the same number of conversions at lower cost. Over time, that compounds into a meaningful structural advantage.

The challenge is that propensity modeling requires upfront investment in data infrastructure, analytical capability, and model development before you see those returns. That’s a harder sell than a new creative campaign, which shows results (or doesn’t) within weeks. Part of building a mature analytics function is making the case for investments that pay off over a longer horizon.

When I grew an agency from 20 to 100 people and moved it from loss-making to a top-five position in its category, a significant part of that was building analytical capabilities that clients couldn’t get elsewhere. Propensity modeling was part of that toolkit. It wasn’t magic, and it wasn’t always easy to explain, but it consistently delivered better media efficiency than campaigns running without it. That’s the commercial case in plain terms.

For teams thinking about how propensity modeling fits into a broader inbound and content strategy, it’s worth connecting the scoring logic to your inbound marketing ROI framework. High-propensity leads generated through inbound channels are a different commercial proposition than high-propensity leads generated through paid. Understanding both helps you allocate across channels more intelligently.

Getting propensity modeling right is one piece of a larger measurement puzzle. If you want to understand how it connects to the broader analytics and GA4 ecosystem, the Marketing Analytics and GA4 hub covers the full range of measurement disciplines that support this kind of work, from event tracking to KPI frameworks to attribution.

A Note on Tooling

You don’t need a data science team to start with propensity modeling, but you do need clean data and someone who understands the fundamentals of model building. Python and R both have well-documented libraries for the algorithms involved. BigQuery ML allows you to run logistic regression and other classification models directly on your GA4 data without exporting it. Platforms like Salesforce Einstein and HubSpot have propensity-style scoring built in, though they’re less transparent about their methodology.

The tooling choice matters less than the data quality and the clarity of the business question you’re trying to answer. I’ve seen teams spend months selecting a platform and then discover that their underlying data isn’t clean enough to support meaningful modeling. Start with the data. The tooling is secondary.

For teams using GA4 as a data source, understanding how to set up custom event tracking properly is a prerequisite for building reliable features for your model. Moz’s guide to GA4 custom event tracking is a solid reference for getting that foundation right. And if you’re building KPI dashboards to communicate model performance to stakeholders, Semrush’s guide to KPI reporting covers how to structure those outputs in a way that drives decisions rather than just documents activity.

One final point. Propensity modeling is not a substitute for good marketing judgment. It’s a tool that makes your judgment more precise. The model tells you who is likely to respond. You still need to know what to say to them, when to say it, and through which channel. Early in my career, when I was teaching myself to code because the MD wouldn’t give me budget for a new website, the lesson I took was that understanding the tools at a technical level makes you a better decision-maker, not just a more capable executor. Propensity modeling rewards the same disposition: understand how it works, understand its limits, and use it in service of a commercial objective rather than as an end in itself.

About the Author

Keith Lacy is a marketing strategist and former agency CEO with 20+ years of experience across agency leadership, performance marketing, and commercial strategy. He writes The Marketing Juice to cut through the noise and share what works.

Frequently Asked Questions

What is propensity modeling in marketing?
Propensity modeling is a statistical technique that assigns a probability score to each individual in your audience, estimating how likely they are to take a specific action such as purchasing, churning, or clicking. The model is trained on historical data where the outcome is known, then applied to current audiences to prioritise targeting and budget allocation.
How is propensity modeling different from lookalike audiences?
Lookalike audiences identify people who share characteristics with your existing customers, measuring similarity. Propensity models measure likelihood to take a specific action based on behavioral signals. A person can look like your best customer but have low propensity to convert right now, and vice versa. They serve different purposes and work best when used at different stages of the funnel.
What data do you need to build a propensity model?
You need historical data that includes both the behavioral features you want to use as predictors (session frequency, recency, product interactions, email engagement) and a clearly defined outcome variable (purchase, churn, conversion). The data needs to be at the individual level and connected across touchpoints. First-party CRM and behavioral data from your own platforms is typically the most valuable input.
How often should a propensity model be retrained?
Propensity models decay over time as customer behavior, market conditions, and product mixes change. There is no universal rule, but most models should be retrained at least quarterly, and more frequently in fast-moving categories or after significant market events. Monitoring model performance on holdout data over time is the most reliable way to identify when retraining is needed.
Can small marketing teams use propensity modeling without a data science team?
Yes, with caveats. Tools like BigQuery ML, HubSpot’s predictive lead scoring, and Salesforce Einstein offer propensity-style scoring without requiring custom model development. Python and R also have accessible libraries for building basic models. The limiting factor is usually data quality rather than technical capability. Clean, connected first-party data is the prerequisite, and that is where smaller teams should invest first.
