Summary & FAQ
Incrementality Measurement Must-Haves
- A Desire to Learn: Start with a clear objective, a hypothesis, and a willingness to adjust account structure / strategy for at least 30 days to answer your questions.
- Data: Ensure you have long-term, geo-level business data that the Thrive data team can collect. This is essential for kicking off the modelling process before launching any incrementality test.
- Alignment: Any changes to test design will require re-adjusting and validating our measurement models. Close alignment on budget, hypotheses, acceptable risk, and account structure is key to making this process run smoothly and efficiently!
FAQ
Q: Can we make account changes during the test period (e.g., increasing / decreasing spend, audience exclusions, bids)?
A: Incrementality tests are meant to provide a snapshot of the impact of advertising activity, which naturally includes general optimizations like budget and bid adjustments, campaign launches, etc. All of this is okay, so long as you aren’t making drastic strategic shifts (e.g., moving from 10% to 50% BoF spend mid-test) or heavy budget changes that could impede measuring an effect. Check in with the data team to understand which changes are okay.
Q: What if something happens during the test, like a spike in the stock market or a new product launch?
A: We aren’t measuring before/after changes, but the difference between actual and predicted test-group performance (where the prediction is based on holdout-group performance). With balanced geo groups, we expect the spikes and dips arising from general business, macro, or seasonal events and trends to be mirrored across both groups. If something unique does happen to particular geo(s) in one group but not the other (e.g., a retail store launch or closure), we would explore the circumstances and adjust our models as needed.
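To make the test-vs-predicted comparison concrete, here is a minimal sketch of the lift calculation with hypothetical numbers (the figures are illustrative only):

```python
# Hypothetical figures for illustration only.
predicted_conversions = 1_000  # the model's counterfactual for the test geos
actual_conversions = 1_150     # what the test geos actually delivered

incremental = actual_conversions - predicted_conversions   # 150
lift_pct = incremental / predicted_conversions             # 0.15 -> +15%
print(f"Incremental conversions: {incremental} ({lift_pct:+.0%})")
```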
Q: Can we pick our test geos manually?
A: Yes, this is possible, but we typically recommend selecting geos via statistical clustering (see the sketch after this list) to reduce bias and ensure we end up with test and holdout groups that are:
- Historically predictive of one another across primary KPIs
- Broad enough to minimize noise
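For intuition, here is a minimal sketch of a clustering-based split, assuming a pandas DataFrame of daily KPIs with one column per geo (the function, column layout, and parameters are illustrative assumptions, not our production pipeline):

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

def split_geos(geo_kpis: pd.DataFrame, n_clusters: int = 5, seed: int = 42):
    """Cluster geos by the shape of their KPI history, then split each
    cluster between test and holdout so the groups mirror each other."""
    series = geo_kpis.T.values  # one row per geo, one column per day
    # Z-score each geo's own series so clustering keys on trend and
    # seasonality rather than raw market size.
    shapes = (series - series.mean(axis=1, keepdims=True)) / (
        series.std(axis=1, keepdims=True) + 1e-9
    )
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(shapes)

    rng = np.random.default_rng(seed)
    test, holdout = [], []
    for cluster in range(n_clusters):
        geos = [g for g, lbl in zip(geo_kpis.columns, labels) if lbl == cluster]
        rng.shuffle(geos)
        test.extend(geos[: len(geos) // 2])
        holdout.extend(geos[len(geos) // 2 :])
    return test, holdout
```

Splitting within clusters, rather than across all geos at once, is what keeps the two groups historically predictive of one another across the primary KPIs.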
Q: Why can’t we directly compare holdout and test group performance?
A: No geos or groups of geos have a perfect 1:1 relationship, which is what a direct comparison would assume. For example, if we run a test on 10% of US DMAs and hold out the other 90%, we wouldn’t expect the same scale of results; the test group, being a smaller sample, is also likely to show more volatility or noise in its results. Models factor in the trends in the relationship between test and holdout results over time to create a far more accurate prediction of how the test group will perform relative to the holdout.
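As an illustration of the modelling idea (a simple linear sketch on simulated data, not our production model), the pre-test relationship between the groups is learned and then used to predict the test group’s counterfactual during the test window:

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Simulated daily KPIs: a small test group that tracks the holdout closely
# pre-test, then receives a ~10% incremental effect during the test window.
rng = np.random.default_rng(0)
days = 120
holdout = 1_000 + 50 * np.sin(np.arange(days) / 7) + rng.normal(0, 20, days)
test = 0.10 * holdout + rng.normal(0, 3, days)
in_test = np.arange(days) >= 90
test[in_test] *= 1.10

df = pd.DataFrame({"holdout_kpi": holdout, "test_kpi": test, "in_test": in_test})
pre, post = df[~df["in_test"]], df[df["in_test"]]

# Learn how the test group tracks the holdout before the intervention...
model = LinearRegression().fit(pre[["holdout_kpi"]], pre["test_kpi"])
# ...then predict what the test group would have done without it.
counterfactual = model.predict(post[["holdout_kpi"]])

incremental = post["test_kpi"].sum() - counterfactual.sum()
print(f"Estimated lift: {incremental / counterfactual.sum():+.1%}")  # ~ +10%
```

Because the relationship itself is modelled, differences in scale and noise between the groups are handled automatically; in practice, richer models with trend and seasonality terms would be used.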