Building the self-driving ad campaign, part 2

Christopher Martin
Chartboost Engineering
Dec 15, 2022


In part 1 of this series, we described the challenge of automated daily campaign management for the Chartboost bidder: consistently achieve desired spend and ROAS goals, while accounting for Chartboost’s platform fee. In part 2, we describe how we designed the solution.

Intraday controller: responsive in real time

We can adjust bid multiplier levels, cost per install (CPI), and fees to influence the balance of spend, capacity, and margin. However, these are coarse, directional, laggy levers, carrying different impacts by campaign and time of day. To achieve precision and accuracy in our daily outcome, we turn to a discrete state-based controller that checks real-time campaign progress at short time intervals, classifies it into one of dozens of states, and applies a prescribed adjustment to one or more of the inputs. The thin slicing of time intervals and adjustment sizes is critical to maintaining campaign control throughout the day and to enabling immediate shutoff when a goal is reached.
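As a rough illustration of the controller loop described above, here is a minimal Python sketch. It is not the production controller: the four state names, the pacing rule, the 5% tolerance, and the adjustment sizes are all assumptions made for the example, and the real system uses dozens of states and more than one input.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Progress:
    spend_ratio: float   # spend so far / daily spend goal
    day_fraction: float  # fraction of the day that has elapsed


# Hypothetical prescribed adjustments to the bid multiplier, one per state.
ADJUSTMENTS = {"AHEAD": -0.05, "ON_PACE": 0.0, "BEHIND": 0.05}


def classify(progress: Progress, tolerance: float = 0.05) -> str:
    """Map real-time progress onto a discrete state (assumed pacing rule)."""
    if progress.spend_ratio >= 1.0:
        return "GOAL_REACHED"
    pace = progress.spend_ratio - progress.day_fraction
    if pace > tolerance:
        return "AHEAD"
    if pace < -tolerance:
        return "BEHIND"
    return "ON_PACE"


def control_step(multiplier: float, progress: Progress) -> tuple[str, float]:
    """Run one control interval: classify, then apply the prescribed change."""
    state = classify(progress)
    if state == "GOAL_REACHED":
        return state, 0.0  # immediate shutoff once the goal is reached
    return state, max(0.0, round(multiplier + ADJUSTMENTS[state], 4))
```

Calling `control_step` once per interval with fresh progress data keeps each individual adjustment small, which is the point of slicing both time and step size thinly.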

Input range control

Understanding our campaigns’ bid elasticities is a broad problem that has engaged data scientists for years at Chartboost. We aimed to keep the problem framed as simply as possible, and our solution has proven effective. We treat runaway spend as a “failure,” and the level of the input that leads to it, such as a bid multiplier, as a time-to-failure. (It is, of course, not a time but an exposure level.)

Assuming that these “failure times” are exponentially distributed and reflective of a campaign’s intrinsic elasticity elbow point, we infer that this underlying elbow point is gamma distributed. We then use the lower bound of a 68% credible interval as a soft limit for the bid multiplier: past this point, we allow bid multiplier increases, but with a smaller incremental step size. We update the gamma distribution parameters based on observations of either hitting an elbow point or of safely proceeding past the elbow point estimate. We only reduce the incremental step size when we are within a credible interval of the expected value; this allows us to exploit campaigns with well-evidenced elasticity elbows and move more gingerly otherwise.

Chart 3: Progressively updated gamma distributions and lower credible interval bounds, based on new observations of “highest bid multipliers without extreme spend.”
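One way to make the updating scheme concrete is the standard conjugate setup for exponential data, sketched below in Python. This is an assumption on our part for illustration, not necessarily the exact parameterization in production: a gamma belief is held over the exponential rate, an observed runaway-spend level adds one failure plus its exposure, and a safely passed level is treated as a censored observation that adds exposure only. The prior values, step sizes, and the Monte Carlo quantile are all hypothetical choices.

```python
import random
from dataclasses import dataclass


@dataclass
class ElbowBelief:
    """Gamma(shape, rate) belief over the exponential rate of failure levels."""
    shape: float = 2.0  # hypothetical prior: pseudo-count of observed failures
    rate: float = 3.0   # hypothetical prior: total exposure in multiplier units

    def observe_failure(self, level: float) -> None:
        # Runaway spend at `level`: one more failure plus its exposure.
        self.shape += 1.0
        self.rate += level

    def observe_safe(self, level: float) -> None:
        # Passed `level` without runaway spend: censored, exposure only.
        self.rate += level

    def soft_limit(self, mass: float = 0.68, draws: int = 20000,
                   seed: int = 0) -> float:
        """Lower bound of the central credible interval for the mean failure
        level (1 / rate), estimated from seeded Monte Carlo draws."""
        rng = random.Random(seed)
        # random.gammavariate takes (shape, scale); scale is 1 / rate.
        levels = sorted(1.0 / rng.gammavariate(self.shape, 1.0 / self.rate)
                        for _ in range(draws))
        return levels[int(draws * (1.0 - mass) / 2.0)]


def next_step(multiplier: float, limit: float,
              step: float = 0.10, reduced_step: float = 0.02) -> float:
    """Past the soft limit, increases are still allowed but in smaller steps."""
    return reduced_step if multiplier >= limit else step
```

Under these assumptions, each safe observation pushes the soft limit up and each failure at a low level pulls it down, so the limit tracks the evidence without ever becoming a hard cap.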

The result of this gamma-updating, continuous-learning approach has been to eliminate runaway spend from campaigns under management without any hard limits on our input levers, and without taking on an unnecessarily complex modeling project.

System architecture

In the days of human manipulation of campaigns, bid multipliers and other inputs were adjusted a few times per week at most. Converting to a system that runs 96 times per day (once every 15 minutes) required enhancing the bidder infrastructure (diagram 1).

Diagram 1: Campaign automation architecture

To begin with, we augmented our real-time database to capture and aggregate impression-level predictions of spend and install probability so that we did not have to wait for this logged data to land in our data warehouse. We streamlined the loading of campaign information to the pricer so that a bid multiplier increase/decrease is propagated within minutes.
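The idea of aggregating impression-level predictions as they arrive can be sketched as simple running totals per campaign. The Python below is an in-memory illustration only; the class and field names are hypothetical, and the real system keeps these aggregates in a real-time database rather than in process memory.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class CampaignTotals:
    impressions: int = 0
    predicted_spend: float = 0.0
    expected_installs: float = 0.0  # sum of per-impression install probabilities


class RealTimeAggregator:
    """Keeps running per-campaign totals so the controller can read them
    without waiting for logs to land in the data warehouse."""

    def __init__(self) -> None:
        self._totals = defaultdict(CampaignTotals)

    def record(self, campaign_id: str, predicted_spend: float,
               install_probability: float) -> None:
        totals = self._totals[campaign_id]
        totals.impressions += 1
        totals.predicted_spend += predicted_spend
        totals.expected_installs += install_probability

    def snapshot(self, campaign_id: str) -> CampaignTotals:
        return self._totals[campaign_id]
```

Summing install probabilities gives an expected install count that is available immediately, before any actual installs are attributed.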

Immediate shutoff of campaigns (upon reaching daily goals) is challenging for two reasons. First, there is a delay in impression achievement: impressions that are won can be cached and served later, so we must account for a small “braking distance.” Second, shutoff requires instant communication with hundreds of pricer nodes.

To stop on a dime, we flip a flag in Redis when a campaign is nearly at full attainment and post a message in a Kafka topic that each pricer node subscribes to; this message tells the pricer to check Redis for the updated set of campaigns to negatively target. Kafka and Redis together give us a backdoor to the demand-side configuration that otherwise takes time to re-compile.
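The flag-then-notify pattern above can be sketched with in-memory stand-ins. In the Python below, `FlagStore` stands in for the Redis set and `Topic` for the Kafka topic; neither uses a real client library, and the message text, class names, and the 0.97 "braking" threshold are assumptions for the example.

```python
class FlagStore:
    """In-memory stand-in for the Redis set of campaigns to negatively target."""

    def __init__(self) -> None:
        self._blocked: set[str] = set()

    def block(self, campaign_id: str) -> None:
        self._blocked.add(campaign_id)

    def blocked(self) -> frozenset:
        return frozenset(self._blocked)


class Topic:
    """In-memory stand-in for a Kafka topic: every subscriber gets each message."""

    def __init__(self) -> None:
        self._subscribers = []

    def subscribe(self, callback) -> None:
        self._subscribers.append(callback)

    def publish(self, message: str) -> None:
        for callback in self._subscribers:
            callback(message)


class PricerNode:
    def __init__(self, store: FlagStore, topic: Topic) -> None:
        self._store = store
        self.negative_targets = frozenset()
        topic.subscribe(self._on_message)

    def _on_message(self, message: str) -> None:
        # The message carries no data; it only prompts a re-read of the store.
        if message == "refresh_negative_targets":
            self.negative_targets = self._store.blocked()

    def should_bid(self, campaign_id: str) -> bool:
        return campaign_id not in self.negative_targets


def shut_off_if_near_goal(campaign_id: str, attainment: float, store: FlagStore,
                          topic: Topic, braking_threshold: float = 0.97) -> bool:
    """Flip the flag slightly before 100% to allow for cached impressions."""
    if attainment < braking_threshold:
        return False
    store.block(campaign_id)
    topic.publish("refresh_negative_targets")
    return True
```

Keeping the message content-free and the store authoritative means a node that misses a message can still recover the correct set on its next refresh.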

A business-critical product

With more than 60 percent of our business now under management, campaign automation is a business-critical feature of our client offering. Beyond the logical design, we have made efforts to strengthen product observability and resilience.

We’ve planned for a number of scenarios that either have occurred or could occur. We send numerous metrics to Datadog, which we use for outlier detection, and we are alerted upon job failure, data absence, or unreasonable job duration.

We also need to ensure that unexpected bidder pauses do not cause runaway increases of the controller input values (per the feedback loop). To ensure resilience during downtime, we maintain details about the state of the bidder in Redis and automatically pause input increases without human intervention. With Redis as a circuit breaker, the system (or a human) can quickly pause and unpause the controller without needing to make a code or config file change.
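A minimal sketch of this circuit-breaker check follows, with a plain dictionary standing in for the flags kept in Redis. The key names `controller_paused` and `bidder_healthy` are hypothetical, and the choice to keep allowing decreases while the bidder is down is an assumption for the example.

```python
def guarded_adjustment(multiplier: float, delta: float, flags: dict) -> float:
    """Apply a controller adjustment only if the circuit breaker allows it.

    `flags` stands in for state read from Redis at the start of each run.
    """
    # Manual or automatic pause: leave every input untouched.
    if flags.get("controller_paused", False):
        return multiplier
    # Bidder down: low spend is not a pacing signal, so block increases.
    if delta > 0 and not flags.get("bidder_healthy", True):
        return multiplier
    return multiplier + delta
```

Because the flags are read at run time, flipping one takes effect on the next controller run without a code or config deployment.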

Results and next steps

The current campaign automation product has proven effective in achieving daily spend and capacity targets within tolerance limits never achieved previously at Chartboost. The dramatic reduction in campaign performance volatility has helped our engineering teams uncover optimization opportunities within our platform and allowed the data science/product team to begin fine-tuning our daily campaign pacing. We expect to continue posting about interesting data science problems and our approach to solving them.

Campaign automation at Chartboost has been a team effort among data scientists Christopher Martin, Alireza Samsamshariat, and Nithish Bolleddula.
