Guide: Configuring the Model Run

Version: 2.5.0

This guide explains how to set up the YAML configuration file required to run the ammm Media Mix Model.

Overview

The configuration file defines how ammm handles your data, model parameters, and components like seasonality and control variables.

Quick Start

1. Copy an Example: Start with one of the example configurations below.
2. Prepare Data: Ensure your input CSV file is ready, following the Data Preparation Guide.
3. Set Core Options:
   - Define raw_data_granularity (daily or weekly).
   - Specify the target_col (your main outcome metric, e.g., sales, conversions).
   - List your media channels with display_name, impressions_col, and spend_col for each.
4. Configure Prophet: Set up the required prophet section for seasonality and trend.
5. Specify Data Range: Use train_test_ratio or start_date/end_date under data_rows.
6. Run the Model: Execute using the driver script (e.g., demo/runme.py).
7. Iterate: Review results and adjust parameters based on model diagnostics.
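
Before running the model, it can help to check that the core options are all present. The sketch below is a hypothetical pre-flight check, not part of ammm: the function name `check_core_options` is an assumption, and the config is shown as a plain Python dict, as it might look after the YAML file has been loaded.

```python
# Hypothetical pre-flight check; ammm may validate its config differently.
REQUIRED_MEDIA_KEYS = ("display_name", "impressions_col", "spend_col")


def check_core_options(config):
    """Return a list of human-readable problems found in a config dict."""
    problems = []
    if config.get("raw_data_granularity") not in ("daily", "weekly"):
        problems.append("raw_data_granularity must be 'daily' or 'weekly'")
    if not config.get("target_col"):
        problems.append("target_col is missing")
    media = config.get("media") or []
    if not media:
        problems.append("media must list at least one channel")
    for i, channel in enumerate(media):
        for key in REQUIRED_MEDIA_KEYS:
            if key not in channel:
                problems.append(f"media[{i}] is missing {key}")
    if "prophet" not in config:
        problems.append("prophet section is required")
    return problems


# Config as it might look after loading the YAML into a dict.
config = {
    "raw_data_granularity": "weekly",
    "target_col": "subscribers",
    "media": [
        {
            "display_name": "Media Channel 1 (BVOD)",
            "impressions_col": "media_imp_1",
            "spend_col": "media_cost_1",
        }
    ],
    "prophet": {"yearly_seasonality": True, "trend": True},
}
print(check_core_options(config))  # prints []
```

An empty list means the core options from the steps above are all in place; each string in a non-empty list names one missing or invalid option.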

Configuration File Structure

The YAML file includes these main sections:

Data Handling Options (data_rows): Controls how the input data is read and split.
Column Definitions: Maps column names to their roles (target_col, media, control_columns, ignore_cols).
Prophet Integration: Required configuration for seasonality and trend decomposition.
Model Parameters: Sets parameters influencing model fitting (sampler settings, adstock).
Custom Priors (custom_priors): Optional Bayesian priors for model parameters.

For detailed parameter explanations, see the Configuration Reference.

Key Sections Explained

Run Mode Selection

AMMM supports two run modes:

scenario_planning (default): Evaluates multiple budget allocation scenarios. Generates scenario comparison files (scenario_*.png, budget_scenario_results.csv).
optimization: Finds the mathematically optimal budget allocation. Generates optimization results (optimization_results.csv, budget_optimisation.png).

run_mode: "scenario_planning"  # or "optimization"

Both modes generate the complete set of standard outputs including model diagnostics, decomposition, and performance metrics.

Data Handling Options (`data_rows`)

Use train_test_ratio (e.g., 0.8 for 80% train, 20% test) for simple time-based splits. A value of 1.0 uses all data for training.
Alternatively, use start_date and end_date (YYYY-MM-DD format) to select a specific time window.
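
The two options above can be sketched in a few lines of Python. This is an illustration only: the helper names `select_window` and `time_split` are hypothetical, and treating end_date as inclusive and rounding the cut point to the nearest row are assumptions rather than documented ammm behaviour.

```python
from datetime import date


def select_window(rows, start_date, end_date):
    """Keep rows whose date falls within [start_date, end_date] (inclusive, by assumption)."""
    return [r for r in rows if start_date <= r["date"] <= end_date]


def time_split(rows, train_test_ratio):
    """Split rows by time: the earliest share becomes the training set."""
    ordered = sorted(rows, key=lambda r: r["date"])
    cut = round(len(ordered) * train_test_ratio)
    return ordered[:cut], ordered[cut:]


# Ten made-up daily rows, purely for illustration.
rows = [{"date": date(2022, 1, d), "sales": 100 + d} for d in range(1, 11)]

train, test = time_split(rows, 0.8)
print(len(train), len(test))  # prints 8 2

window = select_window(rows, date(2022, 1, 3), date(2022, 1, 5))
print(len(window))  # prints 3
```

Because the split is time-based, every training row precedes every test row, and a ratio of 1.0 leaves the test set empty.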

Column Definitions

date_col: Name of the date column in your CSV (defaults to “date”).
target_col: The main metric to predict/explain.
target_type: Specifies “revenue” or “conversion”. Influences naming of output metrics. Defaults to “revenue”.
media: Defines each marketing channel. For each channel, impressions_col is the input to the model's media transformations, spend_col is used for ROI calculation, and display_name is used in reporting.
control_columns: External factors that influence the target (e.g., competitor spend, promotions). Must be numeric. The example configuration below lists such columns under extra_features_cols. Prophet-generated components are added automatically.
extra_features_impact: Optional dictionary mapping control columns to 'negative' to invert their values (e.g., {competitor_spend: 'negative'}).
ignore_cols: Columns to exclude from processing.
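
To make the effect of extra_features_impact concrete, here is a minimal sketch. It assumes that "invert" means flipping the sign of the column, which this guide does not spell out, and the helper name `apply_feature_impact` is hypothetical.

```python
def apply_feature_impact(columns, extra_features_impact):
    """Return a copy of the columns with 'negative' ones multiplied by -1.

    Assumption: 'invert' is taken to mean a sign flip; ammm may differ.
    """
    adjusted = {}
    for name, values in columns.items():
        sign = -1.0 if extra_features_impact.get(name) == "negative" else 1.0
        adjusted[name] = [sign * v for v in values]
    return adjusted


# Made-up control columns for illustration.
controls = {"competitor_spend": [10.0, 20.0], "price": [5.0, 6.0]}
adjusted = apply_feature_impact(controls, {"competitor_spend": "negative"})
print(adjusted["competitor_spend"])  # prints [-10.0, -20.0]
print(adjusted["price"])             # prints [5.0, 6.0]
```

Columns that are not listed in extra_features_impact pass through unchanged.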

Model Parameters

Sampler Settings (tune, draws, chains, target_accept): Control MCMC sampling. See Performance Guidelines for recommendations.
adstock_max_lag: Maximum carry-over period for marketing effects. Choose based on expected advertising impact duration (e.g., 4-12 weeks for weekly data, 7-28 days for daily data).
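
The role of adstock_max_lag is easiest to see in a small carry-over calculation. The sketch below assumes the common geometric form, in which a period's media activity contributes to later periods with weight alpha**lag, and treats the lag window as inclusive and unnormalized. The function `geometric_adstock` is written here for illustration and is not ammm's implementation.

```python
def geometric_adstock(series, alpha, max_lag):
    """Sum current and past values with weights alpha**lag for lag = 0..max_lag.

    Assumptions: geometric weights, inclusive lag window, no normalization.
    """
    out = []
    for t in range(len(series)):
        total = 0.0
        for lag in range(min(t, max_lag) + 1):
            total += (alpha ** lag) * series[t - lag]
        out.append(total)
    return out


# A single burst of 100 impressions followed by silence.
impressions = [100.0, 0.0, 0.0, 0.0, 0.0]
print(geometric_adstock(impressions, alpha=0.5, max_lag=2))
# prints [100.0, 50.0, 25.0, 0.0, 0.0]
```

With max_lag set to 2, the burst stops contributing after two periods. A larger adstock_max_lag lets the tail run longer, at the cost of more computation per sample.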

Prophet Integration (`prophet`)

Prophet integration is required for seasonality and trend decomposition. Prophet runs first, and its components are added as features for the main PyMC model.

holiday_country: Country code for holiday calendar (e.g., ‘US’, ‘GB’).
include_holidays: Set to true to include holidays.
daily_seasonality, weekly_seasonality, yearly_seasonality: Enable specific seasonality components.
trend: Enable trend component.
Automatic Feature Addition: Prophet components (trend, daily, weekly, yearly, holidays) are automatically added to control_columns.

Note: All seasonality is handled through Prophet, which models seasonal patterns with Fourier series internally.
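
As a rough illustration of the automatic feature addition, the sketch below lists which components a given prophet section would switch on. The helper `prophet_components` is hypothetical, and the returned names simply follow the component list above; the actual column names ammm generates may differ.

```python
def prophet_components(prophet_cfg):
    """List the Prophet components enabled by a prophet config dict.

    Names follow the guide's component list; real column names are an assumption.
    """
    names = []
    if prophet_cfg.get("trend"):
        names.append("trend")
    for season in ("daily", "weekly", "yearly"):
        if prophet_cfg.get(f"{season}_seasonality"):
            names.append(season)
    if prophet_cfg.get("include_holidays"):
        names.append("holidays")
    return names


# The prophet section from the Basic Weekly Model example, as a dict.
prophet_cfg = {
    "include_holidays": True,
    "holiday_country": "US",
    "yearly_seasonality": True,
    "trend": True,
}
print(prophet_components(prophet_cfg))  # prints ['trend', 'yearly', 'holidays']
```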

Custom Priors (`custom_priors`)

This section allows you to override default Bayesian priors. Use this if you have strong prior beliefs based on domain knowledge or previous studies. See Prior Selection Guidelines below and the Configuration Reference.

Performance Guidelines

Adjust sampler settings based on your dataset size:

Small (< 100 rows): tune: 1000, draws: 1000, chains: 2, target_accept: 0.8
Medium (100-500 rows): tune: 2000, draws: 2000, chains: 4, target_accept: 0.95
Large (> 500 rows): tune: 4000, draws: 4000, chains: 4, target_accept: 0.95

Be mindful of memory usage when increasing draws and chains: the number of stored posterior samples grows with draws × chains.
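
The size tiers above translate directly into a small lookup. The function name `suggest_sampler_settings` is hypothetical; the values are the ones listed in this section, with exactly 100 and exactly 500 rows both treated as Medium.

```python
def suggest_sampler_settings(n_rows):
    """Return the guideline sampler settings for a dataset with n_rows rows."""
    if n_rows < 100:  # Small
        return {"tune": 1000, "draws": 1000, "chains": 2, "target_accept": 0.8}
    if n_rows <= 500:  # Medium
        return {"tune": 2000, "draws": 2000, "chains": 4, "target_accept": 0.95}
    # Large
    return {"tune": 4000, "draws": 4000, "chains": 4, "target_accept": 0.95}


settings = suggest_sampler_settings(171)
# Total retained samples, a rough proxy for memory use.
posterior_draws = settings["draws"] * settings["chains"]
print(settings["tune"], posterior_draws)  # prints 2000 8000
```

These are starting points rather than rules; convergence diagnostics should decide whether more tuning or more draws are needed.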

Prior Selection Guidelines

intercept: Often LogNormal if target > 0, Normal otherwise. Scale based on target magnitude.
beta_channel: HalfNormal is common (assumes positive effect). Adjust sigma based on expected effect range.
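alpha (Adstock Decay; adstock_alpha in the example configuration): Beta distribution (constrains values to the 0-1 range). In the usual geometric adstock form, where alpha is the fraction of the effect carried over to the next period, higher alpha means slower decay and a longer carry-over.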
lam (Saturation): Gamma or HalfNormal. Controls how quickly saturation occurs.
likelihood['kwargs']['sigma']: HalfNormal is common. Represents the standard deviation of the unexplained noise.
gamma_control: Often Normal or Laplace. Adjust scale based on expected impact size.

Note: Seasonality is handled by Prophet integration. Configure through the prophet: section rather than custom priors.
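
One quick way to sanity-check a prior is to compute the mean it implies. The sketch below does this for the Beta and Gamma specifications used in the example configuration, assuming PyMC's parameterization (Gamma with shape alpha and rate beta). The helper `prior_mean` is hypothetical and covers only these two distributions.

```python
def prior_mean(spec):
    """Analytical mean of a prior given as {'dist': ..., 'kwargs': {...}}.

    Assumes PyMC-style parameters: Beta(alpha, beta), Gamma(alpha=shape, beta=rate).
    """
    dist, kwargs = spec["dist"], spec["kwargs"]
    if dist == "Beta":
        return kwargs["alpha"] / (kwargs["alpha"] + kwargs["beta"])
    if dist == "Gamma":
        return kwargs["alpha"] / kwargs["beta"]
    raise ValueError(f"no closed-form mean implemented for {dist}")


# Priors as they appear in the example configuration.
adstock_alpha = {"dist": "Beta", "kwargs": {"alpha": 1, "beta": 3}}
saturation_lam = {"dist": "Gamma", "kwargs": {"alpha": 3, "beta": 1}}

print(prior_mean(adstock_alpha), prior_mean(saturation_lam))  # prints 0.25 3.0
```

If an implied mean looks implausible for your market, that is a signal to adjust the kwargs before sampling.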

Troubleshooting

Convergence Issues: Increase tune/draws, adjust target_accept, check data quality, review priors.
Memory Errors: Reduce draws/chains.
Poor Fit: Check data quality, add relevant control_columns, adjust priors.

Example Configurations

Basic Weekly Model

run_mode: "scenario_planning"

data_rows:
  total: 171
  start_date: 2019-07-28
  end_date: 2022-10-30

raw_data_granularity: weekly
train_test_ratio: 1.0

transformations:
  scale_controls: true

ignore_cols: 
  - "other_events"

date_col: "date"
target_col: "subscribers"
target_type: "revenue"

extra_features_cols:
  - "covid_index"
  - "price"
  - "promo_events"
  - "competitor_spend"

extra_features_impact:
  "competitor_spend": "negative"

media:
  - display_name: "Media Channel 1 (BVOD)"
    impressions_col: media_imp_1
    spend_col: media_cost_1

  - display_name: "Media Channel 2 (Performance)"
    impressions_col: media_imp_2
    spend_col: media_cost_2

  - display_name: "Media Channel 3 (Display)"
    impressions_col: media_imp_3
    spend_col: media_cost_3

  - display_name: "Media Channel 4 (Social)"
    impressions_col: media_imp_4
    spend_col: media_cost_4

  - display_name: "Media Channel 5 (Digital Audio)"
    impressions_col: media_imp_5
    spend_col: media_cost_5

  - display_name: "Media Channel 6 (Linear Radio)"
    impressions_col: media_imp_6
    spend_col: media_cost_6

tune: 2000
draws: 4000
chains: 4
adstock_max_lag: 12
target_accept: 0.95

prophet:
  include_holidays: true
  holiday_country: 'US'
  yearly_seasonality: true
  trend: true

seed: 42

custom_priors:
  intercept:
    dist: LogNormal
    kwargs:
      mu: 0
      sigma: 5
  beta_channel:
    dist: HalfNormal
    kwargs:
      sigma: 1
  saturation_beta:
    dist: HalfStudentT
    kwargs:
      nu: 3
      sigma: 2
  adstock_alpha:
    dist: Beta
    kwargs:
      alpha: 1
      beta: 3
  saturation_lam:
    dist: Gamma
    kwargs:
      alpha: 3
      beta: 1
  likelihood:
    dist: Normal
    kwargs:
      sigma:
        dist: HalfStudentT
        kwargs:
          nu: 3
          sigma: 2
  gamma_control:
    dist: HalfStudentT
    kwargs:
      nu: 3
      sigma: 1
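
As a consistency check on the data_rows values in this example, the sketch below counts the rows implied by the date range and granularity. It assumes one row per period with both endpoints included; the helper `count_periods` is hypothetical.

```python
from datetime import date


def count_periods(start_date, end_date, granularity):
    """Count rows between two dates, endpoints included (one row per period)."""
    step_days = 7 if granularity == "weekly" else 1
    return (end_date - start_date).days // step_days + 1


n_rows = count_periods(date(2019, 7, 28), date(2022, 10, 30), "weekly")
print(n_rows)  # prints 171
```

The result matches the total of 171 declared under data_rows, which is a useful check to repeat whenever the date range changes.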