Configuration Reference

This document provides a detailed reference for all parameters available in the MMM config.yaml file.

Version: 2.5.0

File Structure Overview

# Data handling
raw_data_granularity: weekly # or daily
train_test_ratio: 0.9 # optional, default 0.9
data_rows: # optional
  total: 156 # optional
  start_date: '2021-01-01' # optional
  end_date: '2023-12-31' # optional

# Column mapping
ignore_cols: # optional
  - unused_column_1
date_col: date # optional, default 'date'
target_col: sales # required
target_type: revenue # optional, default 'revenue'
control_columns: # optional
  - competitor_spend
  - is_promo_week
media: # required, list of media channels
  - display_name: TV
    impressions_col: tv_impressions
    spend_col: tv_spend
  - display_name: Search
    impressions_col: search_clicks
    spend_col: search_cost

# Prophet Integration (Required)
prophet:
  include_holidays: true
  holiday_country: 'US'
  yearly_seasonality: true
  trend: true

# Model/Sampler Parameters
tune: 2000 # optional, default 2000
draws: 2000 # optional, default 2000
chains: 4 # optional, default 4
adstock_max_lag: 4 # optional, default 4
target_accept: 0.95 # optional, default 0.95
seed: 42 # optional

# Custom Priors (Optional Section)
custom_priors:
  intercept:
    dist: Normal
    kwargs: { mu: 0, sigma: 2 }
  beta_channel:
    dist: HalfNormal
    kwargs: { sigma: 2 }
  alpha: # Adstock decay rate
    dist: Beta
    kwargs: { alpha: 1, beta: 3 }
  lam: # Saturation parameter
    dist: Gamma
    kwargs: { alpha: 3, beta: 1 }
  likelihood:
    dist: Normal
    kwargs:
      sigma:
        dist: HalfNormal
        kwargs: { sigma: 2 }
  gamma_control: # Control variable coefficients
    dist: Normal
    kwargs: { mu: 0, sigma: 2 }

Parameter Details

Data Handling Options

raw_data_granularity

  • Description: The time frequency of the input data rows.

  • Type: string

  • Allowed Values: "daily", "weekly"

  • Required: Yes

train_test_ratio

  • Description: Proportion of the data (ordered by date) to use for training. The rest is used for out-of-sample testing. Ignored if data_rows with start_date/end_date is used.

  • Type: float

  • Range: 0.5 to 1.0

  • Default: 0.9

  • Required: No
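
As an illustration of the split described above, here is a minimal sketch. It assumes the rows are already sorted by date and that the cutoff index is rounded down; the function name split_train_test is hypothetical and not part of the package.

```python
def split_train_test(rows, train_test_ratio=0.9):
    """Split date-ordered rows into (train, test) by position."""
    if not 0.5 <= train_test_ratio <= 1.0:
        raise ValueError("train_test_ratio must be between 0.5 and 1.0")
    # Assumption: rows are already sorted by date, so the test set
    # is always the most recent slice. No shuffling takes place.
    cutoff = int(len(rows) * train_test_ratio)  # rounding down is an assumption
    return rows[:cutoff], rows[cutoff:]


weeks = [f"week_{i}" for i in range(156)]
train, test = split_train_test(weeks, 0.9)
print(len(train), len(test))  # 140 16
```

With three years of weekly data (156 rows) and the default ratio, this leaves 16 weeks for out-of-sample testing.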

data_rows

  • Description: Defines the specific subset of data to use based on row count or dates. Overrides train_test_ratio if present.

  • Type: object

  • Required: No

  • Properties:

    • total (Optional int): Total number of rows to use from the start of the dataset.

    • start_date (Optional string): Start date (inclusive, format YYYY-MM-DD).

    • end_date (Optional string): End date (inclusive, format YYYY-MM-DD).
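
The following sketch shows one way the three properties could combine. The order of operations (date filter first, then the row cap) is an assumption, as are the function name select_rows and the row format (a list of dicts with an ISO date string).

```python
from datetime import date


def select_rows(rows, data_rows=None, date_key="date"):
    """Filter date-ordered rows using an optional data_rows block."""
    if not data_rows:
        return list(rows)
    start = data_rows.get("start_date")
    end = data_rows.get("end_date")
    start = date.fromisoformat(start) if start else None
    end = date.fromisoformat(end) if end else None

    selected = []
    for row in rows:
        day = date.fromisoformat(row[date_key])
        # Both bounds are inclusive, as documented above.
        if (start is None or day >= start) and (end is None or day <= end):
            selected.append(row)

    total = data_rows.get("total")
    if total is not None:
        # Assumption: the cap is applied after the date filter.
        selected = selected[:total]
    return selected


rows = [{"date": d} for d in ("2021-01-01", "2021-01-08", "2021-01-15", "2021-01-22")]
subset = select_rows(rows, {"start_date": "2021-01-08", "end_date": "2021-01-22", "total": 2})
print([r["date"] for r in subset])
```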

Column Definitions

ignore_cols

  • Description: List of column names from the input CSV to exclude from processing.

  • Type: list of string

  • Required: No

date_col

  • Description: Name of the column containing dates. Dates must be in YYYY-MM-DD format.

  • Type: string

  • Default: "date"

  • Required: No

target_col

  • Description: Name of the column representing the target variable (e.g., sales, conversions).

  • Type: string

  • Required: Yes

target_type

  • Description: Specifies the nature of the target variable, influencing naming of key performance metrics.

  • Type: string

  • Allowed Values: "revenue", "conversion"

  • Default: "revenue"

  • Required: No

  • Impact:

    • If "revenue": Outputs use media_performance_roi.csv and media_performance_cost_per_revenue_unit.csv

    • If "conversion": Outputs use media_performance_conversion_efficiency.csv and media_performance_cpa.csv
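
The mapping above can be expressed as a small lookup. This is only a sketch: the file names come from this reference, while the function name metric_files and the dict-based config are illustrative assumptions.

```python
# File names as listed in this reference, keyed by target_type.
OUTPUT_FILES = {
    "revenue": (
        "media_performance_roi.csv",
        "media_performance_cost_per_revenue_unit.csv",
    ),
    "conversion": (
        "media_performance_conversion_efficiency.csv",
        "media_performance_cpa.csv",
    ),
}


def metric_files(config):
    """Return the performance-metric file names for a parsed config."""
    target_type = config.get("target_type", "revenue")  # documented default
    if target_type not in OUTPUT_FILES:
        raise ValueError(f"target_type must be one of {sorted(OUTPUT_FILES)}")
    return OUTPUT_FILES[target_type]


print(metric_files({"target_col": "sales"}))
print(metric_files({"target_col": "signups", "target_type": "conversion"}))
```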

control_columns

  • Description: List of column names representing control variables or external factors (e.g., competitor activity, promotions). These are included as linear regressors. Columns must be numeric.

  • Type: list of string

  • Required: No

media

  • Description: List defining each media channel to be included in the model.

  • Type: list of object

  • Required: Yes

  • Object Properties:

    • display_name (string, required): Name used for the channel in reports and plots.

    • impressions_col (string, required): Column name containing the volume metric for the channel (e.g., impressions, clicks, GRPs).

    • spend_col (string, required): Column name containing the cost/spend data for the channel.

    • learned_prior (float, optional): Custom prior value for this channel’s effectiveness. Overrides the default cost-based prior when specified.
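
Before running a model it can help to check the media list against the columns of the input CSV. The sketch below is a hypothetical pre-flight check (check_media is not a function of the package); it only uses the three required properties documented above.

```python
REQUIRED_MEDIA_KEYS = ("display_name", "impressions_col", "spend_col")


def check_media(media, available_columns):
    """Return a list of human-readable problems found in the media list."""
    problems = []
    for i, channel in enumerate(media):
        for key in REQUIRED_MEDIA_KEYS:
            if key not in channel:
                problems.append(f"media[{i}] is missing '{key}'")
        for key in ("impressions_col", "spend_col"):
            col = channel.get(key)
            if col is not None and col not in available_columns:
                problems.append(f"media[{i}].{key} refers to unknown column '{col}'")
    return problems


media = [
    {"display_name": "TV", "impressions_col": "tv_impressions", "spend_col": "tv_spend"},
    {"display_name": "Search", "impressions_col": "search_clicks"},
]
csv_columns = {"date", "sales", "tv_impressions", "tv_spend", "search_cost"}
for problem in check_media(media, csv_columns):
    print(problem)
```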

Prophet Integration

prophet

  • Description: Configuration for Prophet seasonality and trend decomposition. Prophet integration is required.

  • Type: object

  • Required: Yes

  • Properties:

    • include_holidays (bool): Whether to include holidays in the model.

    • holiday_country (string): Country code for holiday calendar (e.g., 'US', 'GB').

    • yearly_seasonality (bool): Enable yearly seasonality component.

    • weekly_seasonality (bool): Enable weekly seasonality component.

    • daily_seasonality (bool): Enable daily seasonality component.

    • trend (bool): Enable trend component.

Model/Sampler Parameters

tune

  • Description: Number of tuning (burn-in) steps for the MCMC sampler.

  • Type: int

  • Default: 2000

  • Required: No

draws

  • Description: Number of sampling steps to perform after tuning for each chain.

  • Type: int

  • Default: 2000

  • Required: No

chains

  • Description: Number of independent MCMC chains to run. Multiple chains are essential for assessing convergence.

  • Type: int

  • Default: 4

  • Required: No

adstock_max_lag

  • Description: Maximum number of time periods over which advertising effects can carry over.

  • Type: int

  • Range: 1 to 52 (a warning is issued for values above 26)

  • Default: 4

  • Required: No
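
To make the effect of the lag window concrete, here is a sketch of a geometric adstock. This reference does not specify the exact transformation, so the geometric form, the unnormalized weights, and the convention that the window covers lags 0 to adstock_max_lag - 1 are all assumptions. The decay rate alpha corresponds to the alpha prior described under Custom Priors.

```python
def geometric_adstock(x, alpha, max_lag=4):
    """Spread each period's value over the following periods with decay alpha."""
    # Assumption: weights are alpha**0 .. alpha**(max_lag - 1), not normalized.
    weights = [alpha ** lag for lag in range(max_lag)]
    carried = []
    for t in range(len(x)):
        carried.append(
            sum(w * x[t - lag] for lag, w in enumerate(weights) if t - lag >= 0)
        )
    return carried


# A single burst of 100 in the first period, then nothing.
spend = [100, 0, 0, 0, 0, 0]
print(geometric_adstock(spend, alpha=0.5, max_lag=4))
# [100.0, 50.0, 25.0, 12.5, 0.0, 0.0]
```

Under these assumptions the burst stops contributing once it falls outside the four-period window, which is why a larger adstock_max_lag is needed for channels with long-lasting effects.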

target_accept

  • Description: Target acceptance rate for the NUTS sampler. Controls step-size adaptation: higher values produce smaller steps, which reduces divergences at the cost of slower sampling.

  • Type: float

  • Range: 0.6 to 0.99

  • Default: 0.95

  • Required: No

seed

  • Description: Integer seed for the random number generator, ensuring reproducibility.

  • Type: int

  • Required: No
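
The sampler options above can be collected into keyword arguments as sketched below. The sketch assumes they are ultimately forwarded to PyMC's pm.sample, which accepts draws, tune, chains, and random_seed and passes target_accept on to NUTS; whether the package does exactly this is not stated in this reference. The helper name sampler_kwargs is hypothetical.

```python
# Defaults as documented in this reference.
SAMPLER_DEFAULTS = {"tune": 2000, "draws": 2000, "chains": 4, "target_accept": 0.95}


def sampler_kwargs(config):
    """Build sampler keyword arguments from a parsed config dict."""
    kwargs = {key: config.get(key, default) for key, default in SAMPLER_DEFAULTS.items()}
    if not 0.6 <= kwargs["target_accept"] <= 0.99:
        raise ValueError("target_accept must be between 0.6 and 0.99")
    if "seed" in config:
        # Assumption: the config's seed maps to pm.sample's random_seed.
        kwargs["random_seed"] = config["seed"]
    return kwargs


print(sampler_kwargs({"chains": 2, "seed": 42}))
# The result could then be used as: pm.sample(**sampler_kwargs(config))
```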

Custom Priors

This section is optional. If omitted, default priors are used.

Description: Allows overriding default prior distributions for model parameters.

Type: object

Required: No

Properties (Parameter Groups):

  • intercept: Prior for the base intercept term.

  • beta_channel: Prior for the effectiveness coefficients of media channels.

  • alpha: Prior for the decay rate parameter in the adstock transformation (typically Beta distribution).

  • lam (or saturation_beta): Prior for the saturation parameter in the logistic saturation function (typically Gamma or HalfNormal).

  • likelihood: Defines the observation distribution and its parameters (e.g., sigma for Normal likelihood).

  • gamma_control: Prior for the coefficients of control variables.
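
To give a sense of what the lam prior controls, the sketch below uses one common form of the logistic saturation function, (1 - exp(-lam * x)) / (1 + exp(-lam * x)). This reference does not spell out the formula, so treat the exact form as an assumption.

```python
import math


def logistic_saturation(x, lam):
    """Map a non-negative (scaled) media input to the range [0, 1)."""
    return (1 - math.exp(-lam * x)) / (1 + math.exp(-lam * x))


# Larger lam means the channel saturates at lower input levels.
for lam in (1.0, 3.0):
    curve = [round(logistic_saturation(x / 4, lam), 3) for x in range(5)]
    print(f"lam={lam}: {curve}")
```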

Structure:

  • dist (string, required): Name of the PyMC distribution (e.g., "Normal", "HalfNormal", "Beta", "Gamma", "Laplace", "LogNormal").

  • kwargs (object, required): Dictionary of keyword arguments for the distribution.
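
Because a kwargs value may itself be a dist/kwargs block (as with likelihood.sigma in the overview), a prior specification is naturally processed recursively. The following sketch walks a specification and lists every distribution it names; the function collect_distributions is illustrative and not part of the package.

```python
def collect_distributions(spec, path):
    """Return (path, dist_name) pairs for a prior spec, including nested specs."""
    if "dist" not in spec or "kwargs" not in spec:
        raise ValueError(f"{path}: both 'dist' and 'kwargs' are required")
    found = [(path, spec["dist"])]
    for name, value in spec["kwargs"].items():
        if isinstance(value, dict):
            # A dict-valued argument is treated as a nested prior.
            found.extend(collect_distributions(value, f"{path}.{name}"))
    return found


likelihood = {
    "dist": "Normal",
    "kwargs": {"sigma": {"dist": "HalfNormal", "kwargs": {"sigma": 2}}},
}
print(collect_distributions(likelihood, "likelihood"))
# [('likelihood', 'Normal'), ('likelihood.sigma', 'HalfNormal')]
```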

Parameter Aliases

The configuration system supports user-friendly parameter aliases that map to internal mathematical parameter names:

User-Friendly Name      Internal Name       Description
----------------------  ------------------  -----------------------------------------------------
saturation_beta         lam                 Saturation parameter for logistic saturation function
adstock_alpha           alpha               Decay rate parameter for adstock transformation
media_coefficients      beta_channel        Media channel effectiveness coefficients
control_coefficients    gamma_control       Control variable coefficients
error_sigma             likelihood.sigma    Error term standard deviation
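
A minimal sketch of how top-level aliases could be resolved before the priors are used. The function resolve_aliases is hypothetical, and error_sigma is left out because it maps to a nested key (likelihood.sigma) whose handling is not described here.

```python
# Top-level aliases from the table in this section.
ALIASES = {
    "saturation_beta": "lam",
    "adstock_alpha": "alpha",
    "media_coefficients": "beta_channel",
    "control_coefficients": "gamma_control",
}


def resolve_aliases(custom_priors):
    """Return a copy of custom_priors keyed by internal parameter names."""
    resolved = {}
    for name, spec in custom_priors.items():
        internal = ALIASES.get(name, name)
        if internal in resolved:
            # Guard against giving both an alias and its internal name.
            raise ValueError(f"'{internal}' is specified more than once")
        resolved[internal] = spec
    return resolved


priors = {
    "saturation_beta": {"dist": "Gamma", "kwargs": {"alpha": 3, "beta": 1}},
    "intercept": {"dist": "Normal", "kwargs": {"mu": 0, "sigma": 2}},
}
print(sorted(resolve_aliases(priors)))
# ['intercept', 'lam']
```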

Configuration Validation

The system includes automatic validation of configuration parameters:

  1. Parameter Name Validation: Checks that all parameter names are valid or have valid aliases

  2. Typo Detection: Suggests corrections for misspelled parameter names using fuzzy matching

  3. Distribution Validation: Verifies that specified distributions exist in PyMC

  4. Fail-Fast Behaviour: Invalid configurations raise an error immediately, before any sampling starts

Common Validation Errors:

Invalid Parameter Name:

ValueError: Invalid parameter 'saturaton_beta'. Did you mean 'saturation_beta'?

Invalid Distribution:

ValueError: Invalid distribution 'Norml'. Did you mean 'Normal'?

Missing Required Arguments:

ValueError: Distribution 'Beta' requires arguments: alpha, beta
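
Suggestions of this kind can be produced with fuzzy string matching. The sketch below uses difflib from the Python standard library to reproduce the first two messages; the package's actual matching logic is not documented here, so the helper check_name and the name lists are assumptions for illustration.

```python
import difflib

VALID_PARAMETERS = [
    "intercept", "beta_channel", "alpha", "lam", "likelihood", "gamma_control",
    "saturation_beta", "adstock_alpha", "media_coefficients",
    "control_coefficients", "error_sigma",
]
# Only the distributions named in this reference; PyMC offers many more.
KNOWN_DISTRIBUTIONS = ["Normal", "HalfNormal", "Beta", "Gamma", "Laplace", "LogNormal"]


def check_name(name, valid, kind):
    """Return name if valid; otherwise raise with a suggestion when one is close."""
    if name in valid:
        return name
    matches = difflib.get_close_matches(name, valid, n=1)
    hint = f" Did you mean '{matches[0]}'?" if matches else ""
    raise ValueError(f"Invalid {kind} '{name}'.{hint}")


for name, valid, kind in [
    ("saturaton_beta", VALID_PARAMETERS, "parameter"),
    ("Norml", KNOWN_DISTRIBUTIONS, "distribution"),
]:
    try:
        check_name(name, valid, kind)
    except ValueError as err:
        print(f"ValueError: {err}")
```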