Explanation: Core MMM Methodology¶
Version: 2.5.0
Marketing Mix Modelling (MMM) quantifies the impact of marketing activities on sales or other KPIs. AMMM provides a Bayesian framework for building flexible, interpretable marketing mix models.
Core Principles¶
Bayesian Approach: AMMM uses PyMC to treat model parameters as probability distributions, providing uncertainty quantification rather than point estimates. This yields credible intervals (HDIs) and enables incorporation of prior knowledge.
Flexibility: Customizable media transformations (adstock, saturation), control variables, and prior distributions allow models to reflect business reality.
Interpretability: Model outputs include channel-specific coefficients, response curves, ROI estimates, and contribution decomposition.
Actionability: Results directly inform budget optimization and scenario planning.
Bayesian Framework¶
Key Concepts¶
Parameters as Distributions: Model parameters (channel effectiveness, adstock rates, saturation points) are probability distributions reflecting uncertainty about their true values.
Prior Distributions: Express initial beliefs about parameters before observing data. Common choices:
HalfNormal: For positive-only parameters (effectiveness coefficients, error terms)
Beta: For parameters between 0-1 (adstock retention rates)
Gamma: For positive parameters with specific shapes (saturation parameters)
Normal: For unconstrained parameters
Likelihood Function: Quantifies how probable the observed data is given parameter values.
Posterior Distributions: Updated beliefs about parameters after observing data, obtained via Bayes’ Theorem:
P(parameter | Data) ∝ P(Data | parameter) × P(parameter)
MCMC Sampling: PyMC uses NUTS (No-U-Turn Sampler) to draw samples from posterior distributions, which are typically too complex for analytical solutions.
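The proportionality in Bayes' Theorem can be illustrated with a minimal grid approximation for a single parameter constrained to 0-1. This is only a sketch: the toy data (7 successes in 10 trials) and the Beta(2, 2) prior are assumptions chosen for illustration, and AMMM itself relies on NUTS sampling in PyMC rather than a grid.

```python
import numpy as np

# Hypothetical toy data: 7 "successes" in 10 trials (not taken from AMMM).
successes, trials = 7, 10

# Grid of candidate parameter values in (0, 1).
theta = np.linspace(0.001, 0.999, 999)

# Prior P(parameter): Beta(2, 2), written as an unnormalised density.
prior = theta * (1.0 - theta)

# Likelihood P(Data | parameter): binomial kernel, constants dropped.
likelihood = theta**successes * (1.0 - theta) ** (trials - successes)

# Posterior is proportional to likelihood x prior; normalise over the grid.
unnormalised = likelihood * prior
posterior = unnormalised / unnormalised.sum()

posterior_mean = float((theta * posterior).sum())
print(f"posterior mean ~ {posterior_mean:.3f}")
```

For this conjugate toy case the exact posterior is Beta(9, 5), so the grid result can be checked against its mean of 9/14. Realistic MMM posteriors have many correlated parameters, which is why sampling is used instead.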
Benefits¶
Credible intervals (HDIs) for all parameters
Full posterior distributions for detailed inspection
Direct probability statements about parameters
Formal incorporation of prior knowledge
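As a rough illustration of the first and third benefits, the sketch below computes a highest-density interval and a direct probability statement from a set of draws. The draws are simulated stand-ins for posterior samples, the `hdi` helper is a hypothetical name, and the 94% level is an assumption (it matches ArviZ's default). The narrowest-window approach shown here is only appropriate for unimodal posteriors.

```python
import numpy as np

def hdi(samples, prob=0.94):
    """Narrowest interval containing `prob` of the samples (unimodal case)."""
    x = np.sort(np.asarray(samples, dtype=float))
    n = x.size
    k = int(np.ceil(prob * n))            # number of points inside the interval
    widths = x[k - 1:] - x[: n - k + 1]   # width of every window of k points
    i = int(np.argmin(widths))            # index of the narrowest window
    return x[i], x[i + k - 1]

# Simulated stand-in for posterior draws of one coefficient (assumption).
rng = np.random.default_rng(0)
draws = rng.normal(loc=1.0, scale=0.5, size=20_000)

low, high = hdi(draws)
prob_positive = float((draws > 0).mean())  # direct statement: P(coefficient > 0)
print(f"94% HDI: [{low:.2f}, {high:.2f}], P(coef > 0) = {prob_positive:.3f}")
```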
Core Model Equation¶
The typical AMMM model structure:
y_t = baseline_t + Σ[β_m · saturation(adstock(x_m,t))] + Σ[γ_c · z_c,t] + ε_t
Components¶
Target Variable (y_t): The outcome being modelled (sales, revenue, conversions).
Baseline (baseline_t): Expected value when marketing spend is zero:
Intercept (α): Fundamental base level
Trend: Long-term patterns
Seasonality: Handled by Prophet integration (required)
Yearly, weekly, daily patterns
Holiday effects
Prophet uses Fourier decomposition internally
Media Inputs (x_m,t): Raw marketing effort (spend, impressions, GRPs).
Media Transformations:
Adstock (Carry-over Effects): Models lagged advertising impact
Geometric adstock:
adstocked_t = (1-θ) × input_t + θ × adstocked_{t-1}
Parameter θ (retention rate): 0-1, typically has Beta prior
adstock_max_lag: Maximum periods for carry-over calculation
Saturation (Diminishing Returns): Models non-linear response
Michaelis-Menten: Hyperbolic response curve
Logistic: S-shaped response curve
Parameters control shape and steepness, typically Gamma or HalfNormal priors
Channel Effectiveness (β_m): Scales transformed media input. Represents marginal impact on target. Typically HalfNormal prior (positive effect).
Control Variables (z_c,t): External factors (promotions, competitor activity, economic indicators). Coefficients γ_c have Normal or HalfNormal priors.
Error Term (ε_t): Random variation not explained by model. Typically Normal(0, σ_error) with HalfNormal prior on σ.
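The components above can be sketched in NumPy for a single channel and a single control variable. This is an illustrative assumption-laden example, not AMMM's implementation: the function names and all numeric values are hypothetical, the adstock recursion starts from zero and does not apply `adstock_max_lag` truncation, the Michaelis-Menten curve uses a unit-ceiling form x / (λ + x) so that β_m carries the scale, and the error term ε_t is omitted so only the expected value is computed.

```python
import numpy as np

def geometric_adstock(x, theta):
    """adstocked_t = (1 - theta) * x_t + theta * adstocked_{t-1}, starting from 0."""
    out = np.empty(len(x), dtype=float)
    carry = 0.0
    for t, value in enumerate(x):
        carry = (1.0 - theta) * value + theta * carry
        out[t] = carry
    return out

def michaelis_menten(x, lam):
    """Hyperbolic saturation x / (lam + x); lam is the half-saturation point."""
    return x / (lam + x)

# Hypothetical inputs for five periods.
spend = np.array([100.0, 0.0, 0.0, 50.0, 0.0])  # x_{m,t}: media input
promo = np.array([0.0, 1.0, 0.0, 0.0, 1.0])     # z_{c,t}: control variable
baseline = 200.0                                # constant baseline_t
beta, gamma = 2.0, 15.0                         # beta_m and gamma_c

adstocked = geometric_adstock(spend, theta=0.5)
saturated = michaelis_menten(adstocked, lam=50.0)

# Expected y_t = baseline_t + beta_m * saturation(adstock(x)) + gamma_c * z
y_hat = baseline + beta * saturated + gamma * promo
print(adstocked)
print(y_hat.round(2))
```

Note how the spend of 100 in the first period still contributes in later periods through the carry-over, and how doubling the adstocked input less than doubles the saturated value.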
Prior Specification¶
Priors guide model estimation and reflect domain knowledge:
Channel Effectiveness (β_m):
HalfNormal: For positive effects
Inform based on past studies, domain expertise, or lift tests
Adstock (θ_m):
Beta distribution (0-1 constraint)
Digital channels: Lower values (shorter memory)
Traditional media: Higher values (longer memory)
Saturation (λ_m):
Gamma or HalfNormal
Inform using lift test results when available
Otherwise use weakly informative priors
Best Practices:
Start with weakly informative priors
Visualize priors before fitting
Test sensitivity to prior choices
Document reasoning
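One lightweight way to inspect priors before fitting is to draw from them and look at their quantiles. The sketch below does this with NumPy only; the hyperparameters (HalfNormal sigma of 1, Beta(2, 5), Gamma with shape 3 and scale 1) are illustrative assumptions, not AMMM defaults, and the printed quantiles stand in for the histograms one would normally plot.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 50_000

# Hypothetical prior choices for one channel (assumptions, not AMMM defaults).
prior_draws = {
    "beta_m": np.abs(rng.normal(0.0, 1.0, n)),             # HalfNormal(sigma=1)
    "theta_m": rng.beta(2.0, 5.0, n),                      # Beta(2, 5)
    "lambda_m": rng.gamma(shape=3.0, scale=1.0, size=n),   # Gamma(shape=3, scale=1)
}

# 5%, 50% and 95% quantiles show what each prior treats as plausible.
summary = {
    name: np.percentile(draws, [5, 50, 95]) for name, draws in prior_draws.items()
}
for name, (q05, q50, q95) in summary.items():
    print(f"{name}: 5%={q05:.2f}  median={q50:.2f}  95%={q95:.2f}")
```

If the implied range looks implausible for the business (for example, a retention rate prior that puts most mass above 0.9 for a short-lived digital channel), adjust the hyperparameters and repeat. Rerunning the fit under a few alternative priors is a simple sensitivity test.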
Model Fitting¶
MCMC Process:
Initialize parameter values
Run multiple independent chains (typically 4)
Tuning phase: Sampler adapts (typically 2000 iterations)
Sampling phase: Collect posterior samples (typically 2000+ draws)
Key Parameters:
draws: Posterior samples per chain
tune: Tuning iterations per chain
chains: Number of independent chains
target_accept: Target acceptance rate for step-size adaptation (0.8-0.99)
Convergence Diagnostics:
R-hat: Should be ≈1.0 (< 1.01 ideal)
Compares within-chain vs between-chain variance
Values > 1.05 indicate non-convergence
Effective Sample Size (ESS): Should exceed roughly 100 per chain (about 400 in total with 4 chains)
Accounts for autocorrelation
Low ESS indicates inefficient sampling
Trace Plots: Should show “fuzzy caterpillar”
Horizontal band (stationarity)
Good mixing between chains
No trends or patterns
Divergences: Should be zero or minimal
Indicate sampler instability
Fix by increasing target_accept
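To show what "within-chain vs between-chain variance" means in practice, here is a minimal sketch of the classic Gelman-Rubin R-hat. It is a simplification: diagnostics libraries such as ArviZ report a rank-normalized split variant, so values will differ slightly. The function name and the simulated chains are assumptions for illustration.

```python
import numpy as np

def gelman_rubin_rhat(chains):
    """Classic R-hat for an array of shape (n_chains, n_draws)."""
    chains = np.asarray(chains, dtype=float)
    n = chains.shape[1]
    chain_means = chains.mean(axis=1)
    within = chains.var(axis=1, ddof=1).mean()    # W: average within-chain variance
    between = n * chain_means.var(ddof=1)         # B: between-chain variance
    pooled = (n - 1) / n * within + between / n   # pooled variance estimate
    return float(np.sqrt(pooled / within))

rng = np.random.default_rng(1)

# Four well-mixed chains sampling the same distribution.
mixed = rng.normal(0.0, 1.0, size=(4, 2000))

# Same chains, but the last one is stuck in a different region.
stuck = mixed + np.array([[0.0], [0.0], [0.0], [3.0]])

rhat_mixed = gelman_rubin_rhat(mixed)
rhat_stuck = gelman_rubin_rhat(stuck)
print(f"mixed: {rhat_mixed:.3f}, stuck: {rhat_stuck:.3f}")
```

When all chains explore the same region, the pooled and within-chain variances agree and R-hat sits near 1.0. A single offset chain inflates the between-chain term and pushes R-hat well above the 1.05 threshold.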
Model Outputs¶
Channel Coefficients (β_m):
Magnitude indicates effectiveness
HDI indicates uncertainty
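If the HDI excludes zero, there is strong evidence of an effect (for coefficients with a HalfNormal prior, which cannot be negative, look for an HDI whose lower bound sits clearly above zero)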
Response Curves:
Visualize diminishing returns
Identify optimal spend levels
Compare channel dynamics
ROI Metrics:
Overall ROI: Total contribution / total spend
Marginal ROI (mROI): Return on next dollar spent
Use mROI for budget allocation decisions
Contribution Analysis:
Decomposes target into components
Shows baseline vs marketing impact
Tracks contribution over time
Budget Optimization:
Allocates budget to maximize returns
Uses response curves and mROI
Supports constraints on channel spend
Scenario Planning:
Tests “what-if” scenarios
Predicts outcomes under different budgets
Provides uncertainty intervals
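The difference between overall ROI and marginal ROI can be seen on a single saturating response curve. The sketch below assumes a hypothetical Michaelis-Menten style curve with made-up parameters; in practice these quantities would be computed from the fitted posterior, with uncertainty, rather than from fixed values.

```python
def contribution(spend, beta=5000.0, lam=2000.0):
    """Hypothetical saturating response: beta * spend / (lam + spend)."""
    return beta * spend / (lam + spend)

def roi(spend):
    """Overall ROI: total contribution divided by total spend."""
    return contribution(spend) / spend

def marginal_roi(spend, step=1.0):
    """Marginal ROI: extra contribution from the next `step` units of spend."""
    return (contribution(spend + step) - contribution(spend)) / step

current_spend = 2000.0
overall = roi(current_spend)
marginal = marginal_roi(current_spend)
print(f"ROI = {overall:.3f}, mROI = {marginal:.3f}")
```

In this example the channel looks profitable on average (ROI above 1) while the next dollar returns less than a dollar (mROI below 1), which is why allocation decisions are based on mROI rather than overall ROI.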
Prophet Integration¶
Prophet handles all seasonality and trend decomposition (required component):
Configuration:
yearly_seasonality: Annual patterns (all data frequencies)
weekly_seasonality: Day-of-week patterns (for daily or sub-daily data)
daily_seasonality: Hour-of-day patterns (for sub-daily/intra-day data)
trend: Long-term trends
include_holidays: Holiday effects
Data Frequency Guidance:
Weekly data: Use yearly_seasonality=True, set weekly_seasonality=False and daily_seasonality=False
Daily data: Use yearly_seasonality=True and weekly_seasonality=True, set daily_seasonality=False
Intra-day data: Use all three seasonality parameters as needed
Prophet components are automatically added to the model as control variables.
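The frequency guidance can be captured in a small lookup. The helper below is hypothetical: only the three seasonality option names come from this page, and how the resulting dictionary is passed to AMMM's configuration is not shown here. For intra-day data all three flags are switched on as a starting point, to be turned off as needed.

```python
def seasonality_flags(frequency):
    """Return seasonality switches for 'weekly', 'daily', or 'intraday' data."""
    flags = {
        "weekly": {
            "yearly_seasonality": True,
            "weekly_seasonality": False,
            "daily_seasonality": False,
        },
        "daily": {
            "yearly_seasonality": True,
            "weekly_seasonality": True,
            "daily_seasonality": False,
        },
        # Starting point only: disable any pattern the data cannot support.
        "intraday": {
            "yearly_seasonality": True,
            "weekly_seasonality": True,
            "daily_seasonality": True,
        },
    }
    if frequency not in flags:
        raise ValueError(f"unknown data frequency: {frequency!r}")
    return dict(flags[frequency])  # copy so callers can modify it safely

weekly_config = seasonality_flags("weekly")
print(weekly_config)
```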
Convergence Best Practices¶
If convergence fails:
Increase tune and draws
Increase target_accept (0.95-0.99)
Use more informative priors
Simplify model (fewer channels/features)
Check data quality
For large datasets:
Use more chains and draws
Increase computational resources
Monitor memory usage
For complex models:
Start simple, add complexity gradually
Validate at each step
Document changes
References¶
See also: