Pre-Diagnostics Guide¶
Overview¶
The pre-diagnostics module provides automated validation of MMM inputs before model fitting. These tests help identify potential data quality issues that could affect model reliability.
Pre-diagnostics run automatically by default as part of the standard AMMM pipeline, during the DATA EXPLORATION phase.
What is Tested¶
1. Stationarity Tests (ADF + KPSS)¶
Purpose: Assess whether the dependent variable (target) exhibits stationarity or a unit root.
Why it matters: Non-stationary time series can lead to spurious correlations and unreliable inference in regression models.
Tests performed:
Augmented Dickey-Fuller (ADF): Tests the null hypothesis of a unit root
Kwiatkowski-Phillips-Schmidt-Shin (KPSS): Tests the null hypothesis of stationarity
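For reference, the two null hypotheses can be written against the standard textbook forms of each test. This is the general formulation, not a statement about how the module parameterises it; which deterministic terms (constant only, or constant plus trend) are included depends on the regression option chosen.

```latex
\begin{aligned}
\text{ADF:}\quad & \Delta y_t = \alpha + \beta t + \gamma\, y_{t-1}
   + \sum_{i=1}^{p} \delta_i\, \Delta y_{t-i} + \varepsilon_t,
   & H_0 &: \gamma = 0 \ \text{(unit root)} \\
\text{KPSS:}\quad & y_t = \xi t + r_t + \varepsilon_t,\quad
   r_t = r_{t-1} + u_t,\ \ u_t \sim (0, \sigma_u^2),
   & H_0 &: \sigma_u^2 = 0 \ \text{(stationarity)}
\end{aligned}
```

Because the nulls point in opposite directions, agreement between the two tests is stronger evidence than either result alone.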
Interpretation:
| ADF Result | KPSS Result | Conclusion |
|---|---|---|
| Reject H₀ (p < 0.05) | Fail to reject H₀ (p ≥ 0.05) | Likely stationary |
| Fail to reject H₀ (p ≥ 0.05) | Reject H₀ (p < 0.05) | Likely unit root |
| Other combinations | Other combinations | Inconclusive |
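The decision table can be expressed as a small function. This is a minimal sketch: the function name and the returned label strings are illustrative assumptions and may not match the values the module writes to its CSV.

```python
def classify_stationarity(adf_p, kpss_p, alpha=0.05):
    """Combine ADF and KPSS p-values following the interpretation table.

    Hypothetical helper for illustration only.
    """
    adf_rejects_unit_root = adf_p < alpha        # ADF H0: unit root
    kpss_rejects_stationarity = kpss_p < alpha   # KPSS H0: stationarity

    if adf_rejects_unit_root and not kpss_rejects_stationarity:
        return "likely stationary"
    if not adf_rejects_unit_root and kpss_rejects_stationarity:
        return "likely unit root"
    # Both reject, or neither rejects: the tests disagree.
    return "inconclusive"


print(classify_stationarity(adf_p=0.01, kpss_p=0.10))
print(classify_stationarity(adf_p=0.40, kpss_p=0.01))
```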
Remediation if unit root detected:
First differencing: Δy_t = y_t - y_{t-1}
Detrending: Remove linear or polynomial trends
Log transformation: For multiplicative trends
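As a rough illustration of the first and third remedies, the sketch below applies first differencing and a log transform to a short made-up series. The helper names and the numbers are assumptions for the example, not part of the module.

```python
import math


def first_difference(y):
    """Return the differenced series y_t - y_{t-1} (one element shorter)."""
    return [curr - prev for prev, curr in zip(y, y[1:])]


def log_transform(y):
    """Natural log of each value; only defined for strictly positive data."""
    if any(v <= 0 for v in y):
        raise ValueError("log transform requires strictly positive values")
    return [math.log(v) for v in y]


# Made-up series growing by 10% per period (a multiplicative trend).
sales = [100.0, 110.0, 121.0, 133.1]

print(first_difference(sales))                 # differences keep growing
print(first_difference(log_transform(sales)))  # roughly constant
```

For a series with a constant growth rate, the raw differences still trend upward, while the differences of the logged series are all close to log(1.1) ≈ 0.0953. This is why a log transform is suggested for multiplicative trends.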
Note: By design, only the target variable is tested for stationarity. There is no requirement for regressors to be stationary in typical MMM applications.
2. Variance Inflation Factor (VIF)¶
Purpose: Detect multicollinearity among regressors (media spend channels + control variables).
Why it matters: High multicollinearity inflates coefficient variance, making it difficult to isolate individual channel effects.
Interpretation:
| VIF Value | Severity | Action |
|---|---|---|
| VIF < 5 | Low multicollinearity | No action needed |
| 5 ≤ VIF < 10 | Moderate multicollinearity | Monitor closely |
| VIF ≥ 10 | High multicollinearity | Flagged - consider remediation |
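The thresholds in the table map directly to a helper such as the one below. The function name and return labels are hypothetical; the boundaries follow the table (5 and 10 are both inclusive lower bounds).

```python
def vif_severity(vif):
    """Map a VIF value to the severity bands of the interpretation table.

    Hypothetical helper for illustration only.
    """
    if vif >= 10:
        return "high"
    if vif >= 5:
        return "moderate"
    return "low"


for value in (1.3, 5.0, 9.9, 10.0):
    print(value, vif_severity(value))
```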
Remediation if high VIF detected:
Remove or combine highly correlated channels
Principal Component Analysis (PCA) on correlated features
Ridge regression or other regularisation techniques
Domain knowledge to select most important variables
Additional metrics:
Tolerance (1/VIF): Lower values indicate higher multicollinearity
Correlation matrix (max): Highest pairwise correlation for each variable
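The VIF of regressor j is 1 / (1 − R²ⱼ), where R²ⱼ comes from regressing that column on all the others. An equivalent shortcut takes the diagonal of the inverse correlation matrix. The sketch below uses that shortcut with NumPy; it is not the module's implementation, and the dictionary keys other than `vif` are made-up names.

```python
import numpy as np


def vif_table(X, names):
    """Compute VIF, tolerance and max |pairwise correlation| per column.

    X has one row per observation and one column per regressor.
    Raises numpy.linalg.LinAlgError if columns are perfectly collinear.
    """
    X = np.asarray(X, dtype=float)
    corr = np.corrcoef(X, rowvar=False)
    # Diagonal of the inverse correlation matrix equals 1 / (1 - R_j^2).
    vif = np.diag(np.linalg.inv(corr))
    # Zero the diagonal so a variable's correlation with itself is ignored.
    off_diag = np.abs(corr - np.eye(len(names)))
    return [
        {
            "variable": name,
            "vif": float(vif[j]),
            "tolerance": float(1.0 / vif[j]),
            "max_abs_corr": float(off_diag[j].max()),
        }
        for j, name in enumerate(names)
    ]


# Made-up spend data: the two channels move almost in lockstep.
X = [[1, 1], [2, 2], [3, 3], [4, 5]]
for row in vif_table(X, ["tv_spend", "digital_spend"]):
    print(row)
```

With only two regressors both VIFs are identical, since each R² is the squared pairwise correlation.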
3. Transfer Entropy¶
Purpose: Detect directional information flow between media channels (X) and the target variable (Y).
Why it matters: Transfer entropy provides a non-linear, model-free measure of predictive relationships, complementing traditional correlation analysis.
What is computed:
TE(X→Y): Information flow from channel X to target Y
TE(Y→X): Information flow from target Y to channel X
p-values: Statistical significance via permutation test (200 permutations by default)
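To make these quantities concrete, here is a minimal plug-in estimator for pairwise transfer entropy with a permutation p-value. Several choices are assumptions made for the sketch and may differ from the module: equal-width binning, a history length of one step, natural-log units, and shuffling the source series to build the null distribution.

```python
import math
import random
from collections import Counter


def discretise(values, bins=8):
    """Equal-width binning into integer labels 0..bins-1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)  # constant series carries no information
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in values]


def transfer_entropy(x, y, bins=8):
    """Plug-in estimate of TE(X→Y) in nats, using one step of history."""
    xs, ys = discretise(x, bins), discretise(y, bins)
    triples = list(zip(ys[1:], ys[:-1], xs[:-1]))  # (y_next, y_prev, x_prev)
    n = len(triples)
    c_full = Counter(triples)
    c_next_prev = Counter((yn, yp) for yn, yp, _ in triples)
    c_prev_x = Counter((yp, xp) for _, yp, xp in triples)
    c_prev = Counter(yp for _, yp, _ in triples)
    te = 0.0
    for (yn, yp, xp), c in c_full.items():
        # p(y_next | y_prev, x_prev) / p(y_next | y_prev)
        ratio = (c / c_prev_x[(yp, xp)]) / (c_next_prev[(yn, yp)] / c_prev[yp])
        te += (c / n) * math.log(ratio)
    return te


def permutation_p_value(x, y, permutations=200, bins=8, seed=0):
    """Return (observed TE, p-value) by shuffling x to break its timing."""
    rng = random.Random(seed)
    observed = transfer_entropy(x, y, bins)
    shuffled = list(x)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if transfer_entropy(shuffled, y, bins) >= observed:
            hits += 1
    # Add-one correction keeps the p-value strictly positive.
    return observed, (hits + 1) / (permutations + 1)


# Synthetic check: y is x delayed by one step, so information flows x→y.
rng = random.Random(1)
x = [rng.randrange(4) for _ in range(300)]
y = [0] + x[:-1]
print(permutation_p_value(x, y, permutations=50, bins=4))
print(transfer_entropy(y, x, bins=4))
```

Plug-in estimates are biased upward on short series, which is one reason the significance test matters more than the raw TE value.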
Direction classification:
| Condition | Direction | Interpretation |
|---|---|---|
| TE(X→Y) significant AND TE(X→Y) > TE(Y→X) | x→y | X likely predicts Y |
| TE(Y→X) significant AND TE(Y→X) > TE(X→Y) | y→x | Y likely predicts X (reverse causality?) |
| Both significant | bidirectional | Mutual predictive relationship |
| Neither significant | none | No strong directional relationship |
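One way to read the classification table as code is sketched below. The table rows overlap when both directions are significant, so the sketch assumes "bidirectional" takes precedence, and it sends any case the table does not cover to "none". Both are choices made for this example; the module's actual precedence may differ.

```python
def classify_direction(te_xy, te_yx, p_xy, p_yx, alpha=0.05):
    """Assign a direction label from TE estimates and their p-values.

    Hypothetical helper; precedence of the rules is an assumption.
    """
    sig_xy = p_xy < alpha
    sig_yx = p_yx < alpha
    if sig_xy and sig_yx:
        return "bidirectional"
    if sig_xy and te_xy > te_yx:
        return "x→y"
    if sig_yx and te_yx > te_xy:
        return "y→x"
    return "none"


print(classify_direction(te_xy=0.12, te_yx=0.03, p_xy=0.01, p_yx=0.40))
print(classify_direction(te_xy=0.02, te_yx=0.02, p_xy=0.30, p_yx=0.60))
```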
Important Caveats:
⚠️ This implementation uses pairwise (unconditional) transfer entropy
Does NOT control for confounding variables
Cannot establish true causality
May detect spurious relationships due to common drivers
⚠️ Interpretation guidance:
Use TE as an exploratory tool, not confirmatory evidence
Significant TE(X→Y) suggests X may have predictive value for Y
Always combine with domain knowledge and theoretical understanding
For rigorous causal analysis, consider conditional TE or structural models
Optional: To include control variables in the TE analysis, set te_include_controls_in_x=True when calling the orchestrator function, run_all_pre_diagnostics.
Output Files¶
All diagnostics save results to results/csv/.
For the complete specification of each CSV (column names and meanings), see the Reference Output Files page.
1. stationarity_summary.csv¶
| Column | Description |
|---|---|
|  | Variable name (target column) |
|  | ADF test statistic |
|  | ADF p-value |
|  | Number of lags used in ADF test |
|  | Number of observations used |
|  | KPSS test statistic |
|  | KPSS p-value |
|  | Number of lags used in KPSS test |
|  | Boolean: ADF rejects unit root (p < 0.05) |
|  | Boolean: KPSS rejects stationarity (p < 0.05) |
|  | Combined interpretation |
See reference: stationarity_summary.csv
2. vif_summary.csv¶
| Column | Description |
|---|---|
|  | Variable name |
| vif | Variance Inflation Factor |
|  | 1 / VIF |
|  | Maximum absolute pairwise correlation |
|  | Boolean: VIF > 10 |
See reference: vif_summary.csv
3. transfer_entropy_summary.csv¶
| Column | Description |
|---|---|
|  | Predictor variable name |
|  | Transfer entropy from X to Y |
|  | Transfer entropy from Y to X |
| p_x_to_y | p-value for X→Y |
| p_y_to_x | p-value for Y→X |
|  | Boolean: p_x_to_y < 0.05 |
|  | Boolean: p_y_to_x < 0.05 |
|  | Directional classification |
See reference: transfer_entropy_summary.csv
Quick read example¶
import pandas as pd
stationarity = pd.read_csv('results/csv/stationarity_summary.csv')
vif = pd.read_csv('results/csv/vif_summary.csv')
te = pd.read_csv('results/csv/transfer_entropy_summary.csv')
print(stationarity.head())
print(vif.sort_values('vif', ascending=False).head())
print(te.head())
Integration¶
Automatic Execution¶
Pre-diagnostics run automatically when you execute:
python runme.py
The diagnostics execute during the DATA EXPLORATION phase, after media spend visualisations and before model fitting.
Programmatic Usage¶
You can also run diagnostics independently:
from src.diagnostics.pre_diagnostics import run_all_pre_diagnostics
import pandas as pd
# Load your data and config
data = pd.read_csv('your_data.csv')
config = {
    'date_col': 'date',
    'target_col': 'sales',
    'media': [
        {'display_name': 'TV', 'spend_col': 'tv_spend'},
        {'display_name': 'Digital', 'spend_col': 'digital_spend'}
    ],
    'extra_features_cols': ['price', 'competitor_activity']
}
# Run all diagnostics
result_paths = run_all_pre_diagnostics(
    data=data,
    config=config,
    results_dir='results'
)
# Print saved file paths
for filename, path in result_paths.items():
    print(f"{filename}: {path}")
Individual Tests¶
You can run tests individually for more control:
from src.diagnostics.pre_diagnostics import (
    run_stationarity_tests,
    run_vif_tests,
    run_transfer_entropy
)
# Stationarity test on target only
stationarity_df = run_stationarity_tests(
    data=data,
    date_col='date',
    cols=['sales']
)
# VIF test on regressors
vif_df = run_vif_tests(
    data=data,
    cols=['tv_spend', 'digital_spend', 'price']
)
# Transfer entropy
te_df = run_transfer_entropy(
    data=data,
    date_col='date',
    x_cols=['tv_spend', 'digital_spend'],
    y_col='sales',
    permutations=200  # Configurable
)
Advanced Configuration¶
# Include controls in transfer entropy analysis
result_paths = run_all_pre_diagnostics(
    data=data,
    config=config,
    results_dir='results',
    te_include_controls_in_x=True,  # Test controls → target
    te_kwargs={'permutations': 500, 'bins': 10}  # Custom TE settings
)
# Custom stationarity test settings
result_paths = run_all_pre_diagnostics(
    data=data,
    config=config,
    results_dir='results',
    stationarity_kwargs={
        'adf_regression': 'ct',  # Include trend in ADF
        'kpss_regression': 'ct'  # Include trend in KPSS
    }
)
Error Handling¶
The pre-diagnostics module is designed to be non-fatal:
If a diagnostic fails, it writes error information to the CSV
The pipeline continues with model fitting
Warnings are logged for non-critical issues (e.g., constant series, insufficient data)
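The non-fatal pattern described above can be pictured roughly as follows. This is a hedged sketch of the general idea, not the module's code: `run_non_fatal`, the logger name, and the `diagnostic`/`error` column names are all invented for the example.

```python
import csv
import logging
from pathlib import Path

logger = logging.getLogger("pre_diagnostics_sketch")


def run_non_fatal(name, func, out_path):
    """Run one diagnostic; on failure, write the error to the CSV instead.

    `func` is expected to return a non-empty list of dicts (one per row).
    """
    out_path = Path(out_path)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    try:
        rows = func()
        if not rows:
            raise ValueError("diagnostic returned no rows")
    except Exception as exc:  # deliberately broad: nothing may stop the run
        logger.warning("%s failed: %s", name, exc)
        rows = [{"diagnostic": name, "error": f"{type(exc).__name__}: {exc}"}]
    with out_path.open("w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
    return out_path
```

The caller always receives a file path, so downstream steps such as model fitting can proceed whether or not the diagnostic succeeded.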
Performance Considerations¶
Transfer Entropy is the most computationally expensive test
Default settings: 200 permutations, 8 bins
For large datasets or many channels, consider:
Reducing permutations (min 50 for exploratory analysis)
Running TE separately on a subset of channels
Using fewer bins for discretisation
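A back-of-the-envelope count shows why permutations dominate the cost. The sketch assumes each p-value needs one TE estimate for the observed data plus one per permutation, in each of the two directions; the module may organise the work differently.

```python
def te_evaluations(n_channels, permutations=200, directions=2):
    """Approximate number of TE estimates for a full pairwise run."""
    return n_channels * directions * (permutations + 1)


print(te_evaluations(7))                   # 2814 with the default 200 permutations
print(te_evaluations(7, permutations=50))  # 714 with 50 permutations
```

Under this assumption, runtime scales linearly in both the number of channels and the number of permutations, so cutting permutations from 200 to 50 removes about three quarters of the work.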
Typical runtime for demo data (~80 weeks, 7 channels):
Stationarity: <1 second
VIF: <1 second
Transfer Entropy: 10-30 seconds
Best Practices¶
Always review stationarity results: Non-stationary targets can invalidate regression assumptions
Flag high VIF early: Multicollinearity issues are easier to address before model fitting
Use TE as an exploratory tool: Complement with domain knowledge and economic theory
Document findings: Keep notes on which diagnostics flagged issues and how you addressed them
Iterate: Run diagnostics again after data transformations or feature engineering
References¶
Stationarity: Dickey & Fuller (1979); Kwiatkowski et al. (1992)
VIF: Marquardt (1970); O’Brien (2007)
Transfer Entropy: Schreiber (2000); Bossomaier et al. (2016)
Limitations and Future Extensions¶
Current limitations:
TE is pairwise (unconditional) only
No automatic remediation suggestions
Fixed significance threshold (α = 0.05)
Planned extensions:
Conditional transfer entropy (control for confounders)
Multivariate TE
Automated data transformation recommendations
Time-varying diagnostics (rolling window analysis)
For questions or issues, please consult the main AMMM documentation or raise an issue on GitHub.