Sketch: Depict (depict.py)

This module provides high-level orchestrators that generate plots and CSV summaries for training and prediction diagnostics, plus helpers that export artifacts for reproducibility.

Main Reporting Functions

These functions generate multiple plots and save summary data to the specified results directory.

describe_mmm_training

def describe_mmm_training(
    config: dict,
    processed_data: pd.DataFrame,
    mmm, # Fitted MMM instance
    results_dir: str
) -> None:

Generates and saves a suite of plots and summaries related to model fit on the training data.

Generated Outputs (saved in results_dir):

  • model_summary.csv — ArviZ summary statistics (including median and HDI) for key parameters (intercept, likelihood_sigma, beta_channel, alpha, lam).

  • all_decomp.csv — Mean contributions of all components over time, with a date column.

  • media_contribution_mean.png, media_contribution_median.png — Mean and median contribution per channel.

  • weekly_media_and_baseline_contribution.png — Stacked baseline + channels over time.

  • weekly_media_contribution.png — Channels-only contributions over time.

  • model_priors_and_posteriors.png — Trace plots for key parameters.

  • response_curves.png — Direct contribution curves for each channel.

  • ROI/Efficiency visuals:

    • media_performance_mean.png, media_performance_median.png

    • performance_distribution.png

Parameters:

  • config (dict): The configuration dictionary.

  • processed_data (pd.DataFrame): The processed input data used for fitting.

  • mmm: The fitted MMM instance (e.g., DelayedSaturatedMMM).

  • results_dir (str): The directory path where output files will be saved.

Returns:

  • None.


describe_mmm_prediction

def describe_mmm_prediction(
    config: dict,
    input_data_processed: pd.DataFrame,
    mmm, # Fitted MMM instance
    results_dir: str
) -> None:

Generates and saves plots and summaries related to predictive performance, including optional out-of-sample checks when a train/test split is used.

Generated Outputs (saved in results_dir):

  • waterfall_plot_components_decomposition.png — Components decomposition waterfall.

  • model_fit_predictions.png — Actuals vs posterior predictive (with HDI fans) and residuals.

  • media_performance_effect.csv — Summary statistics for beta_channel.

  • ROI/Cost per Target CSVs (depending on target_type):

    • Revenue target: media_contribution_per_spend.csv, media_cost_per_revenue_unit.csv

    • Conversion target: media_conversion_efficiency.csv, media_cost_per_conversion.csv

Parameters:

  • config (dict): The configuration dictionary (used to get train_test_ratio).

  • input_data_processed (pd.DataFrame): The full processed input data (used for splitting).

  • mmm: The fitted MMM instance.

  • results_dir (str): The directory path where output files will be saved.

Returns:

  • None.
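
The split driven by train_test_ratio can be pictured with a short sketch. It assumes a chronological split in which the first fraction of rows is the training set; the helper name split_by_ratio and the column names are hypothetical and are not taken from depict.py.

```python
import pandas as pd


def split_by_ratio(df: pd.DataFrame, train_test_ratio: float):
    """Hypothetical helper: the first share of rows is train, the rest is test."""
    n_train = int(len(df) * train_test_ratio)
    # Rows are assumed to be sorted by date already, so no shuffling is done.
    return df.iloc[:n_train], df.iloc[n_train:]


# Illustrative weekly data with a single target column.
data = pd.DataFrame(
    {
        "date": pd.date_range("2024-01-07", periods=10, freq="W"),
        "y": range(10),
    }
)
train, test = split_by_ratio(data, 0.8)
```

With ten rows and a ratio of 0.8, the first eight rows form the training set and the last two are held out for the out-of-sample checks.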


describe_input_data

def describe_input_data(
    input_data, # InputData instance
    results_dir: str,
    suffix: str
) -> None:

Generates plots and reports describing the raw input data.

Generated Outputs (saved in results_dir):

  • metrics_{suffix}.png — All input metrics over time (media volumes, costs, extras, target).

  • Outlier report, printed or saved by the preprocessing module.

Parameters:

  • input_data: An instance of ammm.prepro.input_data.InputData.

  • results_dir (str): The directory path where output files will be saved.

  • suffix (str): A suffix (e.g., “raw”) added to output filenames.

Returns:

  • None.


describe_config

def describe_config(output_dir: str, config: str, git_sha: str) -> None:

Saves configuration details to facilitate reproducibility.

Generated Outputs (saved in output_dir):

  • git_sha.txt: Contains the Git commit hash of the code version used.

  • config.yaml: Contains the raw string content of the configuration file used for the run.

Parameters:

  • output_dir (str): The directory path where output files will be saved.

  • config (str): The raw string content of the YAML configuration file.

  • git_sha (str): The Git commit hash.

Returns:

  • None.
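
A minimal sketch of the behavior described above, using only the standard library. The function name save_run_metadata is a hypothetical stand-in, and creating the directory when it is missing is an assumption rather than documented behavior.

```python
from pathlib import Path
import tempfile


def save_run_metadata(output_dir: str, config: str, git_sha: str) -> None:
    """Hypothetical stand-in for describe_config: write the two artifacts."""
    out = Path(output_dir)
    # Assumption: the directory is created if it does not exist yet.
    out.mkdir(parents=True, exist_ok=True)
    (out / "git_sha.txt").write_text(git_sha, encoding="utf-8")
    # The config is written verbatim as a raw string, not re-serialized.
    (out / "config.yaml").write_text(config, encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    save_run_metadata(tmp, "train_test_ratio: 0.8\n", "abc1234")
    saved_sha = (Path(tmp) / "git_sha.txt").read_text(encoding="utf-8")
    saved_config = (Path(tmp) / "config.yaml").read_text(encoding="utf-8")
```

Writing the raw string instead of a parsed and re-dumped dictionary keeps comments and key order from the original file, which makes the saved copy easier to diff against the source config.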

Helper/Metric Functions

compute_roi_summary

def compute_roi_summary(model, data: pd.DataFrame, config: dict) -> pd.DataFrame:

Computes summary statistics (mean, median, 5th and 95th percentiles) of the Return on Investment (ROI) for each channel, plus a blended figure across all channels.

Parameters:

  • model: The fitted MMM instance.

  • data (pd.DataFrame): The input data containing channel spend columns.

  • config (dict): Configuration dictionary specifying media spend columns.

Returns:

  • pd.DataFrame: DataFrame indexed by channel name (plus ‘blended’), with columns ‘0.05’, ‘0.95’, ‘median’, ‘mean’ representing ROI statistics.
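
The shape of the returned table can be illustrated with a sketch that works on plain arrays. It assumes ROI is total attributed contribution divided by total spend, evaluated per posterior draw, and that the blended row pools all channels; the function roi_summary_sketch and its inputs are hypothetical and do not reproduce how depict.py reads the model.

```python
import numpy as np
import pandas as pd


def roi_summary_sketch(contrib_samples: np.ndarray, spend: pd.DataFrame) -> pd.DataFrame:
    """contrib_samples: (n_draws, n_channels) total contribution per posterior draw.
    spend: per-observation spend, one column per channel (same column order)."""
    total_spend = spend.sum(axis=0).to_numpy()                 # (n_channels,)
    roi = contrib_samples / total_spend                        # per-draw ROI per channel
    blended = contrib_samples.sum(axis=1) / total_spend.sum()  # per-draw pooled ROI
    all_roi = np.column_stack([roi, blended])
    return pd.DataFrame(
        {
            "0.05": np.quantile(all_roi, 0.05, axis=0),
            "0.95": np.quantile(all_roi, 0.95, axis=0),
            "median": np.median(all_roi, axis=0),
            "mean": all_roi.mean(axis=0),
        },
        index=list(spend.columns) + ["blended"],
    )


# Synthetic draws standing in for posterior samples.
rng = np.random.default_rng(0)
spend = pd.DataFrame({"tv": [100.0, 150.0], "search": [50.0, 50.0]})
samples = rng.normal(loc=[500.0, 300.0], scale=[50.0, 30.0], size=(1000, 2))
summary = roi_summary_sketch(samples, spend)
```

Summarizing per-draw ROI, rather than dividing a single mean contribution by spend, is what makes the percentile columns meaningful.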


compute_cost_per_target_summary

def compute_cost_per_target_summary(model, data: pd.DataFrame, config: dict) -> pd.DataFrame:

Computes summary statistics (mean, median, 5th and 95th percentiles) of the Cost Per Target (e.g., cost per acquisition) for each channel. Cost Per Target is the inverse of ROI, that is, spend divided by attributed contribution.

Parameters:

  • model: The fitted MMM instance.

  • data (pd.DataFrame): The input data containing channel spend columns.

  • config (dict): Configuration dictionary specifying media spend columns.

Returns:

  • pd.DataFrame: DataFrame indexed by channel name (plus ‘blended’), with columns ‘0.05’, ‘0.95’, ‘median’, ‘mean’ representing Cost Per Target statistics.
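
One consequence of defining Cost Per Target as the inverse of ROI is worth a small numeric illustration. Whether the module inverts each posterior draw or inverts the summarized ROI is not stated here, so the sketch below assumes per-draw inversion; the draw values are made up.

```python
import numpy as np

# Hypothetical posterior ROI draws for a single channel.
roi_draws = np.array([0.5, 1.0, 2.0, 4.0, 8.0])
cost_draws = 1.0 / roi_draws  # cost per target, draw by draw

# The mean does not commute with inversion.
mean_of_inverse = cost_draws.mean()
inverse_of_mean = 1.0 / roi_draws.mean()

# The median of an odd number of draws does commute with inversion.
median_cost = np.median(cost_draws)
inverse_of_median_roi = 1.0 / np.median(roi_draws)
```

In this example the mean cost per target is larger than the reciprocal of the mean ROI, while the two medians agree. Inversion also reverses the ordering of draws, so the high-ROI tail becomes the low-cost tail.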


describe_all_media_spend

def describe_all_media_spend(per_observation_df: pd.DataFrame) -> pd.DataFrame:

Calculates the total cost and percentage share of total cost for each channel based on cost columns in the input DataFrame.

Parameters:

  • per_observation_df (pd.DataFrame): A DataFrame expected to have columns ending in “_cost” representing spend per observation for each channel.

Returns:

  • pd.DataFrame: A DataFrame indexed by channel cost column name, with columns “Total Cost” and “% of total”.
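
The calculation described above can be sketched in a few lines of pandas. The function name media_spend_sketch and the sample columns are hypothetical, and expressing "% of total" on a 0 to 100 scale is an assumption.

```python
import pandas as pd


def media_spend_sketch(per_observation_df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical re-implementation: totals and shares for "_cost" columns."""
    cost_cols = [c for c in per_observation_df.columns if str(c).endswith("_cost")]
    totals = per_observation_df[cost_cols].sum(axis=0)
    return pd.DataFrame(
        {
            "Total Cost": totals,
            # Assumption: share is reported as a percentage, not a fraction.
            "% of total": 100.0 * totals / totals.sum(),
        }
    )


df = pd.DataFrame(
    {
        "tv_cost": [100.0, 200.0],
        "search_cost": [50.0, 50.0],
        "tv_impressions": [1000, 2000],  # ignored: does not end in "_cost"
    }
)
spend_table = media_spend_sketch(df)
```

Here tv_cost totals 300 and search_cost totals 100, giving shares of 75% and 25%.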


quick_stats

def quick_stats(model) -> pd.DataFrame:

Generates a concise ArviZ summary table for key model parameters (intercept, likelihood_sigma, beta_channel, alpha, lam).

Parameters:

  • model: The fitted MMM instance containing fit_result.

Returns:

  • pd.DataFrame: The ArviZ summary DataFrame including mean, median, sd, and HDI.


get_media_effect_df

def get_media_effect_df(marketing_mix_model) -> pd.DataFrame:

Extracts the mean posterior channel contributions over time and returns them as a DataFrame.

Parameters:

  • marketing_mix_model: The fitted MMM instance.

Returns:

  • pd.DataFrame: DataFrame with ‘date’ as index and columns for each channel’s mean contribution.
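
The reduction from posterior samples to a per-date table can be sketched with a plain array. The layout (chain, draw, date, channel) is an assumption about how the contributions are stored, and media_effect_sketch is a hypothetical name, not the module's code.

```python
import numpy as np
import pandas as pd


def media_effect_sketch(contrib: np.ndarray, dates, channels) -> pd.DataFrame:
    """contrib is assumed to have shape (chain, draw, date, channel)."""
    # Average over the two sampling dimensions, keeping date and channel.
    mean_contrib = contrib.mean(axis=(0, 1))
    return pd.DataFrame(
        mean_contrib,
        index=pd.Index(dates, name="date"),
        columns=channels,
    )


# Synthetic contributions: 2 chains, 3 draws, 4 dates, 2 channels.
contrib = np.ones((2, 3, 4, 2))
contrib[..., 1] *= 2.0  # second channel contributes twice as much
dates = pd.date_range("2024-01-07", periods=4, freq="W")
effect_df = media_effect_sketch(contrib, dates, ["tv", "search"])
```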


get_roi_df

def get_roi_df(model, data: pd.DataFrame, config: dict) -> dict:

Legacy function: calculates only the mean Return on Investment (ROI) per channel. Prefer compute_roi_summary for richer statistics.

Parameters:

  • model: The fitted MMM instance.

  • data (pd.DataFrame): Input data containing channel spend.

  • config (dict): Configuration specifying media spend columns.

Returns:

  • dict: A dictionary where keys are channel names and values are the mean ROI for each.


Note: This module also contains describe_training and describe_prediction, which appear to be duplicates or older versions of describe_mmm_training and describe_mmm_prediction. The function weekly_spend_by_channel appears redundant, since describe_mmm_training covers its functionality, and the helper _dump_posterior_metrics appears to be unused.