Sketch: Depict (depict.py)¶
High-level orchestrators that generate plots and CSV summaries for training and prediction diagnostics, and helpers to export reproducible artifacts.
Main Reporting Functions¶
These functions generate multiple plots and save summary data to the specified results directory.
describe_mmm_training¶
def describe_mmm_training(
    config: dict,
    processed_data: pd.DataFrame,
    mmm,  # Fitted MMM instance
    results_dir: str
) -> None:
Generates and saves a suite of plots and summaries related to model fit on the training data.
Generated Outputs (saved in results_dir):
model_summary.csv: ArviZ summary statistics (median, HDI) for key parameters (intercept, likelihood_sigma, beta_channel, alpha, lam).
all_decomp.csv: Mean contributions of all components over time, with a date column.
media_contribution_mean.png, media_contribution_median.png: Channel contribution stats.
weekly_media_and_baseline_contribution.png: Stacked baseline + channels over time.
weekly_media_contribution.png: Channels-only contributions over time.
model_priors_and_posteriors.png: Trace plots for key parameters.
response_curves.png: Direct contribution curves for each channel.
ROI/Efficiency visuals:
    media_performance_mean.png, media_performance_median.png
    performance_distribution.png
Parameters:
config (dict): The configuration dictionary.
processed_data (pd.DataFrame): The processed input data used for fitting.
mmm: The fitted MMM instance (e.g., DelayedSaturatedMMM).
results_dir (str): The directory path where output files will be saved.
Returns:
None.
describe_mmm_prediction¶
def describe_mmm_prediction(
    config: dict,
    input_data_processed: pd.DataFrame,
    mmm,  # Fitted MMM instance
    results_dir: str
) -> None:
Generates and saves plots and summaries related to predictive performance, including optional out-of-sample checks when a train/test split is used.
Generated Outputs (saved in results_dir):
waterfall_plot_components_decomposition.png: Components decomposition waterfall.
model_fit_predictions.png: Actuals vs posterior predictive (with HDI fans) and residuals.
media_performance_effect.csv: Summary statistics for beta_channel.
ROI/Cost per Target CSVs (depending on target_type):
    Revenue target: media_contribution_per_spend.csv, media_cost_per_revenue_unit.csv
    Conversion target: media_conversion_efficiency.csv, media_cost_per_conversion.csv
Parameters:
config (dict): The configuration dictionary (used to get train_test_ratio).
input_data_processed (pd.DataFrame): The full processed input data (used for splitting).
mmm: The fitted MMM instance.
results_dir (str): The directory path where output files will be saved.
Returns:
None.
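The train/test split driven by train_test_ratio can be illustrated with a minimal sketch. It assumes a chronological split (earliest rows for training, the remainder held out), which the text above does not confirm; split_by_ratio and the toy columns are hypothetical names, not part of depict.py.

```python
import pandas as pd


def split_by_ratio(df: pd.DataFrame, train_test_ratio: float):
    """Hypothetical helper: keep the first share of rows for training."""
    if not 0.0 < train_test_ratio <= 1.0:
        raise ValueError("train_test_ratio must be in (0, 1]")
    n_train = int(round(len(df) * train_test_ratio))
    # Rows are assumed to be sorted by date already.
    return df.iloc[:n_train], df.iloc[n_train:]


# Toy weekly data standing in for the processed input.
data = pd.DataFrame({
    "date": pd.date_range("2024-01-01", periods=10, freq="W-MON"),
    "target": range(10),
})
train, test = split_by_ratio(data, 0.8)
```

With a ratio of 1.0 the held-out frame is empty, which would correspond to a run without out-of-sample checks.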
describe_input_data¶
def describe_input_data(
    input_data,  # InputData instance
    results_dir: str,
    suffix: str
) -> None:
Generates plots and reports describing the raw input data.
Generated Outputs (saved in results_dir):
metrics_{suffix}.png: All input metrics over time (media volumes, costs, extras, target).
Outlier report printed/saved by the preprocessing module.
Parameters:
input_data: An instance of ammm.prepro.input_data.InputData.
results_dir (str): The directory path where output files will be saved.
suffix (str): A suffix (e.g., “raw”) added to output filenames.
Returns:
None.
describe_config¶
def describe_config(output_dir: str, config: str, git_sha: str) -> None:
Saves configuration details to facilitate reproducibility.
Generated Outputs (saved in output_dir):
git_sha.txt: Contains the Git commit hash of the code version used.
config.yaml: Contains the raw string content of the configuration file used for the run.
Parameters:
output_dir (str): The directory path where output files will be saved.
config (str): The raw string content of the YAML configuration file.
git_sha (str): The Git commit hash.
Returns:
None.
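The behaviour described for describe_config amounts to writing two text files. The sketch below shows one way this could look using only the standard library. save_run_metadata is a hypothetical stand-in, and creating the output directory when it is missing is an assumption rather than documented behaviour.

```python
import tempfile
from pathlib import Path


def save_run_metadata(output_dir: str, config: str, git_sha: str) -> None:
    """Hypothetical stand-in for describe_config: write the two artifacts."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)  # assumption: create directory if missing
    (out / "git_sha.txt").write_text(git_sha, encoding="utf-8")
    # The config is stored as the raw YAML text, not re-serialized.
    (out / "config.yaml").write_text(config, encoding="utf-8")


# Usage with a throwaway directory and made-up values.
with tempfile.TemporaryDirectory() as tmp:
    save_run_metadata(tmp, "train_test_ratio: 0.8\n", "abc1234")
    saved_sha = (Path(tmp) / "git_sha.txt").read_text(encoding="utf-8")
    saved_config = (Path(tmp) / "config.yaml").read_text(encoding="utf-8")
```

Storing the raw string rather than a parsed dictionary keeps comments and key order from the original file, which helps when reproducing a run.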
Helper/Metric Functions¶
compute_roi_summary¶
def compute_roi_summary(model, data: pd.DataFrame, config: dict) -> pd.DataFrame:
Computes summary statistics (mean, median, 5th/95th percentiles) for the Return on Investment (ROI) for each channel.
Parameters:
model: The fitted MMM instance.
data (pd.DataFrame): The input data containing channel spend columns.
config (dict): Configuration dictionary specifying media spend columns.
Returns:
pd.DataFrame: DataFrame indexed by channel name (plus ‘blended’), with columns ‘0.05’, ‘0.95’, ‘median’, ‘mean’ representing ROI statistics.
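To make the shape of this summary concrete, here is a minimal sketch that builds a table with the same columns from per-draw contributions. It is not the module's implementation. roi_summary_from_draws is a hypothetical name, the inputs are toy values, and the definition of the 'blended' row (total contribution across channels divided by total spend) is an assumption.

```python
import numpy as np
import pandas as pd


def roi_summary_from_draws(contrib_draws: dict, total_spend: dict) -> pd.DataFrame:
    """Hypothetical helper: ROI per posterior draw, then summary statistics."""
    roi = {
        ch: np.asarray(draws, dtype=float) / total_spend[ch]
        for ch, draws in contrib_draws.items()
    }
    # Assumed definition of 'blended': pooled contribution over pooled spend.
    pooled = sum(np.asarray(d, dtype=float) for d in contrib_draws.values())
    roi["blended"] = pooled / sum(total_spend.values())
    rows = {
        name: {
            "0.05": np.quantile(values, 0.05),
            "0.95": np.quantile(values, 0.95),
            "median": np.median(values),
            "mean": values.mean(),
        }
        for name, values in roi.items()
    }
    table = pd.DataFrame.from_dict(rows, orient="index")
    return table[["0.05", "0.95", "median", "mean"]]


# Toy example: three posterior draws of total contribution per channel.
summary = roi_summary_from_draws(
    {"tv": [100.0, 120.0, 140.0], "search": [50.0, 60.0, 70.0]},
    {"tv": 100.0, "search": 20.0},
)
```

Computing ROI per draw before summarizing keeps the percentiles as statements about the posterior of ROI itself.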
compute_cost_per_target_summary¶
def compute_cost_per_target_summary(model, data: pd.DataFrame, config: dict) -> pd.DataFrame:
Computes summary statistics (mean, median, 5th/95th percentiles) for the Cost Per Target (e.g., cost per acquisition) for each channel. Cost Per Target is calculated as the inverse of ROI, i.e. spend per unit of contribution rather than contribution per unit of spend.
Parameters:
model: The fitted MMM instance.
data (pd.DataFrame): The input data containing channel spend columns.
config (dict): Configuration dictionary specifying media spend columns.
Returns:
pd.DataFrame: DataFrame indexed by channel name (plus ‘blended’), with columns ‘0.05’, ‘0.95’, ‘median’, ‘mean’ representing Cost Per Target statistics.
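Because the inversion is nonlinear, it matters whether it is applied to each posterior draw or to an already summarized ROI. The text does not say which the module does. The sketch below, with made-up draws, only illustrates that the two orders give different means.

```python
import numpy as np

# Hypothetical ROI draws for one channel (not real model output).
roi_draws = np.array([0.5, 1.0, 2.0, 4.0])

# Invert each draw, then summarize.
cost_draws = 1.0 / roi_draws
mean_of_inverse = cost_draws.mean()

# Summarize first, then invert.
inverse_of_mean_roi = 1.0 / roi_draws.mean()

print(f"mean of per-draw cost: {mean_of_inverse:.4f}")
print(f"inverse of mean ROI:   {inverse_of_mean_roi:.4f}")
```

Draws with ROI near zero produce very large costs, so the mean Cost Per Target can be dominated by a few draws; the median and percentiles are less sensitive to this.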
describe_all_media_spend¶
def describe_all_media_spend(per_observation_df: pd.DataFrame) -> pd.DataFrame:
Calculates the total cost and percentage share of total cost for each channel based on cost columns in the input DataFrame.
Parameters:
per_observation_df (pd.DataFrame): A DataFrame expected to have columns ending in “_cost” representing spend per observation for each channel.
Returns:
pd.DataFrame: A DataFrame indexed by channel cost column name, with columns “Total Cost” and “% of total”.
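The calculation described here can be sketched in a few lines of pandas. spend_share is a hypothetical name, and expressing the share on a 0 to 100 scale is an assumption; the actual function may use a fraction instead.

```python
import pandas as pd


def spend_share(per_observation_df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical sketch: total and percentage share per '_cost' column."""
    cost_cols = [c for c in per_observation_df.columns if c.endswith("_cost")]
    totals = per_observation_df[cost_cols].sum()
    return pd.DataFrame({
        "Total Cost": totals,
        "% of total": 100.0 * totals / totals.sum(),  # assumed 0-100 scale
    })


# Toy data: two cost columns plus a non-cost column that is ignored.
weekly = pd.DataFrame({
    "tv_cost": [100.0, 200.0],
    "search_cost": [50.0, 50.0],
    "target": [10, 12],
})
spend_summary = spend_share(weekly)
```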
quick_stats¶
def quick_stats(model) -> pd.DataFrame:
Generates a concise ArviZ summary table for key model parameters (intercept, likelihood_sigma, beta_channel, alpha, lam).
Parameters:
model: The fitted MMM instance containing fit_result.
Returns:
pd.DataFrame: The ArviZ summary DataFrame including mean, median, sd, and HDI.
get_media_effect_df¶
def get_media_effect_df(marketing_mix_model) -> pd.DataFrame:
Extracts the mean posterior channel contributions over time and returns them as a DataFrame.
Parameters:
marketing_mix_model: The fitted MMM instance.
Returns:
pd.DataFrame: DataFrame with ‘date’ as index and columns for each channel’s mean contribution.
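As an illustration of the reduction involved, the sketch below averages simulated posterior draws over the chain and draw axes and wraps the result in a date-indexed DataFrame. The array layout (chain, draw, date, channel), the channel names, and the random values are assumptions for the example; how the real function reads contributions from the fitted model is not shown here.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
dates = pd.date_range("2024-01-01", periods=3, freq="W-MON")
channels = ["tv", "search"]  # hypothetical channel names

# Simulated posterior contributions with shape (chain, draw, date, channel).
draws = rng.normal(loc=10.0, scale=1.0, size=(2, 50, len(dates), len(channels)))

# Average over chains and draws, leaving one value per date and channel.
mean_contrib = draws.mean(axis=(0, 1))

effect_df = pd.DataFrame(
    mean_contrib,
    index=pd.Index(dates, name="date"),
    columns=channels,
)
```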
get_roi_df¶
def get_roi_df(model, data: pd.DataFrame, config: dict) -> dict:
Legacy function: calculates only the mean Return on Investment (ROI) per channel. Prefer compute_roi_summary for richer statistics.
Parameters:
model: The fitted MMM instance.
data (pd.DataFrame): Input data containing channel spend.
config (dict): Configuration specifying media spend columns.
Returns:
dict: A dictionary where keys are channel names and values are the mean ROI for each.
(Note: This module also contains the functions describe_training and describe_prediction, which appear to be duplicates or older versions of describe_mmm_training and describe_mmm_prediction. The function weekly_spend_by_channel seems redundant, as its functionality is covered within describe_mmm_training. The helper _dump_posterior_metrics appears to be unused.)