src.core.mmm_base_v2

Base class for Media Mix Models (MMM) with In-Graph Scaling.

This is version 2 of the MMM base class. It applies scaling inside the PyMC model graph to avoid PyTensor compilation cache contamination, following the architecture used by pymc-marketing.

Module Contents

class src.core.mmm_base_v2.BaseDelayedSaturatedMMMv2(date_column: str, channel_columns: List[str], adstock_max_lag: int, model_config: Dict | None = None, sampler_config: Dict | None = None, validate_data: bool = True, control_columns: List[str] | None = None, df_lift_test: pandas.DataFrame | None = None, **kwargs)

Bases: src.core.model.MMM

Base class for Media Mix Models with delayed adstock and logistic saturation.

Version 2: Implements in-graph scaling to avoid PyTensor cache contamination.

Key Changes from v1:

- Stores raw data instead of preprocessed data
- Computes scaling parameters but doesn’t apply them to data
- Applies scaling transformations within the PyMC model graph
- Eliminates tensor shape contamination issues

This implementation is inspired by pymc-marketing’s approach to handling data transformations within the model graph using pm.Data containers.

channel_scale_mean

Mean values for channel scaling.

Type: pd.Series

channel_scale_std

Standard deviation for channel scaling.

Type: pd.Series

target_scale_mean

Mean value for target scaling.

Type: float

target_scale_std

Standard deviation for target scaling.

Type: float

control_scale_mean

Mean values for control scaling.

Type: pd.Series

control_scale_std

Standard deviation for control scaling.

Type: pd.Series

control_columns

List of control variable columns.

Type: Optional[List[str]]

adstock_max_lag

Maximum lag for adstock transformation.

Type: int
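
For orientation, the two transformations named in the class description (delayed adstock and logistic saturation) can be sketched in plain NumPy. This is an illustrative sketch and not this class's code: the formulas follow the definitions used in pymc-marketing, the helper names and the `alpha`, `theta` and `lam` parameters are assumptions, and the sketch assumes the adstock window covers lags `0` to `max_lag - 1`.

```python
import numpy as np


def delayed_adstock(x, alpha, theta, max_lag):
    """Apply a delayed adstock to a 1-D spend series (hypothetical helper)."""
    lags = np.arange(max_lag)
    # Weights peak at lag `theta` and decay at rate `alpha` on either side.
    weights = alpha ** ((lags - theta) ** 2)
    weights = weights / weights.sum()  # normalize so total spend is preserved
    # Zero-pad the start so spend before the first date counts as zero.
    padded = np.concatenate([np.zeros(max_lag - 1), x])
    # out[t] = sum over k of weights[k] * x[t - k]
    return np.array(
        [padded[t : t + max_lag][::-1] @ weights for t in range(len(x))]
    )


def logistic_saturation(x, lam):
    """Map non-negative input into [0, 1) with diminishing returns."""
    return (1 - np.exp(-lam * x)) / (1 + np.exp(-lam * x))


# A single spend impulse on the first date.
spend = np.array([100.0, 0.0, 0.0, 0.0, 0.0, 0.0])
carried = delayed_adstock(spend, alpha=0.5, theta=1, max_lag=4)
saturated = logistic_saturation(carried / 100.0, lam=2.0)
```

With `theta=1` the carried-over effect peaks one period after the spend, and nothing is carried beyond the `max_lag` window.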

yearly_seasonality

Number of Fourier modes for seasonality.

Type: Optional[int]
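
As a rough illustration of what a number of Fourier modes means here, the sketch below builds one sine and one cosine column per mode from the day of the year. The function name, the `period` default of 365.25 days and the column ordering are assumptions made for this example and are not taken from the class.

```python
import numpy as np


def yearly_fourier_features(day_of_year, n_order, period=365.25):
    """Return sin/cos features of shape (n_dates, 2 * n_order) (hypothetical helper)."""
    t = np.asarray(day_of_year, dtype=float)[:, None]
    orders = np.arange(1, n_order + 1)[None, :]
    angles = 2 * np.pi * orders * t / period
    # First n_order columns are sines, last n_order columns are cosines.
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=1)


# Two years of daily dates with two Fourier modes gives four feature columns.
features = yearly_fourier_features(np.arange(1, 731), n_order=2)
```

Each extra mode adds two columns, so a higher value allows a more flexible seasonal curve at the cost of more coefficients to estimate.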

date_column

Name of the date column.

Type: str

validate_data

Whether to validate input data.

Type: bool

channel_columns

List of media channel columns.

Type: List[str]

model_config

Model configuration.

Type: Optional[Dict]

sampler_config

Sampler configuration.

Type: Optional[Dict]

property default_sampler_config: Dict

Returns the default configuration for the PyMC sampler.

property output_var: str

Returns the name of the target variable used in the model.

compute_scaling_params(X: pandas.DataFrame, y: pandas.Series | numpy.ndarray) → None

Computes max-abs scaling parameters without applying them to the data.

This method calculates the maximum absolute value of each variable, following pymc-marketing’s MaxAbsScaler approach. The scaling itself is applied later, within the PyMC model graph.

Parameters:

X – Input features DataFrame containing channels and controls.
y – Target variable data.
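
A minimal sketch of this idea, assuming pandas inputs, is shown below. The helper name, its arguments and the returned dictionary keys are hypothetical and do not reflect how the class stores its parameters. The point is only that the scales are computed and the input data is left untouched.

```python
import numpy as np
import pandas as pd


def max_abs_scaling_params(X, y, channel_columns, control_columns=None):
    """Compute max-abs scales without modifying X or y (hypothetical helper)."""
    params = {
        "channel_scale": X[channel_columns].abs().max(axis=0),
        "target_scale": float(np.abs(np.asarray(y, dtype=float)).max()),
    }
    if control_columns:
        params["control_scale"] = X[control_columns].abs().max(axis=0)
    return params


X = pd.DataFrame(
    {
        "tv": [100.0, 250.0, 50.0],
        "search": [10.0, 0.0, 40.0],
        "price": [-2.0, 1.0, 0.5],
    }
)
y = pd.Series([500.0, 800.0, 400.0])
X_before = X.copy()

params = max_abs_scaling_params(X, y, ["tv", "search"], ["price"])
```

A column that is zero everywhere would produce a scale of zero, so a real implementation would need to guard against division by zero when the scale is applied.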

build_model(X: pandas.DataFrame, y: pandas.Series | numpy.ndarray, **kwargs: Any) → None

Builds the PyMC model with in-graph scaling.

This is the core change from v1: scaling happens WITHIN the model graph using PyTensor operations rather than preprocessing the data beforehand.

Parameters:

X – Input features DataFrame.
y – Target variable data.
**kwargs – Additional keyword arguments.

classmethod load(fname: str) → BaseDelayedSaturatedMMMv2

Loads a saved model instance from a NetCDF file.

Parameters: fname – File path to load the model from.
Returns: Loaded model instance with scaling parameters.

property default_model_config: Dict

Returns the default model configuration dictionary.

channel_contributions_forward_pass(channel_data: numpy.ndarray) → numpy.ndarray

Evaluates channel contributions using fitted model parameters.

Note: This method expects RAW channel data and will apply scaling internally based on the stored scaling parameters.

Parameters: channel_data – Raw input channel data (not scaled). Shape should be (n_dates, n_channels).
Returns: Estimated channel contributions based on the fitted model. Shape will be (chains, draws, n_dates, n_channels).
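
To show how an array of this shape is typically summarized, the sketch below uses a randomly generated stand-in for the returned value, since calling the method requires a fitted model. The variable names and the 3% and 97% interval bounds are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
chains, draws, n_dates, n_channels = 2, 50, 12, 3

# Stand-in for the array returned by channel_contributions_forward_pass.
contributions = rng.gamma(
    shape=2.0, scale=1.0, size=(chains, draws, n_dates, n_channels)
)

# Average over chains and draws to get one value per date and channel.
posterior_mean = contributions.mean(axis=(0, 1))

# Pool chains and draws into a single sample axis for interval estimates.
samples = contributions.reshape(chains * draws, n_dates, n_channels)
lower, upper = np.percentile(samples, [3, 97], axis=0)

# Total mean contribution per channel over the whole date range.
total_per_channel = posterior_mean.sum(axis=0)
```

The first two axes index posterior samples, so any summary over them keeps the trailing `(n_dates, n_channels)` layout of the input data.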