input_validator

Provides functions for validating input data for marketing mix modeling (MMM).

Module Contents

input_validator.check_nans(dataframe: pandas.DataFrame, target_col: str, media_cols: List[str], control_cols: List[str]) -> None

Checks for NaN values in specified columns of a DataFrame.

Parameters:
  • dataframe – The input DataFrame.

  • target_col – The name of the target variable column.

  • media_cols – A list of media channel column names.

  • control_cols – A list of control variable column names.

Raises:

ValueError – If NaN values are found in any of the specified columns.
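The following is a minimal sketch of how a check like this could work, not the module's actual implementation. The function name check_nans_sketch, the column names, and the error message are assumptions made for illustration.

```python
from typing import List

import pandas as pd


def check_nans_sketch(
    dataframe: pd.DataFrame,
    target_col: str,
    media_cols: List[str],
    control_cols: List[str],
) -> None:
    """Hypothetical stand-in for input_validator.check_nans."""
    columns = [target_col, *media_cols, *control_cols]
    # Count NaNs per column and keep only the columns that contain any.
    nan_counts = dataframe[columns].isna().sum()
    offending = nan_counts[nan_counts > 0]
    if not offending.empty:
        raise ValueError(f"NaN values found in columns: {offending.index.tolist()}")


# Illustrative data: 'tv_spend' has one missing value.
df = pd.DataFrame(
    {
        "sales": [100.0, 120.0, 90.0],
        "tv_spend": [10.0, None, 8.0],
        "price": [1.0, 1.1, 0.9],
    }
)

try:
    check_nans_sketch(df, "sales", ["tv_spend"], ["price"])
    nan_error = None
except ValueError as exc:
    nan_error = str(exc)
```

Because the check raises rather than returning a flag, callers that want to collect several validation problems at once would need to catch the ValueError, as done here.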

input_validator.check_duplicate_columns(dataframe: pandas.DataFrame) -> None

Checks for duplicate column names in a DataFrame.

Parameters:
  • dataframe – The input DataFrame.

Raises:

ValueError – If duplicate column names are found.
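As an illustration, duplicate column names can be detected with pandas' Index.duplicated. This is a sketch under that assumption; check_duplicate_columns_sketch is a hypothetical name and the module's own implementation may differ.

```python
import pandas as pd


def check_duplicate_columns_sketch(dataframe: pd.DataFrame) -> None:
    """Hypothetical stand-in for input_validator.check_duplicate_columns."""
    # Index.duplicated() marks every repeat of a name after its first occurrence.
    duplicated = dataframe.columns[dataframe.columns.duplicated()].unique().tolist()
    if duplicated:
        raise ValueError(f"Duplicate column names found: {duplicated}")


# Illustrative data: 'tv_spend' appears twice.
df = pd.DataFrame([[100, 10, 12]], columns=["sales", "tv_spend", "tv_spend"])

try:
    check_duplicate_columns_sketch(df)
    dup_error = None
except ValueError as exc:
    dup_error = str(exc)
```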

input_validator.check_date_column(date_series: pandas.Series, config: Dict[str, Any]) -> None

Validates that the date column is sorted, has an inferable frequency, contains no gaps, and, for weekly data, starts each period on a consistent day of the week.

Parameters:
  • date_series – The Series containing date/time information.

  • config – The configuration dictionary (currently unused; kept as a placeholder).

Raises:

ValueError – If dates are not sorted, frequency cannot be inferred, gaps are detected, or weekly dates do not start on a consistent day of the week.
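One way to approximate these checks is with pandas.infer_freq, which returns None when the spacing between dates is irregular. The sketch below assumes that approach and omits the unused config argument. The name check_date_column_sketch and the error messages are hypothetical, and the module may implement each check separately.

```python
import pandas as pd


def check_date_column_sketch(date_series: pd.Series) -> None:
    """Hypothetical stand-in for input_validator.check_date_column."""
    dates = pd.to_datetime(date_series)
    if not dates.is_monotonic_increasing:
        raise ValueError("Dates are not sorted in ascending order.")
    # infer_freq returns None for irregular spacing, so a missing period
    # (a gap) surfaces here as a failed frequency inference. For weekly
    # data the inferred string (e.g. 'W-MON') also encodes the weekday.
    freq = pd.infer_freq(pd.DatetimeIndex(dates))
    if freq is None:
        raise ValueError("Could not infer a regular frequency; check for gaps.")


# Illustrative data: six consecutive Mondays.
weekly = pd.Series(pd.date_range("2024-01-01", periods=6, freq="W-MON"))
check_date_column_sketch(weekly)  # regular weekly data passes silently

gapped = weekly.drop(index=2)  # remove one week to create a gap
try:
    check_date_column_sketch(gapped)
    gap_error = None
except ValueError as exc:
    gap_error = str(exc)

try:
    check_date_column_sketch(weekly.iloc[::-1])  # reversed order
    unsorted_error = None
except ValueError as exc:
    unsorted_error = str(exc)
```

Note that pandas.infer_freq needs at least three dates and itself raises a ValueError on shorter input.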

input_validator.check_column_variance(dataframe: pandas.DataFrame, columns: List[str], check_zeros_only: bool = False) -> None

Checks whether the specified numeric columns have zero variance or contain only zeros.

Parameters:
  • dataframe – The input DataFrame.

  • columns – A list of column names to check.

  • check_zeros_only – If True, only checks whether all values are zero. If False, checks for zero variance (any constant value, including all zeros). Defaults to False.

Raises:

ValueError – If any specified column has zero variance or consists only of zeros.
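The difference between the two modes of check_zeros_only can be sketched as follows. This is an assumed re-implementation for illustration only: check_column_variance_sketch, the column names, and the error message are hypothetical.

```python
from typing import List

import pandas as pd


def check_column_variance_sketch(
    dataframe: pd.DataFrame,
    columns: List[str],
    check_zeros_only: bool = False,
) -> None:
    """Hypothetical stand-in for input_validator.check_column_variance."""
    flagged = []
    for col in columns:
        values = dataframe[col]
        if check_zeros_only:
            # Flag only columns in which every value is zero.
            is_bad = bool((values == 0).all())
        else:
            # At most one distinct value means the column is constant.
            is_bad = values.nunique(dropna=False) <= 1
        if is_bad:
            flagged.append(col)
    if flagged:
        raise ValueError(f"Columns with no usable variation: {flagged}")


# Illustrative data: one constant column, one all-zero column, one varying.
df = pd.DataFrame(
    {
        "tv_spend": [5, 5, 5],
        "radio_spend": [0, 0, 0],
        "sales": [1, 2, 3],
    }
)
cols = ["tv_spend", "radio_spend", "sales"]

try:
    check_column_variance_sketch(df, cols)
    variance_error = None
except ValueError as exc:
    variance_error = str(exc)

try:
    check_column_variance_sketch(df, cols, check_zeros_only=True)
    zeros_error = None
except ValueError as exc:
    zeros_error = str(exc)
```

In this sketch the default mode flags both tv_spend and radio_spend, while check_zeros_only=True flags only radio_spend, since a constant non-zero column is not all zeros.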