src.llm_interpreter.inputs.schema_map

Schema definitions for AMMM CSV outputs.

This module defines typed dataclasses for all 12 CSV files generated by the AMMM pipeline. Each dataclass represents a row in its respective CSV file with proper type annotations.

Author: AMMM Team
Created: 2025-04-10
Last Modified: 2025-04-10

Module Contents

class src.llm_interpreter.inputs.schema_map.StationarityRow

Represents a row from stationarity_summary.csv.

Tests for stationarity in time series data using both the Augmented Dickey-Fuller (ADF) and Kwiatkowski-Phillips-Schmidt-Shin (KPSS) tests. Generated during Phase 4 - Data Exploration (Pre-diagnostics).

is_problematic() → bool: Check if variable shows non-stationarity issues.
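
The two tests have opposite null hypotheses: ADF assumes a unit root (non-stationary), while KPSS assumes stationarity. A minimal sketch of how a row might combine them is shown here. The field names `adf_p_value` and `kpss_p_value`, the `alpha` cutoff, and the "either test fails" rule are assumptions for illustration, not the actual StationarityRow definition.

```python
from dataclasses import dataclass


@dataclass
class StationarityRowSketch:
    # Hypothetical fields; the real StationarityRow columns are not documented here.
    variable: str
    adf_p_value: float
    kpss_p_value: float

    def is_problematic(self, alpha: float = 0.05) -> bool:
        # ADF null: unit root (non-stationary). Failing to reject is a warning sign.
        adf_flags = self.adf_p_value >= alpha
        # KPSS null: stationary. Rejecting it is a warning sign.
        kpss_flags = self.kpss_p_value < alpha
        return adf_flags or kpss_flags


stable = StationarityRowSketch("tv_spend", adf_p_value=0.01, kpss_p_value=0.10)
drifting = StationarityRowSketch("search_spend", adf_p_value=0.40, kpss_p_value=0.01)
```

Under these assumptions, `stable` passes both tests and `drifting` is flagged by both.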

class src.llm_interpreter.inputs.schema_map.VIFRow

Represents a row from vif_summary.csv.

Variance Inflation Factor (VIF) analysis to detect multicollinearity between features. Generated during Phase 4 - Data Exploration (Pre-diagnostics).

Interpretation:
- VIF = 1: No correlation
- VIF < 5: Low correlation (acceptable)
- VIF 5-10: Moderate correlation (caution)
- VIF > 10: High multicollinearity (problematic)

severity_level() → Literal['none', 'low', 'moderate', 'high']: Classify VIF severity level.

is_multicollinear() → bool: Check if variable is flagged for high VIF.
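
The interpretation bands above can be sketched as a small classifier. This is an illustration only: the function name `classify_vif` is hypothetical, and the handling of the exact boundaries (values at or below 1 map to 'none', exactly 5 and exactly 10 map to 'moderate') is an assumption, since the documented bands do not specify it.

```python
from typing import Literal

Severity = Literal["none", "low", "moderate", "high"]


def classify_vif(vif: float) -> Severity:
    """Map a VIF value onto the documented interpretation bands."""
    if vif > 10:
        return "high"      # VIF > 10: high multicollinearity
    if vif >= 5:
        return "moderate"  # VIF 5-10: moderate correlation
    if vif > 1:
        return "low"       # 1 < VIF < 5: low correlation
    return "none"          # VIF = 1: no correlation


def is_multicollinear(vif: float) -> bool:
    # Assumption: only the 'high' band counts as flagged.
    return classify_vif(vif) == "high"
```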

class src.llm_interpreter.inputs.schema_map.TransferEntropyRow

Represents a row from transfer_entropy_summary.csv.

Measures bidirectional information transfer between variables using transfer entropy. Generated during Phase 4 - Data Exploration (Pre-diagnostics).

has_significant_transfer() → bool: Check if there’s significant information transfer in either direction.

class src.llm_interpreter.inputs.schema_map.ModelSummaryRow

Represents a row from model_summary.csv.

Detailed summary of all fitted model parameters with posterior statistics. Generated during Phase 5 - Model Fitting (After MCMC sampling).

Parameter Types:
- intercept: Model intercept (baseline effect)
- likelihood_sigma: Noise/error standard deviation
- beta_channel[channel_name]: Channel effectiveness coefficient
- alpha[channel_name]: Adstock retention parameter (0-1)
- lam[channel_name]: Saturation steepness parameter

has_converged(threshold: float = 1.01) → bool: Check if parameter has converged (r_hat ≈ 1.0).

get_parameter_type() → Literal['intercept', 'sigma', 'beta', 'alpha', 'lam', 'unknown']: Extract parameter type from parameter name.
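
As a rough sketch of how the two helpers might behave, the snippet below strips the bracketed channel suffix from a parameter name and looks up the remaining prefix, then compares r_hat against the threshold. The standalone function names, the prefix table, and the use of `<=` in the convergence check are assumptions. Only the parameter names and return labels come from the documentation above.

```python
from typing import Literal

ParamType = Literal["intercept", "sigma", "beta", "alpha", "lam", "unknown"]

# Prefixes taken from the documented parameter names; the mapping itself is a sketch.
_PREFIX_TO_TYPE: dict[str, ParamType] = {
    "intercept": "intercept",
    "likelihood_sigma": "sigma",
    "beta_channel": "beta",
    "alpha": "alpha",
    "lam": "lam",
}


def parameter_type(name: str) -> ParamType:
    # "beta_channel[tv]" -> "beta_channel"; names without brackets pass through.
    prefix = name.split("[", 1)[0].strip()
    return _PREFIX_TO_TYPE.get(prefix, "unknown")


def has_converged(r_hat: float, threshold: float = 1.01) -> bool:
    # Assumption: values at or below the threshold count as converged.
    return r_hat <= threshold
```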

class src.llm_interpreter.inputs.schema_map.ELPDRow

Represents a row from ELPD_summary.csv.

Expected Log Pointwise Predictive Density (ELPD) and model diagnostics. Generated during Phase 7 - Post-Analysis (Model Diagnostics).

Metrics Included:
- n_samples: Number of posterior samples used
- n_data_points: Number of data points in the model
- good_k: Proportion of good Pareto k values (should be > 0.7)
- elpd_loo: Expected log pointwise predictive density (LOO-CV)
- p_loo: Effective number of parameters
- warning: Whether LOO diagnostic warnings were raised
- r_squared: Model R-squared value

as_float() → float | None: Safely convert value to float if numeric.

as_bool() → bool | None: Safely convert value to bool if boolean.
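
Because this file mixes numeric metrics (such as elpd_loo) with a boolean one (warning), the conversions need to fail softly. The sketch below shows one way that could work. It assumes each row stores a `metric` name and a raw string `value`, and that booleans are written as "True"/"False"; both are guesses about the file layout rather than documented facts.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ELPDRowSketch:
    # Hypothetical fields: one metric name and its raw CSV cell.
    metric: str
    value: str

    def as_float(self) -> Optional[float]:
        try:
            return float(self.value)
        except (TypeError, ValueError):
            return None  # not numeric, e.g. "False"

    def as_bool(self) -> Optional[bool]:
        text = str(self.value).strip().lower()
        if text == "true":
            return True
        if text == "false":
            return False
        return None  # not a boolean literal


elpd = ELPDRowSketch("elpd_loo", "-512.3")
warning = ELPDRowSketch("warning", "False")
```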

class src.llm_interpreter.inputs.schema_map.MediaPerformanceEffectRow

Represents a row from media_performance_effect.csv.

Media channel effectiveness with posterior statistics from Bayesian model. Generated during Phase 7 - Post-Analysis (Performance Calculation).

has_converged(threshold: float = 1.01) → bool: Check if parameter has converged (r_hat ≈ 1.0).

class src.llm_interpreter.inputs.schema_map.MediaConversionEfficiencyRow

[LEGACY V1] Represents a row from media_conversion_efficiency.csv.

Note: This file is not generated in V2. Kept for backward compatibility.

class src.llm_interpreter.inputs.schema_map.MediaCostPerConversionRow

[LEGACY V1] Represents a row from media_cost_per_conversion.csv.

Note: This file is not generated in V2. Kept for backward compatibility.

class src.llm_interpreter.inputs.schema_map.MediaContributionPerSpendRow

Represents a row from media_contribution_per_spend.csv (V2).

Media channel ROI/contribution per spend with percentiles. Generated during Phase 7 - Post-Analysis (Performance Calculation).

class src.llm_interpreter.inputs.schema_map.MediaCostPerRevenueUnitRow

Represents a row from media_cost_per_revenue_unit.csv (V2).

Cost per revenue unit metrics with percentiles for each media channel. Generated during Phase 7 - Post-Analysis (Performance Calculation).

class src.llm_interpreter.inputs.schema_map.ResponseCurveFitRow

Represents a row from response_curve_fit_combined.csv.

Fitted response curves for all media channels showing diminishing returns. Generated during Phase 5 - Model Fitting (Visualization Phase).

class src.llm_interpreter.inputs.schema_map.BudgetScenarioResultRow

Represents a row from budget_scenario_results.csv.

Results from budget scenario planning across different budget levels. Generated during Phase 8 - Budget Optimization.

Scenario Types:
- baseline: Current spend levels
- scenario-X: X% decrease in total budget
- scenario_+X: X% increase in total budget

is_baseline() → bool: Check if this is the baseline scenario.

is_total_row() → bool: Check if this is a total/aggregate row.

class src.llm_interpreter.inputs.schema_map.AllDecompRow

Represents a row from all_decomp.csv.

Time-series decomposition showing contribution of each channel and control variable. Generated during Phase 7 - Post-Analysis (Decomposition).

Note: This CSV has dynamic columns for each media channel and control variable. The structure is: date, [channel_contributions], [control_contributions], trend, intercept

get_channel_contribution(channel: str) → float: Get contribution for a specific channel.

get_total_media_contribution(media_channels: list[str]) → float: Calculate total contribution from specified media channels.
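
Since the column set varies per model, one plausible way to represent a row is a fixed `date` plus a dictionary keyed by column name. The sketch below follows that idea. The `contributions` field and the choice to return 0.0 for an unknown channel are assumptions, not the documented behaviour of AllDecompRow.

```python
from dataclasses import dataclass, field


@dataclass
class AllDecompRowSketch:
    date: str
    # Hypothetical storage for the dynamic channel/control columns.
    contributions: dict[str, float] = field(default_factory=dict)

    def get_channel_contribution(self, channel: str) -> float:
        # Assumption: a channel missing from this row contributes nothing.
        return self.contributions.get(channel, 0.0)

    def get_total_media_contribution(self, media_channels: list[str]) -> float:
        # Sum only the requested channels, ignoring controls, trend and intercept.
        return sum((self.get_channel_contribution(c) for c in media_channels), 0.0)


row = AllDecompRowSketch(
    date="2025-01-06",
    contributions={"tv": 120.0, "search": 80.5, "price_index": -15.0},
)
```

Passing the list of media channels explicitly keeps control variables such as the hypothetical `price_index` out of the media total.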

class src.llm_interpreter.inputs.schema_map.WaterfallDecompositionRow

Represents a row from waterfall_decomposition_data.csv.

Aggregated decomposition data for waterfall visualizations. Generated during Phase 7 - Post-Analysis (Visualization).

is_positive_contributor() → bool: Check if component has positive contribution.

class src.llm_interpreter.inputs.schema_map.CSVSummary

Container for a single CSV file’s data and metadata.

is_empty() → bool: Check if CSV has no data.

class src.llm_interpreter.inputs.schema_map.AllCSVData

Container for all CSV outputs from AMMM pipeline.

Each attribute holds a CSVSummary object containing the parsed data from the respective CSV file.

classmethod from_dict(csv_dict: dict[str, list]) → AllCSVData

Create AllCSVData from a dictionary of CSV data.

Parameters: csv_dict – Dictionary mapping CSV names to lists of dataclass instances
Returns: AllCSVData object with CSVSummary attributes

get_available_files() → list[str]: Get list of CSV files that were successfully loaded.

count_loaded_files() → int: Count how many CSV files were loaded.
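
To make the container relationship concrete, here is a simplified sketch of how from_dict and the two query methods could fit together. It stores summaries in a single dictionary instead of one attribute per CSV, and it treats "loaded" as "has at least one row". Both choices, along with the `name` and `rows` field names, are assumptions made for brevity.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class CSVSummarySketch:
    name: str
    rows: list[Any] = field(default_factory=list)

    def is_empty(self) -> bool:
        return len(self.rows) == 0


@dataclass
class AllCSVDataSketch:
    # Simplification: the documented class exposes one attribute per CSV file.
    summaries: dict[str, CSVSummarySketch] = field(default_factory=dict)

    @classmethod
    def from_dict(cls, csv_dict: dict[str, list]) -> "AllCSVDataSketch":
        return cls({name: CSVSummarySketch(name, rows) for name, rows in csv_dict.items()})

    def get_available_files(self) -> list[str]:
        # Assumption: a file counts as loaded when it has at least one row.
        return [name for name, s in self.summaries.items() if not s.is_empty()]

    def count_loaded_files(self) -> int:
        return len(self.get_available_files())


data = AllCSVDataSketch.from_dict({"vif_summary": ["row1", "row2"], "all_decomp": []})
```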

src.llm_interpreter.inputs.schema_map.get_schema_class(csv_name: str)

Get the schema dataclass for a given CSV file name.

Parameters: csv_name – Name of CSV file (with or without .csv extension)
Returns: Dataclass type for the schema
Raises: KeyError – If CSV name not found in schema map
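
The lookup contract (optional .csv extension, KeyError on unknown names) can be illustrated with a small registry. The registry contents, the placeholder dataclasses, and the `_sketch` function name are hypothetical. The real schema map is not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class VIFRowSketch:
    variable: str
    vif: float


@dataclass
class ELPDRowSketch:
    metric: str
    value: str


# Hypothetical registry keyed by file stem (no extension).
SCHEMA_MAP_SKETCH: dict[str, type] = {
    "vif_summary": VIFRowSketch,
    "ELPD_summary": ELPDRowSketch,
}


def get_schema_class_sketch(csv_name: str) -> type:
    # Accept names with or without the ".csv" suffix.
    key = csv_name[:-4] if csv_name.endswith(".csv") else csv_name
    # A plain dict lookup raises KeyError for unknown names.
    return SCHEMA_MAP_SKETCH[key]


row_cls = get_schema_class_sketch("vif_summary.csv")
row = row_cls(variable="tv_spend", vif=3.2)
```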

src.llm_interpreter.inputs.schema_map.get_column_mapping(csv_name: str) → dict[str, str]

Get column name mappings for a CSV file.

Parameters: csv_name – Name of CSV file (with or without .csv extension)
Returns: Dictionary mapping CSV column names to dataclass field names