src.llm_interpreter.inputs.schema_map
Schema definitions for AMMM CSV outputs.
This module defines typed dataclasses for all 12 CSV files generated by the AMMM pipeline. Each dataclass represents a row in its respective CSV file with proper type annotations.
Author: AMMM Team
Created: 2025-04-10
Last Modified: 2025-04-10
Module Contents
- class src.llm_interpreter.inputs.schema_map.StationarityRow
Represents a row from stationarity_summary.csv.
Tests for stationarity in time series data using both ADF and KPSS tests. Generated during Phase 4 - Data Exploration (Pre-diagnostics).
- is_problematic() → bool
Check if variable shows non-stationarity issues.
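The two tests have opposite null hypotheses: the ADF null is a unit root (non-stationary series), while the KPSS null is stationarity. The following is a minimal sketch of how a row might combine them. The field names `adf_pvalue` and `kpss_pvalue`, the `alpha` parameter, and the decision rule are all assumptions for illustration; this reference does not show the actual fields of StationarityRow.

```python
from dataclasses import dataclass


@dataclass
class StationarityRowSketch:
    """Hypothetical stand-in for StationarityRow; field names are assumed."""

    variable: str
    adf_pvalue: float   # ADF null hypothesis: the series has a unit root
    kpss_pvalue: float  # KPSS null hypothesis: the series is stationary

    def is_problematic(self, alpha: float = 0.05) -> bool:
        # Assumed rule: flag the variable if ADF fails to reject its null
        # or KPSS rejects its null at the given significance level.
        adf_fails_to_reject = self.adf_pvalue >= alpha
        kpss_rejects = self.kpss_pvalue < alpha
        return adf_fails_to_reject or kpss_rejects


stationary = StationarityRowSketch("tv_spend", adf_pvalue=0.01, kpss_pvalue=0.10)
trending = StationarityRowSketch("search_spend", adf_pvalue=0.40, kpss_pvalue=0.01)
```

Under these assumptions, `stationary.is_problematic()` is False and `trending.is_problematic()` is True.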
- class src.llm_interpreter.inputs.schema_map.VIFRow
Represents a row from vif_summary.csv.
Variance Inflation Factor (VIF) analysis to detect multicollinearity between features. Generated during Phase 4 - Data Exploration (Pre-diagnostics).
Interpretation:
  - VIF = 1: No correlation
  - VIF < 5: Low correlation (acceptable)
  - VIF 5-10: Moderate correlation (caution)
  - VIF > 10: High multicollinearity (problematic)
- severity_level() → Literal['none', 'low', 'moderate', 'high']
Classify VIF severity level.
- is_multicollinear() → bool
Check if variable is flagged for high VIF.
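The interpretation bands for VIFRow can be sketched as a small classifier. This is an illustration only: `classify_vif` is a hypothetical helper, and the handling of the boundary values (exactly 1, 5, and 10) is an assumption, since the bands listed for the class do not say which side each boundary falls on.

```python
from typing import Literal

Severity = Literal["none", "low", "moderate", "high"]


def classify_vif(vif: float) -> Severity:
    """Map a VIF value to one of the documented severity bands.

    Boundary handling (<= 1, < 5, <= 10) is an assumption.
    """
    if vif <= 1.0:
        return "none"
    if vif < 5.0:
        return "low"
    if vif <= 10.0:
        return "moderate"
    return "high"


levels = [classify_vif(v) for v in (1.0, 3.2, 7.5, 12.0)]
```

A method such as is_multicollinear() could then plausibly reduce to checking for the "high" band, though the actual flagging rule is not documented here.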
- class src.llm_interpreter.inputs.schema_map.TransferEntropyRow
Represents a row from transfer_entropy_summary.csv.
Measures bidirectional information transfer between variables using transfer entropy. Generated during Phase 4 - Data Exploration (Pre-diagnostics).
- has_significant_transfer() → bool
Check if there’s significant information transfer in either direction.
- class src.llm_interpreter.inputs.schema_map.ModelSummaryRow
Represents a row from model_summary.csv.
Detailed summary of all fitted model parameters with posterior statistics. Generated during Phase 5 - Model Fitting (After MCMC sampling).
Parameter Types:
  - intercept: Model intercept (baseline effect)
  - likelihood_sigma: Noise/error standard deviation
  - beta_channel[channel_name]: Channel effectiveness coefficient
  - alpha[channel_name]: Adstock retention parameter (0-1)
  - lam[channel_name]: Saturation steepness parameter
- has_converged(threshold: float = 1.01) → bool
Check if parameter has converged (r_hat ≈ 1.0).
- get_parameter_type() → Literal['intercept', 'sigma', 'beta', 'alpha', 'lam', 'unknown']
Extract parameter type from parameter name.
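As a rough illustration of the two ModelSummaryRow helpers, the sketch below checks r_hat against the threshold and derives the parameter type from names such as `beta_channel[tv]`. The standalone functions, the `<=` comparison, and the prefix table are assumptions; the real methods operate on the row's own fields, which this reference does not list.

```python
import re
from typing import Literal

ParamType = Literal["intercept", "sigma", "beta", "alpha", "lam", "unknown"]

# Assumed mapping from the documented parameter names to the returned labels.
_PREFIX_TO_TYPE = {
    "intercept": "intercept",
    "likelihood_sigma": "sigma",
    "beta_channel": "beta",
    "alpha": "alpha",
    "lam": "lam",
}


def has_converged(r_hat: float, threshold: float = 1.01) -> bool:
    # Assumed rule: r_hat at or below the threshold counts as converged.
    return r_hat <= threshold


def get_parameter_type(name: str) -> ParamType:
    # Drop an optional "[channel_name]" suffix, e.g. "alpha[tv]" -> "alpha".
    base = re.sub(r"\[.*\]$", "", name.strip())
    return _PREFIX_TO_TYPE.get(base, "unknown")


kinds = [get_parameter_type(n) for n in ("intercept", "beta_channel[tv]", "lam[search]")]
```

Names that match none of the documented prefixes fall through to "unknown", which mirrors the last value in the documented return type.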
- class src.llm_interpreter.inputs.schema_map.ELPDRow
Represents a row from ELPD_summary.csv.
Expected Log Pointwise Predictive Density (ELPD) and model diagnostics. Generated during Phase 7 - Post-Analysis (Model Diagnostics).
Metrics Included:
  - n_samples: Number of posterior samples used
  - n_data_points: Number of data points in the model
  - good_k: Proportion of good Pareto k values (should be > 0.7)
  - elpd_loo: Expected log pointwise predictive density (LOO-CV)
  - p_loo: Effective number of parameters
  - warning: Whether LOO diagnostic warnings were raised
  - r_squared: Model R-squared value
- as_float() → float | None
Safely convert value to float if numeric.
- as_bool() → bool | None
Safely convert value to bool if boolean.
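Because the listed metrics mix numeric values (such as elpd_loo) with a boolean (warning), the two converters presumably need to tolerate values of the wrong kind. A minimal sketch, assuming each value arrives as a string read from the CSV; the standalone function form and the accepted boolean spellings are assumptions.

```python
from typing import Optional


def as_float(value: str) -> Optional[float]:
    # Return None instead of raising when the value is not numeric.
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def as_bool(value: str) -> Optional[bool]:
    # Assumption: only the spellings "true" and "false" (any case) are booleans.
    text = str(value).strip().lower()
    if text == "true":
        return True
    if text == "false":
        return False
    return None


elpd = as_float("-512.3")
warning = as_bool("False")
```

Returning None rather than raising lets a caller try both conversions on the same row without exception handling.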
- class src.llm_interpreter.inputs.schema_map.MediaPerformanceEffectRow
Represents a row from media_performance_effect.csv.
Media channel effectiveness with posterior statistics from Bayesian model. Generated during Phase 7 - Post-Analysis (Performance Calculation).
- has_converged(threshold: float = 1.01) → bool
Check if parameter has converged (r_hat ≈ 1.0).
- class src.llm_interpreter.inputs.schema_map.MediaConversionEfficiencyRow
[LEGACY V1] Represents a row from media_conversion_efficiency.csv.
Note: This file is not generated in V2. Kept for backward compatibility.
- class src.llm_interpreter.inputs.schema_map.MediaCostPerConversionRow
[LEGACY V1] Represents a row from media_cost_per_conversion.csv.
Note: This file is not generated in V2. Kept for backward compatibility.
- class src.llm_interpreter.inputs.schema_map.MediaContributionPerSpendRow
Represents a row from media_contribution_per_spend.csv (V2).
Media channel ROI/contribution per spend with percentiles. Generated during Phase 7 - Post-Analysis (Performance Calculation).
- class src.llm_interpreter.inputs.schema_map.MediaCostPerRevenueUnitRow
Represents a row from media_cost_per_revenue_unit.csv (V2).
Cost per revenue unit metrics with percentiles for each media channel. Generated during Phase 7 - Post-Analysis (Performance Calculation).
- class src.llm_interpreter.inputs.schema_map.ResponseCurveFitRow
Represents a row from response_curve_fit_combined.csv.
Fitted response curves for all media channels showing diminishing returns. Generated during Phase 5 - Model Fitting (Visualization Phase).
- class src.llm_interpreter.inputs.schema_map.BudgetScenarioResultRow
Represents a row from budget_scenario_results.csv.
Results from budget scenario planning across different budget levels. Generated during Phase 8 - Budget Optimization.
Scenario Types:
  - baseline: Current spend levels
  - scenario-X: X% decrease in total budget
  - scenario_+X: X% increase in total budget
- is_baseline() → bool
Check if this is the baseline scenario.
- is_total_row() → bool
Check if this is a total/aggregate row.
- class src.llm_interpreter.inputs.schema_map.AllDecompRow
Represents a row from all_decomp.csv.
Time-series decomposition showing contribution of each channel and control variable. Generated during Phase 7 - Post-Analysis (Decomposition).
Note: This CSV has dynamic columns for each media channel and control variable. The structure is: date, [channel_contributions], [control_contributions], trend, intercept
- get_channel_contribution(channel: str) → float
Get contribution for a specific channel.
- get_total_media_contribution(media_channels: list[str]) → float
Calculate total contribution from specified media channels.
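Since the column set of all_decomp.csv varies per model, one plausible way to represent a row is a fixed date field plus a dictionary of per-column contributions. The sketch below assumes that layout; the `contributions` field and the choice to return 0.0 for an unknown channel are illustrative assumptions, not the documented behavior.

```python
from dataclasses import dataclass, field


@dataclass
class AllDecompRowSketch:
    """Hypothetical stand-in for AllDecompRow with dynamic columns in a dict."""

    date: str
    contributions: dict[str, float] = field(default_factory=dict)

    def get_channel_contribution(self, channel: str) -> float:
        # Assumption: a channel missing from this row contributes 0.0.
        return self.contributions.get(channel, 0.0)

    def get_total_media_contribution(self, media_channels: list[str]) -> float:
        # Sum only the requested media channels, ignoring controls and trend.
        return sum(self.get_channel_contribution(c) for c in media_channels)


row = AllDecompRowSketch(
    date="2025-01-06",
    contributions={"tv": 120.0, "search": 80.5, "price_index": -15.0},
)
media_total = row.get_total_media_contribution(["tv", "search"])
```

Passing the media channel list explicitly keeps control variables such as the hypothetical `price_index` out of the media total.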
- class src.llm_interpreter.inputs.schema_map.WaterfallDecompositionRow
Represents a row from waterfall_decomposition_data.csv.
Aggregated decomposition data for waterfall visualizations. Generated during Phase 7 - Post-Analysis (Visualization).
- is_positive_contributor() → bool
Check if component has positive contribution.
- class src.llm_interpreter.inputs.schema_map.CSVSummary
Container for a single CSV file’s data and metadata.
- is_empty() → bool
Check if CSV has no data.
- class src.llm_interpreter.inputs.schema_map.AllCSVData
Container for all CSV outputs from AMMM pipeline.
Each attribute holds a CSVSummary object containing the parsed data from the respective CSV file.
- classmethod from_dict(csv_dict: dict[str, list]) → AllCSVData
Create AllCSVData from a dictionary of CSV data.
- Parameters:
csv_dict – Dictionary mapping CSV names to lists of dataclass instances
- Returns:
AllCSVData object with CSVSummary attributes
- get_available_files() → list[str]
Get list of CSV files that were successfully loaded.
- count_loaded_files() → int
Count how many CSV files were loaded.
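The relationship between the two containers can be sketched as follows. This is a simplified illustration: the real AllCSVData exposes one attribute per CSV file, whereas the sketch stores summaries in a single dictionary, and it assumes that "loaded" means the summary holds at least one row. All names ending in `Sketch` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class CSVSummarySketch:
    name: str
    rows: list[Any] = field(default_factory=list)

    def is_empty(self) -> bool:
        return len(self.rows) == 0


@dataclass
class AllCSVDataSketch:
    summaries: dict[str, CSVSummarySketch] = field(default_factory=dict)

    @classmethod
    def from_dict(cls, csv_dict: dict[str, list]) -> "AllCSVDataSketch":
        # Wrap each list of row instances in its own summary object.
        return cls({name: CSVSummarySketch(name, rows) for name, rows in csv_dict.items()})

    def get_available_files(self) -> list[str]:
        # Assumption: a file counts as loaded when it has at least one row.
        return [name for name, s in self.summaries.items() if not s.is_empty()]

    def count_loaded_files(self) -> int:
        return len(self.get_available_files())


data = AllCSVDataSketch.from_dict({"vif_summary": ["row1", "row2"], "all_decomp": []})
available = data.get_available_files()
```

In this sketch the empty `all_decomp` entry is kept as a summary but excluded from the available files.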
- src.llm_interpreter.inputs.schema_map.get_schema_class(csv_name: str)
Get the schema dataclass for a given CSV file name.
- Parameters:
csv_name – Name of CSV file (with or without .csv extension)
- Returns:
Dataclass type for the schema
- Raises:
KeyError – If CSV name not found in schema map
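The lookup behavior described for get_schema_class (extension optional, KeyError on unknown names) could be implemented roughly as below. The registry `SCHEMA_MAP_SKETCH`, the `VIFRowSketch` fields, and the function name are hypothetical; only the accepted input forms and the raised exception come from the description.

```python
from dataclasses import dataclass


@dataclass
class VIFRowSketch:
    variable: str
    vif: float


# Hypothetical registry keyed by CSV name without the ".csv" extension.
SCHEMA_MAP_SKETCH = {"vif_summary": VIFRowSketch}


def get_schema_class_sketch(csv_name: str) -> type:
    # Accept names with or without the extension.
    key = csv_name[:-4] if csv_name.endswith(".csv") else csv_name
    # A plain dict lookup raises KeyError for names not in the registry.
    return SCHEMA_MAP_SKETCH[key]


schema = get_schema_class_sketch("vif_summary.csv")
row = schema(variable="tv_spend", vif=3.2)
```

Normalizing the name once at the lookup boundary keeps the registry keys in a single canonical form.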
- src.llm_interpreter.inputs.schema_map.get_column_mapping(csv_name: str) → dict[str, str]
Get column name mappings for a CSV file.
- Parameters:
csv_name – Name of CSV file (with or without .csv extension)
- Returns:
Dictionary mapping CSV column names to dataclass field names
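A mapping of this shape is typically applied while reading the file, so that each row's keys match the dataclass field names. The sketch below shows one way to do that with the standard library; the mapping contents, the column names `Variable` and `VIF`, and the `rename_columns` helper are assumptions for illustration.

```python
import csv
import io

# Hypothetical mapping: CSV column name -> dataclass field name.
COLUMN_MAPPING_SKETCH = {"Variable": "variable", "VIF": "vif"}


def rename_columns(row: dict[str, str], mapping: dict[str, str]) -> dict[str, str]:
    # Columns absent from the mapping keep their original name.
    return {mapping.get(col, col): value for col, value in row.items()}


# In-memory stand-in for a CSV file on disk.
raw = io.StringIO("Variable,VIF\ntv_spend,3.2\n")
renamed = [rename_columns(r, COLUMN_MAPPING_SKETCH) for r in csv.DictReader(raw)]
```

The renamed dictionaries still hold string values, so type conversion would remain a separate step before constructing the dataclass.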