Results File Organization

This document describes how output files in the results directory are organized into subdirectories by file type, and how to update existing code to follow that layout.

Overview

The project now organizes output files in the results/ directory into subdirectories based on file type:

  • results/csv/: All CSV data files

  • results/png/: All PNG image and plot files

  • results/json/: All JSON configuration and data files

  • results/other/: All other file types (NC, TXT, PKL, etc.)

Implementation Steps

The implementation consists of three components:

  1. File utility module - Created at src/utils/file_utils.py

  2. Directory structure - Subdirectories created in the results folder

  3. Integration strategy - How to update existing code
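The directory-structure step can be sketched as follows. This is a minimal illustration, not the project's actual setup code: the function name ensure_results_dirs and the SUBDIRS constant are hypothetical, and only the four folder names come from the layout described in the Overview.

```python
from pathlib import Path

# Folder names taken from the Overview; the constant name is hypothetical.
SUBDIRS = ("csv", "png", "json", "other")


def ensure_results_dirs(results_dir="results"):
    """Create results_dir and its type-based subdirectories if missing.

    Returns a dict mapping each subdirectory name to its Path.
    """
    root = Path(results_dir)
    created = {}
    for name in SUBDIRS:
        sub = root / name
        # parents=True also creates results_dir itself on first use;
        # exist_ok=True makes repeated calls harmless.
        sub.mkdir(parents=True, exist_ok=True)
        created[name] = sub
    return created
```

Calling ensure_results_dirs() once at program start-up would be enough; because the call is idempotent, it is also safe to call it before every save.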

Using the File Utilities

The file_utils module provides helper functions that save each file to the appropriate subdirectory automatically:

from src.utils.file_utils import save_csv, save_figure, save_json, save_text

# Save a DataFrame to results/csv/
save_csv(df, "analysis_data.csv")

# Save a matplotlib figure to results/png/
save_figure(fig, "analysis_plot.png")

# Save a dictionary as JSON to results/json/
save_json(data, "model_params.json")

# Save text content to results/other/
save_text(text, "summary.txt")
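For readers who want a picture of what such helpers might look like internally, here is a minimal sketch. It is not the contents of src/utils/file_utils.py: the resolve_path helper, the _SUBDIR_BY_EXT mapping, the default results_dir="results", the keyword pass-through, and the returned path are all assumptions made for illustration. Only the four function names and the (object, filename, results_dir) argument order are taken from the examples in this document.

```python
import json
from pathlib import Path

# Hypothetical extension-to-folder mapping; unlisted extensions go to "other".
_SUBDIR_BY_EXT = {".csv": "csv", ".png": "png", ".json": "json"}


def resolve_path(filename, results_dir="results"):
    """Return results_dir/<subdir>/<filename>, creating the subdirectory.

    Any directory component in filename is dropped in this sketch.
    """
    sub = _SUBDIR_BY_EXT.get(Path(filename).suffix.lower(), "other")
    target = Path(results_dir) / sub / Path(filename).name
    target.parent.mkdir(parents=True, exist_ok=True)
    return target


def save_csv(df, filename, results_dir="results", **kwargs):
    path = resolve_path(filename, results_dir)
    df.to_csv(path, **kwargs)  # pandas DataFrame.to_csv
    return path


def save_figure(fig, filename, results_dir="results", **kwargs):
    path = resolve_path(filename, results_dir)
    fig.savefig(path, **kwargs)  # matplotlib Figure.savefig
    return path


def save_json(data, filename, results_dir="results", indent=2):
    path = resolve_path(filename, results_dir)
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(data, fh, indent=indent)
    return path


def save_text(text, filename, results_dir="results"):
    path = resolve_path(filename, results_dir)
    path.write_text(text, encoding="utf-8")
    return path
```

Routing every helper through a single path-resolving function keeps the extension-to-folder rule in one place, so adding a new file type would mean changing only the mapping.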

Integration Strategy

There are two ways to integrate the file organization system into the codebase, depending on how much existing code you want to change:

1. Minimal Changes Approach

This approach leaves existing code largely intact and changes only how the output path is handled:

# Before:
df.to_csv(os.path.join(results_dir, "data.csv"))

# After:
from src.utils.file_utils import save_csv
save_csv(df, "data.csv", results_dir)

2. Detailed Refactoring

For a more thorough integration, replace all file operations with the utility functions:

Saving Matplotlib plots:

# Before:
plt.savefig(os.path.join(results_dir, 'plot.png'))

# After:
from src.utils.file_utils import save_figure
save_figure(plt.gcf(), 'plot.png', results_dir)

Saving pandas DataFrames:

# Before:
df.to_csv(os.path.join(results_dir, 'data.csv'))

# After:
from src.utils.file_utils import save_csv
save_csv(df, 'data.csv', results_dir)

Files to Update

Based on codebase analysis, the following files contain code that saves to the results directory and should use the utilities above:

  1. src/sketch/depict.py: Contains numerous plot and CSV saving operations

  2. src/sketch/plot_results.py: Contains functions that save plots

  3. demo/runme.py: Contains result output operations

  4. src/driver/opt.py: Saves scenario results CSV
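To double-check a list like the one above, or to find call sites that were missed, a small scan for direct write calls can help. The sketch below is an assumption-laden illustration: the function name find_direct_writes and the three patterns it searches for are hypothetical choices, and a plain regular-expression match will also flag comments, strings, and calls that write outside the results directory.

```python
import re
from pathlib import Path

# Hypothetical patterns for common direct-write calls; extend as needed.
DIRECT_WRITE = re.compile(r"\.to_csv\(|\.savefig\(|json\.dump\(")


def find_direct_writes(root="."):
    """Yield (path, line_number, stripped_line) for each match under root."""
    for path in sorted(Path(root).rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if DIRECT_WRITE.search(line):
                yield path, lineno, line.strip()


if __name__ == "__main__":
    # Print one "file:line: code" entry per candidate call site.
    for path, lineno, line in find_direct_writes("src"):
        print(f"{path}:{lineno}: {line}")
```

Each reported line is only a candidate; review it by hand before replacing it with the corresponding utility function.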

Organizing Existing Results (optional)

If you have legacy runs where outputs were written directly under results/, you can run a quick one-off Python snippet to organize files into subfolders by extension:

from pathlib import Path

results_dir = Path('results')
mapping = {
    '.csv': 'csv',
    '.png': 'png',
    '.json': 'json',
}

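# Snapshot the listing first so moving files does not affect the iteration
for p in list(results_dir.glob('*')):
    # Only move regular files; this skips the subdirectories themselves
    if not p.is_file():
        continue
    ext = p.suffix.lower()
    sub = mapping.get(ext, 'other')
    target = results_dir / sub / p.name
    target.parent.mkdir(parents=True, exist_ok=True)
    # Skip rather than overwrite a file already present at the destination
    if target.exists():
        print(f"Skipping {p}: {target} already exists")
        continue
    p.rename(target)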

Best Practices

  1. Always use the utility functions instead of direct file operations

  2. Keep the directory structure consistent

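  3. When refactoring existing code, preserve backward compatibility, for example by making sure code that reads results can still find files written under the old flat layout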

  4. Import the utility functions at the top of each file that uses them
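One way to approach the backward-compatibility point is a lookup helper that checks the new subdirectory first and then falls back to the old flat location. This is a hedged sketch only: find_result is a hypothetical name that does not appear in the project, and it assumes legacy files sit directly under the results directory.

```python
from pathlib import Path

# Same extension-to-folder rule as the layout in the Overview.
_SUBDIR_BY_EXT = {".csv": "csv", ".png": "png", ".json": "json"}


def find_result(filename, results_dir="results"):
    """Locate a result file in the new layout, else in the flat legacy one."""
    root = Path(results_dir)
    sub = _SUBDIR_BY_EXT.get(Path(filename).suffix.lower(), "other")
    # Prefer the organized location; fall back to results_dir/filename.
    for candidate in (root / sub / filename, root / filename):
        if candidate.is_file():
            return candidate
    raise FileNotFoundError(f"{filename} not found under {root}")
```

Readers of results can then call find_result("data.csv") instead of building the path by hand, which keeps older runs usable until they are reorganized.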

Future Enhancements

Potential future enhancements to this system could include:

  1. Adding date-based subdirectories for result versioning

  2. Implementing a naming convention enforcer

  3. Adding automatic metadata tracking for all saved files

  4. Creating a results browser tool
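As an illustration of the first idea, date-based subdirectories could be layered on top of the existing layout with a small helper. Everything here is hypothetical: the function name dated_results_dir, the ISO date format, and the choice to place the date folder directly under the results directory are assumptions, not decisions the project has made.

```python
from datetime import date
from pathlib import Path


def dated_results_dir(results_dir="results", run_date=None):
    """Return results_dir/YYYY-MM-DD, creating it if needed.

    run_date defaults to today's date; pass a datetime.date to override.
    """
    run_date = run_date or date.today()
    target = Path(results_dir) / run_date.isoformat()
    target.mkdir(parents=True, exist_ok=True)
    return target
```

The returned path could then be passed as the results_dir argument of the save functions, so that each day's outputs land in their own csv/, png/, json/, and other/ folders.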