Proclus01/multi-forecaster


Open‑source, modular forecasting core centered on the nonlinear mapping B' = F(B): take a canonical time‑series state and produce an ensemble forecast with confidence intervals, storage, evaluation, and clear artifacts.

Updated: September 11, 2025


Contents

  • Features
  • Requirements
  • Install
  • Repo layout
  • Data placement (three options)
  • Quickstart (programmatic)
  • Batch example (multiple locations)
  • Configuration
  • Models included
  • Ensembling & Champion–Challenger
  • Confidence intervals
  • Early‑FY missing months (fallback vs backcast)
  • Outputs & artifacts
  • Evaluation & retro metrics
  • CLI reference
  • Caching and performance
  • Troubleshooting
  • Extending the system
  • Security note
  • FAQ
  • License

Features

  • Canonical core: SeriesState and ForecastResult unify the pipeline and artifacts.
  • Adapters: convert business‑specific data into canonical shapes (e.g., LocationDataAdapter).
  • Model zoo: Seasonal Naive, Monte Carlo (seasonal draws), ARIMA (SARIMAX), Holt‑Winters, Prophet*, NeuralProphet*.
  • Robust ensembling: simple mean, inverse‑error weights, ridge stacking (with optional non‑negative NNLS), random trees stacking, online inverse‑error.
  • Champion–Challenger with true holdout:
    • Hold out the last L months (“tail”) from training each aggregator candidate; evaluate strictly on the held‑out tail.
    • Overfit guard (ratio/absolute gap), complexity penalty, stability tie‑breaker.
  • Confidence intervals (CIs):
    • Residual bootstrap with month conditioning, additive or multiplicative semantics.
    • Model‑native bands as fallback, with safe ±p% band for models lacking quantiles.
  • Bounded caches with clear_cache():
    • ARIMA/Holt‑Winters parameter selection and NeuralProphet fitted models use bounded LRU caches (configurable size).
  • Diagnostics:
    • Imputation diagnostics for stackers (Ridge/RandomTrees) at inference: % cells imputed, % rows fully imputed, missing columns added, and rows that still lack signal after imputation.
    • Online inverse‑error exposes current weights (coefficients) and supports warm‑start from store.
  • Interpretability helper:
    • RidgeStackingFunctional.normalized_weights("abs"|"softmax") for audit‑friendly, normalized, non‑negative views of coefficients.
  • Artifacts:
    • Per‑run CSV/PNG and optional Excel workbook export; notes include aggregation mode, champion, leaderboard.
  • Full‑year roll‑up:
    • 12‑month frames merge known actuals with forecasts; optional control over how per‑model columns are populated for months with actuals.

* Prophet and NeuralProphet are optional; if not installed, the pipeline degrades gracefully with warnings.


Requirements

  • Python: 3.9–3.12 recommended
  • Core packages: pandas, numpy, scikit‑learn, matplotlib, openpyxl, statsmodels, joblib
  • Optional:
    • prophet (or prophet[torch] per platform), neuralprophet, torch
    • scipy (enables NNLS non‑negative stacking)
  • Platforms: macOS & Windows supported. For Prophet/NeuralProphet prerequisites, see their install docs.

Install

# from repo root
python -m venv .venv
source .venv/bin/activate      # Windows: .venv\Scripts\activate
pip install -U pip wheel

# core deps
pip install -r requirements.txt    # if provided
# or install directly:
pip install pandas numpy scikit-learn matplotlib openpyxl statsmodels joblib

# optional models (safe to skip)
pip install prophet
pip install torch --index-url https://download.pytorch.org/whl/cpu  # example CPU build
pip install neuralprophet  # installed separately so it resolves from the default index

# optional for non-negative stacking
pip install scipy

If Prophet/NeuralProphet fail to install, you can still run — those models are skipped with warnings.


Repo layout

app/
  __init__.py
  adapters/
    location_adapter.py
  core/
    aggregation.py
    ci.py
    metrics.py
    online_weights.py
  evaluation/
    evaluator.py
  io/
    exporters.py
  models/
    __init__.py
    base.py
    controller.py
    discovery.py
    registry.py
    src/
      __init__.py
      arima.py
      holt_winters.py
      monte_carlo.py
      neuralprophet_model.py
      prophet_model.py
      seasonal_naive.py
  selection/
    aggregator_select.py
  pipeline.py
main.py
tests/

Data placement (three options)

You can keep business data outside the core or co‑locate it. The adapter expects a top‑level variable named location_data with this shape:

LocationData = Dict[str, Dict[str, Dict[int, Dict[int, float]]]]
# location -> {"Actual": {year: {month: value}}, "Budget": {year: {month: value}}}
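As an illustration of that shape, here is a small literal that satisfies the type alias. The location name and all values are made up for the example, and the helper `months_with_actuals` is hypothetical (it is not part of the adapter).

```python
from typing import Dict, List

# Alias mirroring the shape above: location -> kind -> year -> month -> value
LocationData = Dict[str, Dict[str, Dict[int, Dict[int, float]]]]

# Hypothetical location name and values, for illustration only.
location_data: LocationData = {
    "Example Store": {
        "Actual": {
            2024: {m: 1000.0 + 10.0 * m for m in range(1, 13)},  # full year
            2025: {1: 1100.0, 2: 1085.0, 3: 1120.0},              # partial year
        },
        "Budget": {
            2025: {m: 1150.0 for m in range(1, 13)},
        },
    }
}


def months_with_actuals(data: LocationData, location: str, year: int) -> List[int]:
    """Return the sorted months that have an actual value for a location and year."""
    return sorted(data[location]["Actual"].get(year, {}))


print(months_with_actuals(location_data, "Example Store", 2025))
```

Months are keyed 1 through 12 and years are plain integers, so a partial forecast year is simply a dict with fewer month keys.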

Option A (recommended): external module (clean separation)

your-repo/
  client_pkg/
    __init__.py
    my_data.py          # defines `location_data = {...}`
  app/

Load with import client_pkg.my_data.

Option B: extension path (co‑located but outside core)

your-repo/
  extensions/
    datasets/
      __init__.py
      location_data.py  # defines `location_data = {...}`
  app/

Import extensions.datasets.location_data.

Option C (legacy / quickstart): app/data.py

Use the included data.py at the repo root or move it to app/data.py and import app.data.

Note: Ensure the parent folder is importable (has __init__.py or is on PYTHONPATH).


Quickstart (programmatic)

Minimal end‑to‑end example using the included data.py (Option C). If your data lives elsewhere, change the import.

# examples/run_one.py

from app.adapters.location_adapter import LocationDataAdapter, LocationDataAdapterConfig
from app.pipeline import ForecastPipeline, ForecastConfig
from app.domain import SeriesState
from data import location_data  # Option C

def build_state(location: str, forecast_year: int) -> SeriesState:
    adapter = LocationDataAdapter(LocationDataAdapterConfig(
        forecast_year=forecast_year,
        treat_zero_as_missing_in_forecast_year=True,
    ))
    return adapter.adapt(location_data, location)

def main():
    location = "test_location_name"
    year = 2025

    cfg = ForecastConfig(
        forecast_year=year,
        models=["SeasonalNaive","MonteCarlo","ARIMA","HoltWinters"],
        aggregator_mode="AUTO_CC",
        validation_tail_months=4,
        show_progress=True,
        write_excel_results=True,
    )
    pipe = ForecastPipeline(cfg)

    state = build_state(location, year)
    result = pipe.run(state, save_dir="forecasts")

    annual, lo, hi = result.annual_summary()
    print(f"{location} {year} — Annual forecast: ${annual:,.2f} (CI: ${lo:,.2f} – ${hi:,.2f})")

if __name__ == "__main__":
    main()

Run it:

python -m examples.run_one

Batch example (multiple locations)

# examples/run_batch.py
from typing import Iterable
from app.adapters.location_adapter import LocationDataAdapter, LocationDataAdapterConfig
from app.pipeline import ForecastPipeline, ForecastConfig
from data import location_data

def forecast_locations(locations: Iterable[str], year: int = 2025):
    pipe = ForecastPipeline(ForecastConfig(
        forecast_year=year,
        models=["SeasonalNaive","MonteCarlo","ARIMA","HoltWinters"],
        aggregator_mode="AUTO_CC",
        show_progress=True,
    ))
    adapter = LocationDataAdapter(LocationDataAdapterConfig(forecast_year=year))
    for loc in locations:
        state = adapter.adapt(location_data, loc)
        res = pipe.run(state, save_dir="forecasts")
        annual, lo, hi = res.annual_summary()
        print(f"[{loc}] {year}: ${annual:,.0f} (CI: ${lo:,.0f} – ${hi:,.0f})")

if __name__ == "__main__":
    forecast_locations(location_data.keys(), year=2025)

Configuration

ForecastConfig controls runtime, models, ensembling, validation, CIs, and outputs.

Key fields (selected):

  • forecast_year: target year (defaults to current UTC year).
  • models: list of model names to run.
  • aggregator_mode: "FIXED" or "AUTO_CC".
  • aggregator: which functional to use when FIXED.
  • weighting_metric: "SMAPE", "MAPE", or "COMPOSITE".
  • validation_tail_months: tail length L for holdout in champion–challenger mode.
  • prefer_bootstrap_ci: prefer residual bootstrap over model bands.
  • residual_ci_conditioning: month‑conditioned residual pools.
  • residual_ci_error_mode: "additive" or "multiplicative".
  • residual_ci_similar_month_pool: "none" or "quarter" pooling.
  • ci_lower_q / ci_upper_q: lower/upper quantiles for CIs.
  • write_full_year_output: include 12 months in artifacts (actuals + forecasts).
  • write_model_actuals_in_full_year: per‑model columns in full_year_df:
    • True (default): fill with actuals where known.
    • False: leave blank/NaN to preserve model identities.
  • Bounded caches:
    • arima_cache_maxsize (default 256)
    • hw_cache_maxsize (default 256)
    • np_cache_maxsize (default 64)

Example:

from app.pipeline import ForecastConfig

cfg = ForecastConfig(
    forecast_year=2025,
    random_state=42,
    models=["SeasonalNaive","MonteCarlo","ARIMA","Prophet","NeuralProphet","HoltWinters"],
    weighting_metric="SMAPE",
    aggregator_mode="AUTO_CC",
    aggregator="BLEND",
    blend_ratio=0.5,
    ridge_alphas=[0.1, 1.0, 10.0],
    rf_n_estimators=200,

    aggregator_candidates=[
      {"name": "BLEND", "params": {"alpha": 0.5}},
      {"name": "INVERSE_ERROR", "params": {}},
      {"name": "SIMPLE_MEAN", "params": {}},
      {"name": "RIDGE_STACKING", "params": {"alphas": [0.1, 1.0, 10.0]}},
      {"name": "RANDOM_TREES_STACKING", "params": {"n_estimators": 200}},
      {"name": "ONLINE_INVERSE_ERROR", "params": {"decay": 0.8}},
    ],
    validation_tail_months=4,
    overfit_ratio_tol=0.35,
    overfit_abs_tol=0.05,
    overfit_min_tail_months=3,
    complexity_penalty=0.02,
    top_k_challengers=2,

    prefer_bootstrap_ci=True,
    residual_ci_conditioning=True,
    residual_ci_error_mode="additive",   # or "multiplicative"
    residual_ci_similar_month_pool="none",
    ci_lower_q=0.05,
    ci_upper_q=0.95,
    mc_simulations=5000,

    arima_p=range(0,3), arima_d=range(0,3), arima_q=range(0,3),
    hw_trend_options=["add","mul",None],
    hw_seasonal_options=["add","mul",None],

    np_quantiles=[0.05,0.5,0.95,0.99],
    np_fallback_ci_pct=0.20,

    parallel_jobs=1,
    show_progress=True,
    progress_width=40,

    write_excel_results=True,
    write_full_year_output=False,
    write_model_actuals_in_full_year=True,
)

Models included

All models implement ForecastModel (fit/predict on monthly ds index):

  • SeasonalNaive — y_{t+h} = y_{t+h−12} with safe month‑of‑year fallback; CI bands degenerate to point.
  • MonteCarlo — Gaussian month‑wise draws from historical μ/σ; quantile CIs.
  • ARIMA (SARIMAX) — AIC grid with adaptive downsizing for short histories; bounded LRU cache for parameter tuples; conf_int bands.
  • HoltWinters — AIC across (trend, seasonal) combos; bounded LRU cache; normal‑approximation bands.
  • Prophet (optional) — native quantile bands.
  • NeuralProphet (optional) — sandboxed LR‑finder; bounded LRU model cache; robust column extraction and fallbacks.

Discovery is automatic: app.models.__init__ walks app.models.src and registers any class decorated with @register_model().


Ensembling & Champion–Challenger

Aggregation functionals (app/core/aggregation.py):

  • SimpleMeanFunctional
  • InverseErrorWeightsFunctional (uses per‑model OOF error)
  • BlendFunctional (convex blend of two functionals; default inverse‑error + mean)
  • RidgeStackingFunctional (linear stacking via RidgeCV; optional non‑negative NNLS)
  • RandomTreesStackingFunctional (nonlinear stacking via RandomForest)
  • OnlineInverseErrorWeightsFunctional (streaming re‑weighting with exponential decay)
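To make the inverse-error idea concrete, the sketch below turns per-model error scores into normalized weights and combines point forecasts. It is an illustration under assumptions, not the code in app/core/aggregation.py: the function names, the `eps` guard, and the sample numbers are invented for the example.

```python
from typing import Dict


def inverse_error_weights(errors: Dict[str, float], eps: float = 1e-9) -> Dict[str, float]:
    """Weight each model by 1 / error, then normalize so the weights sum to 1."""
    inverse = {name: 1.0 / (err + eps) for name, err in errors.items()}
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}


def combine(forecasts: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of per-model point forecasts for one month."""
    return sum(weights[name] * forecasts[name] for name in forecasts)


# Made-up out-of-fold SMAPE values: ARIMA has half the error of SeasonalNaive.
oof_smape = {"SeasonalNaive": 0.10, "ARIMA": 0.05}
weights = inverse_error_weights(oof_smape)
point = combine({"SeasonalNaive": 100.0, "ARIMA": 130.0}, weights)
print(weights, point)
```

A model with half the error receives twice the weight, so here ARIMA gets roughly two thirds of the blend.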

Modes:

  • FIXED: use cfg.aggregator.
  • AUTO_CC: run Champion–Challenger (selection/aggregator_select.py):
    • Train each candidate on pre‑tail rows only.
    • Evaluate tail SMAPE on true holdout; log OOF SMAPE (reference), generalization gap (tail − pre‑tail), complexity penalty, and stability tie‑breaker (tail residual variance).
    • Enforce optional min model coverage in the tail; candidates can be disqualified with reason.
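The holdout logic can be sketched as follows. This is a simplified illustration, not the code in selection/aggregator_select.py: it omits the overfit guard, the coverage check, and the stability tie-breaker, and the per-candidate `penalties` mapping, the two toy candidates, and the data are assumptions made for the example.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Preds = Dict[str, float]                 # per-model predictions for one month
Row = Tuple[Preds, float]                # (predictions, actual)
Aggregator = Callable[[Preds], float]


def smape(actual: Sequence[float], pred: Sequence[float]) -> float:
    total = 0.0
    for a, p in zip(actual, pred):
        denom = (abs(a) + abs(p)) / 2.0
        total += 0.0 if denom == 0 else abs(a - p) / denom
    return total / len(actual)


def score(agg: Aggregator, rows: List[Row]) -> float:
    return smape([y for _, y in rows], [agg(p) for p, _ in rows])


def fit_simple_mean(train: List[Row]) -> Aggregator:
    return lambda preds: sum(preds.values()) / len(preds)


def fit_inverse_error(train: List[Row]) -> Aggregator:
    names = sorted(train[0][0])
    actual = [y for _, y in train]
    inv = {n: 1.0 / (smape(actual, [p[n] for p, _ in train]) + 1e-9) for n in names}
    total = sum(inv.values())
    return lambda preds: sum(inv[n] / total * preds[n] for n in names)


def select_champion(rows: List[Row], candidates: Dict[str, Callable[[List[Row]], Aggregator]],
                    tail_months: int = 4, penalties: Optional[Dict[str, float]] = None):
    penalties = penalties or {}
    train, tail = rows[:-tail_months], rows[-tail_months:]
    board = []
    for name, fit in candidates.items():
        agg = fit(train)                              # trained on pre-tail rows only
        tail_smape = score(agg, tail)                 # evaluated strictly on the holdout
        board.append({
            "name": name,
            "tail_smape": tail_smape,
            "gap": tail_smape - score(agg, train),    # generalization gap (tail - pre-tail)
            "penalized": tail_smape + penalties.get(name, 0.0),
        })
    board.sort(key=lambda r: r["penalized"])
    return board[0]["name"], board


# Toy data: model A is exact, model B overshoots by 20%.
rows = [({"A": 100.0 + i, "B": 1.2 * (100.0 + i)}, 100.0 + i) for i in range(12)]
champion, leaderboard = select_champion(
    rows,
    {"SIMPLE_MEAN": fit_simple_mean, "INVERSE_ERROR": fit_inverse_error},
    tail_months=4,
    penalties={"INVERSE_ERROR": 0.02},
)
print(champion)
```

Because candidates never see the tail during fitting, the tail score is an honest estimate; the penalty only needs to be large enough to stop a more complex candidate from winning on a negligible margin.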

Confidence intervals

Residual bootstrap (preferred when residuals exist):

  • Additive mode (default):
    • residuals e = y − ŷ; apply y_draw = point + e.
    • Good when residual scale is roughly constant in absolute units (homoskedastic in level).
  • Multiplicative mode:
    • residuals are relative shocks r = (y − ŷ) / |y| computed during training.
    • Apply y_draw = point × (1 + r).
    • Bands scale naturally with the level and collapse to (0,0) when the point forecast is 0.
    • Often preferred for non‑negative business series where uncertainty grows with level.
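The two residual semantics can be compared with a short standard-library sketch. It follows the formulas above (e = y − ŷ and r = (y − ŷ) / |y|) but is not the implementation in app/core/ci.py; the function names, the interpolation rule for quantiles, and the sample series are assumptions made for the example.

```python
import random
from typing import List, Sequence, Tuple


def quantile(sorted_values: Sequence[float], q: float) -> float:
    """Linearly interpolated quantile of an already sorted sequence."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1.0 - frac) + sorted_values[hi] * frac


def bootstrap_band(point: float, residuals: Sequence[float], mode: str = "additive",
                   n_draws: int = 5000, lower_q: float = 0.05, upper_q: float = 0.95,
                   seed: int = 42) -> Tuple[float, float]:
    """Resample residuals around a point forecast and return (lower, upper)."""
    if mode not in ("additive", "multiplicative"):
        raise ValueError(f"unknown mode: {mode}")
    rng = random.Random(seed)
    draws: List[float] = []
    for _ in range(n_draws):
        r = rng.choice(residuals)
        draws.append(point + r if mode == "additive" else point * (1.0 + r))
    draws.sort()
    return quantile(draws, lower_q), quantile(draws, upper_q)


# Made-up training actuals and fitted values.
actuals = [100.0, 110.0, 95.0, 120.0, 105.0]
fitted = [105.0, 107.0, 97.0, 116.0, 104.0]
additive_res = [y - f for y, f in zip(actuals, fitted)]
relative_res = [(y - f) / abs(y) for y, f in zip(actuals, fitted) if y != 0]

print(bootstrap_band(100.0, additive_res, "additive"))
print(bootstrap_band(100.0, relative_res, "multiplicative"))
print(bootstrap_band(0.0, relative_res, "multiplicative"))
```

The last call shows the collapse described above: every multiplicative draw is point × (1 + r), so a zero point forecast yields a zero-width band regardless of the residual pool.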

Conditioning and pooling:

  • Month‑of‑year conditioning and optional quarter pooling increase effective residual pool size.
  • Minimum band floor (min_band_pct) enforces a conservative half‑width when pools are tiny; in multiplicative mode the floor is a percentage of the point forecast, so it is 0 when point=0.
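A rough sketch of pool selection and the band floor follows. The function names and the dict-of-lists layout are hypothetical, and the floor is shown only in its percentage-of-point form as described for multiplicative mode; how the floor is defined in additive mode is not stated here, so that case is left out.

```python
from typing import Dict, List, Tuple


def residual_pool(by_month: Dict[int, List[float]], month: int,
                  pooling: str = "none") -> List[float]:
    """Collect residuals for a calendar month, optionally widening to its quarter."""
    if pooling == "none":
        months = [month]
    elif pooling == "quarter":
        start = 3 * ((month - 1) // 3) + 1      # first month of the quarter
        months = [start, start + 1, start + 2]
    else:
        raise ValueError(f"unknown pooling: {pooling}")
    return [r for m in months for r in by_month.get(m, [])]


def apply_min_band(point: float, lo: float, hi: float,
                   min_band_pct: float) -> Tuple[float, float]:
    """Widen (lo, hi) so each side is at least min_band_pct of |point| away."""
    half = min_band_pct * abs(point)
    return min(lo, point - half), max(hi, point + half)


# Made-up relative residuals keyed by calendar month.
by_month = {1: [0.02, -0.01], 2: [0.03], 3: [-0.04, 0.01], 4: [0.05]}
print(residual_pool(by_month, 2))               # February only
print(residual_pool(by_month, 2, "quarter"))    # January through March
print(apply_min_band(100.0, 99.0, 101.0, 0.25))
print(apply_min_band(0.0, 0.0, 0.0, 0.25))
```

Quarter pooling trades some month specificity for a larger sample, which matters most in the first few years of history when each month has only one or two residuals.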

Model bands aggregation (fallback):

  • If insufficient residuals, aggregate per‑model lower/upper bands using the same functional.
  • Models without native bands use a symmetric ±p% around their point forecasts.

Early‑FY missing months (fallback vs backcast)

Behavior

  • When early months in the forecast year are missing but later months have actuals, the system does not “backcast” using models. Instead, those early months are filled using a conservative month‑of‑year fallback:
    • Aggregate across training years (mean/median per config) for the same calendar month.
    • If unavailable, progressively fall back to the most recent available year for that month, then 0 as a last resort to avoid NaNs.
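The fallback order can be sketched in a few lines. This is an illustration of the described behavior rather than the pipeline's code; the function name, the nested-dict input, and the sample history are assumptions.

```python
from statistics import mean, median
from typing import Dict, Optional, Sequence

History = Dict[int, Dict[int, Optional[float]]]   # year -> month -> value


def month_of_year_fallback(history: History, month: int,
                           training_years: Sequence[int], how: str = "mean") -> float:
    """Fill one calendar month: aggregate, then most recent year, then 0."""
    values = [history[y][month] for y in training_years
              if history.get(y, {}).get(month) is not None]
    if values:
        return mean(values) if how == "mean" else median(values)
    # No training-year value: use the most recent year that has this month.
    for year in sorted(history, reverse=True):
        value = history[year].get(month)
        if value is not None:
            return value
    return 0.0   # last resort, avoids NaN in the output


# Made-up history with gaps.
history: History = {
    2022: {1: 100.0, 2: 80.0},
    2023: {1: 120.0},
    2024: {3: 50.0},
}
fills = {m: month_of_year_fallback(history, m, training_years=[2022, 2023])
         for m in (1, 2, 3, 4)}
print(fills)
```

Each fill is a deterministic function of the stored actuals, so the same inputs always reproduce the same early-year values.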

Visibility

  • The pipeline emits an INFO log when early‑FY missing months precede the last observed month: Backcast disabled: early‑FY missing months detected (...) precede last observed YYYY‑MM‑01; these months will be filled via month‑of‑year fallback, not backcast modeling.

Why this design

  • It avoids training special backcast models that add complexity and noise for a limited benefit in many business scenarios.
  • It keeps semantics stable: forecasts begin after the last observed month; earlier gaps are treated as data quality issues with deterministic, auditable fills.

Outputs & artifacts

When you run ForecastPipeline.run(state, save_dir="forecasts"), the pipeline writes:

forecasts/
  <series_id>/
    <year>/
      <run_id>/
        forecast.csv         # per-month ensemble + per-model + CI
        forecast_plot.png    # ensemble band & per-model lines
        meta.json            # config, dataset fingerprint, year, etc.
        state.pkl            # ensemble state (functional, residual pools, champion)
        oof_summary.json     # per-model OOF metrics
        retro_metrics.json   # (written for the previous run if actuals advanced)

Excel workbook (if write_excel_results=True):

forecasts/<year>_results.xlsx
  • One worksheet per series (deduped if name exceeds Excel’s 31‑char limit)
  • Columns: Month, Ensemble, High/Low midpoints, Low/High bounds, and per‑model columns, plus an embedded plot and annual summary.
  • Programmatic writer options (ExcelResultsWriter):
    • write_blank_for_nan=True to render empty cells instead of “nan”.
    • include_month_name=True to add a MonthName column.
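One way to keep worksheet names unique within the 31-character limit is sketched below. How the project's writer actually dedupes is not documented here, so the `_2`, `_3` suffix scheme and the function name are assumptions; the case-insensitive comparison reflects how Excel treats sheet names.

```python
from typing import Set


def unique_sheet_name(name: str, used: Set[str], limit: int = 31) -> str:
    """Truncate to the limit and append a numeric suffix on collision."""
    base = name[:limit]
    candidate = base
    counter = 2
    while candidate.lower() in used:            # Excel compares names case-insensitively
        suffix = f"_{counter}"
        candidate = base[: limit - len(suffix)] + suffix
        counter += 1
    used.add(candidate.lower())
    return candidate


used: Set[str] = set()
first = unique_sheet_name("Northwest Regional Distribution Center A", used)
second = unique_sheet_name("Northwest Regional Distribution Center B", used)
print(first)
print(second)
```

Both series names share the same first 31 characters, so the second one gets a shortened base plus a suffix instead of overwriting the first sheet.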

Full‑year roll‑up:

  • full_year_df/full_year_lo/full_year_hi in ForecastResult combine 12 months (actuals + forecasts).
  • write_model_actuals_in_full_year controls whether per‑model columns are filled with actuals for known months or left blank.

meta.json includes configuration (including residual_ci_error_mode), an imputation diagnostics summary (when applicable), and a compact selection summary (champion, tail SMAPE, penalized score).


Evaluation & retro metrics

  • Current run overlap (Evaluator.compute_current_run_metrics):
    • If some months in the forecast year already have actuals, compute MAPE, SMAPE, and RMSE on the overlap.
    • MAPE skips positions where the actual is zero, which avoids division by zero.
  • Retro evaluation (Evaluator.retro_evaluate_previous):
    • Compares the previous run’s predictions against newly available actuals; writes retro_metrics.json and appends to a metrics history table.
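For reference, the three overlap metrics can be written with the standard library as below. These are textbook definitions offered as a sketch, not the code in app/core/metrics.py; in particular, returning NaN when every actual is zero is an assumption.

```python
import math
from typing import Sequence


def mape(actual: Sequence[float], pred: Sequence[float]) -> float:
    """Mean absolute percentage error over positions with a non-zero actual."""
    terms = [abs(a - p) / abs(a) for a, p in zip(actual, pred) if a != 0]
    return sum(terms) / len(terms) if terms else math.nan


def smape(actual: Sequence[float], pred: Sequence[float]) -> float:
    """Symmetric MAPE; a position where both values are zero contributes 0."""
    total = 0.0
    for a, p in zip(actual, pred):
        denom = (abs(a) + abs(p)) / 2.0
        total += 0.0 if denom == 0 else abs(a - p) / denom
    return total / len(actual)


def rmse(actual: Sequence[float], pred: Sequence[float]) -> float:
    """Root mean squared error in the units of the series."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, pred)) / len(actual))


# Made-up overlap: the second month has a zero actual and is skipped by MAPE.
actual = [100.0, 0.0, 200.0]
pred = [110.0, 5.0, 180.0]
print(mape(actual, pred), smape(actual, pred), rmse(actual, pred))
```

SMAPE and RMSE still count the zero-actual month, so comparing them with MAPE shows how much of the error sits in months MAPE cannot score.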

CLI reference

Run:

python -m main csv <path/to.csv> --year 2025 --id SeriesA --save-dir forecasts

Common flags:

  • --models override model list.
  • --aggregator and --aggregator-mode (FIXED/AUTO_CC) correspond to the config fields of the same name; with the csv subcommand, use --aggregator to choose the functional for FIXED mode.
  • --no-excel disable Excel workbook output.
  • --full-year-output write 12 months (actuals + forecasts) to artifacts.
  • --full-year-model-fills {actual,blank,model} how to fill per‑model columns in full‑year outputs for months with actuals:
    • actual: per‑model = actuals.
    • blank: leave blank/NaN to preserve model outputs for forecast months only.
    • model: accepted for forward compatibility; currently behaves like blank. The CLI prints a one‑time notice when used.
  • --ci-mode {additive,multiplicative} choose CI residual semantics:
    • additive: absolute residuals; good if residual scale is stable in level units.
    • multiplicative: relative shocks; bands scale with level and collapse to 0 when point=0 (typical for non‑negative business series).
  • --save-dir base directory for artifacts (use '' to disable saving).

Note: --calibrate-from is reserved for future use (pre‑seeding residual pools); not active in this release.


Caching and performance

  • ARIMA/Holt‑Winters parameter selection caches (bounded LRU):
    • Class methods: set_cache_maxsize(maxsize: int), clear_cache().
    • Config keys: arima_cache_maxsize, hw_cache_maxsize.
  • NeuralProphet fitted model cache (bounded LRU):
    • Class methods: set_cache_maxsize(maxsize: int), clear_cache().
    • Config key: np_cache_maxsize.
  • Stackers (Ridge/RandomTrees) log imputation stats at DEBUG and escalate to INFO when more than 20% of cells or 5% of rows are imputed, or when any rows still lack signal after imputation. These stats are also saved in meta/state for audit.
  • Residual bootstrap is vectorized; increase mc_simulations for smoother quantiles at higher runtime cost.
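A bounded LRU cache of the kind described above can be built on collections.OrderedDict. The class below is a generic sketch and not the project's cache: the class name, the `get`/`put` methods, and the example keys are hypothetical, and only `clear` and a resize method correspond loosely to the documented clear_cache() and set_cache_maxsize().

```python
from collections import OrderedDict
from typing import Any, Hashable


class BoundedLRUCache:
    """Keeps at most maxsize entries and evicts the least recently used one."""

    def __init__(self, maxsize: int = 256) -> None:
        if maxsize < 1:
            raise ValueError("maxsize must be >= 1")
        self._maxsize = maxsize
        self._data: "OrderedDict[Hashable, Any]" = OrderedDict()

    def get(self, key: Hashable, default: Any = None) -> Any:
        if key not in self._data:
            return default
        self._data.move_to_end(key)             # mark as most recently used
        return self._data[key]

    def put(self, key: Hashable, value: Any) -> None:
        self._data[key] = value
        self._data.move_to_end(key)
        self._evict()

    def set_maxsize(self, maxsize: int) -> None:
        if maxsize < 1:
            raise ValueError("maxsize must be >= 1")
        self._maxsize = maxsize
        self._evict()                            # shrink immediately if needed

    def clear(self) -> None:
        self._data.clear()

    def __len__(self) -> int:
        return len(self._data)

    def __contains__(self, key: Hashable) -> bool:
        return key in self._data

    def _evict(self) -> None:
        while len(self._data) > self._maxsize:
            self._data.popitem(last=False)       # drop the oldest entry


# Hypothetical keys: (series fingerprint, candidate order).
cache = BoundedLRUCache(maxsize=2)
cache.put(("series-a", (1, 0, 1)), "params-1")
cache.put(("series-b", (2, 1, 0)), "params-2")
cache.get(("series-a", (1, 0, 1)))               # touch series-a
cache.put(("series-c", (0, 1, 1)), "params-3")   # evicts series-b
print(len(cache), ("series-b", (2, 1, 0)) in cache)
```

Bounding the cache keeps memory flat in long batch runs, at the cost of refitting a parameter search for series that fall out of the window.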

Troubleshooting

  • KeyError: Unknown location: X — The location isn’t in your location_data dict; check spelling/case.
  • ValueError: NO_ACTUAL_DATA — Location exists but has no "Actual" data; provide at least one year.
  • Zeros in current forecast year — By default, zeros in forecast_year actuals are treated as missing (adapter flag treat_zero_as_missing_in_forecast_year=True).
  • Prophet / NeuralProphet warnings — Optional; if install is hard on your platform, keep them disabled.
  • ARIMA slow fit — Reduce arima_p/d/q ranges or rely on the adaptive downsizing for short histories.
  • Bands look too tight (multiplicative mode, small pools) — Consider quarter pooling, increasing simulations, or using additive mode until more residuals accumulate.
  • Imputation messages at INFO — Many imputed cells/rows suggest some model columns were missing; investigate upstream model predictions. The saved meta/state include a summary of imputation diagnostics.
  • Excel image missing — If openpyxl cannot embed the PNG, the workbook still writes numeric content.
  • Progress bars in CI logs — Set show_progress=False.

Extending the system

Add a new model:

from __future__ import annotations
import numpy as np, pandas as pd
from app.models.base import ForecastModel
from app.models.registry import register_model

@register_model(style=("dashdot","C7"))
class MyCoolModel(ForecastModel):
    def fit(self, full_df: pd.DataFrame, seasonal_pivot: pd.DataFrame | None = None):
        self._last_date = pd.to_datetime(full_df["ds"]).max()
        # train...

    def predict(self, periods: int) -> pd.DataFrame:
        idx = pd.date_range(self._last_date + pd.offsets.MonthBegin(1), periods=periods, freq="MS")
        yhat = np.zeros(len(idx), dtype=float)
        return pd.DataFrame({"forecast": yhat, "ci_lower": yhat, "ci_upper": yhat}, index=idx).rename_axis("ds")

Registering via @register_model makes it discoverable. Then add "MyCoolModel" to ForecastConfig.models.

Add a custom aggregator:

  • Implement AggregationFunctional (see app/core/aggregation.py) and wire it through AggregatorFactory for AUTO_CC or set cfg.aggregator for FIXED.

Ridge interpretability:

  • Use RidgeStackingFunctional.normalized_weights("abs"|"softmax") to present normalized, non‑negative weights for audit; predictions still use true coefficients.
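The two normalizations can be illustrated on a plain dict of coefficients. This is a sketch of what "abs" and "softmax" views commonly mean, not the body of RidgeStackingFunctional.normalized_weights; the standalone function, the uniform fallback for all-zero coefficients, and the sample coefficients are assumptions.

```python
import math
from typing import Dict


def normalized_weights(coefs: Dict[str, float], mode: str = "abs") -> Dict[str, float]:
    """Map raw stacking coefficients to non-negative weights that sum to 1."""
    if not coefs:
        return {}
    if mode == "abs":
        raw = {name: abs(value) for name, value in coefs.items()}
    elif mode == "softmax":
        top = max(coefs.values())                # subtract the max for numerical stability
        raw = {name: math.exp(value - top) for name, value in coefs.items()}
    else:
        raise ValueError(f"unknown mode: {mode}")
    total = sum(raw.values())
    if total == 0:                               # all coefficients are zero
        return {name: 1.0 / len(coefs) for name in coefs}
    return {name: value / total for name, value in raw.items()}


# Made-up coefficients, including a negative one.
coefs = {"ARIMA": 2.0, "HoltWinters": -1.0, "SeasonalNaive": 1.0}
abs_view = normalized_weights(coefs, "abs")
softmax_view = normalized_weights(coefs, "softmax")
print(abs_view)
print(softmax_view)
```

The "abs" view reports magnitude of influence and hides sign, while "softmax" preserves ordering and pushes negative coefficients toward zero; neither view changes the predictions.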

Security note

Forecast state is stored as a pickle (state.pkl). Do not load pickles from untrusted sources. For portability, use the CSV/Parquet/JSON artifacts. The pipeline records key configuration (including residual_ci_error_mode) in meta.json for auditability across runs.


FAQ

  • Do I need budgets?
    • No. Budgets are optional; the adapter uses actuals for seasonality. Budgets can be integrated as features in custom models if needed.
  • Can I forecast partial years?
    • Yes. The pipeline forecasts only the missing months in forecast_year and preserves known actuals in outputs. Full‑year roll‑up merges actuals + forecasts for 12 months.
  • How reproducible are results?
    • Set random_state in ForecastConfig. Some external libs (Prophet/NeuralProphet) may still introduce slight nondeterminism.
  • How big should my history be?
    • At least 2–3 full years is recommended; with shorter histories the pipeline relies on month‑of‑year fallbacks to avoid NaNs.
  • Additive vs multiplicative CI — which to choose?
    • Additive uses absolute error pools; multiplicative uses relative shocks (r = (y − ŷ)/|y|). Multiplicative bands scale with level and collapse to 0 when the point is 0, which is often desirable for non‑negative business series.

License

MIT.

This project keeps the core open and clean. Place proprietary data or adapters outside the core if required by your organization’s policies.

About

Python 3.13.3 Time Series Forecaster
