Advanced Forecasting with Nixtla: Hierarchical Forecasts#

Welcome to this tutorial on advanced forecasting techniques using Nixtla’s tools. Nixtla provides state-of-the-art libraries for time series forecasting, including neural network-based models and hierarchical forecasting methods.

In this notebook, we will:

  • Understand cross‑sectional hierarchical structures and coherent forecasts.

  • Train base (unreconciled) forecasts with StatsForecast (AutoETS + Naive).

  • Reconcile forecasts using BottomUp, TopDown, and MiddleOut.

  • Evaluate performance by hierarchy level and compare against a benchmark.

  • (Advanced) Explore MinTrace / ERM.

By the end of this tutorial, you will understand these concepts and implement them using Python.

Hierarchical forecasting deals with time series that are organized in a hierarchy (e.g., country → region → store). It ensures coherence between aggregated and disaggregated forecasts.
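To make coherence concrete, here is a minimal, self-contained sketch (toy numbers, not the tourism data used below) showing how a summing matrix maps bottom-level forecasts onto every level of a small two-store hierarchy:

import numpy as np

# Toy hierarchy: total -> store_a, store_b
# S has one row per series (total, store_a, store_b) and one column per bottom-level series.
S = np.array([
    [1, 1],   # total   = store_a + store_b
    [1, 0],   # store_a
    [0, 1],   # store_b
])

# Bottom-level forecasts for a single future period
y_bottom = np.array([120.0, 80.0])

# Coherent forecasts for every level are simply S @ y_bottom
y_all = S @ y_bottom
print(y_all)  # [200. 120.  80.] -- the total equals the sum of its children

Reconciliation methods differ in how they map (possibly incoherent) base forecasts back onto this structure.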

import pandas as pd
import numpy as np

# Data
from datasetsforecast.hierarchical import HierarchicalData

# Base forecasts (unreconciled)
from statsforecast.core import StatsForecast
from statsforecast.models import AutoETS, AutoARIMA, Naive

# Reconciliation & evaluation
from hierarchicalforecast.core import HierarchicalReconciliation
from hierarchicalforecast.methods import BottomUp, TopDown, MiddleOut, ERM, MinTrace
from hierarchicalforecast.evaluation import evaluate

# Metric(s)
from utilsforecast.losses import mape, mse

Using an example adapted from https://nixtlaverse.nixtla.io/hierarchicalforecast/index.html.

Load the following data:

  • Y_df — long-format series (unique_id, ds, y)

  • S_df — summing/aggregation matrix encoding the hierarchy

  • tags — level name → list of unique_ids for that level

Y_df, S_df, tags = HierarchicalData.load('./data', 'TourismSmall')
Y_df['ds'] = pd.to_datetime(Y_df['ds'])
S_df = S_df.reset_index(names="unique_id")

Examine each in turn. First is Y_df, which contains 3 columns: the series label (unique_id), the date (ds), and the value (y).

Y_df

The tags show the levels of the hierarchy: Country -> Country/Purpose -> Country/Purpose/State -> Country/Purpose/State/CityNonCity. Each tag contains the list of labels that pertain to that level.

tags

S_df contains the summing matrix: every label shown in the tags appears here, along with how the series aggregate into one another.

S_df
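As a quick sanity check (a sketch that relies on the level names shown in tags above), we can confirm the historical data is already coherent: summing the bottom-level series for each date should reproduce the top-level series.

# Coherence check on the historical data: the bottom level should sum to the top level
bottom_ids = tags['Country/Purpose/State/CityNonCity']
top_id = tags['Country'][0]

bottom_sum = (
    Y_df[Y_df['unique_id'].isin(bottom_ids)]
    .groupby('ds')['y']
    .sum()
    .sort_index()
)
top_series = (
    Y_df[Y_df['unique_id'] == top_id]
    .set_index('ds')['y']
    .sort_index()
)

print(np.allclose(bottom_sum.values, top_series.values))  # expect True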

With these three pieces of information, a hierarchical forecast can be calculated. Let’s forecast the next 4 quarters of the data:

Y_test_df  = Y_df.groupby('unique_id').tail(4)
Y_train_df = Y_df.drop(Y_test_df.index)

We can fit any forecasting models to the data, but since we’ve looked at ETS and baseline models previously, let’s start with those:

fcst = StatsForecast(
    models=[AutoETS(season_length=4), Naive()],
    freq='QE',
    n_jobs=-1,
)
Y_hat_df = fcst.forecast(df=Y_train_df, h=4, fitted=True)
Y_hat_df
fcst.plot(Y_df, Y_hat_df)

Now we can reconcile the forecasts. Reconciliation essentially means adjusting the forecasts so that the totals add up across the hierarchy. There are a few methods to do this (a sketch applying them follows the list):

  • Bottom Up: Takes the lowest level forecasts and aggregates them, adjusting the top levels as it goes. Use this reconciler if the bottom level data is rich and accurate.

  • Top Down: Does the opposite: it takes the top level forecast and splits it down the hierarchy using historical proportions. Use this when the bottom level data is sparse.

  • Middle Out: Pick a level to start at; levels above it are built with a bottom-up approach and levels below it are split with a top-down approach. Use this when the best data is at one of the middle layers.

Each of these will make the forecast coherent (i.e., all predictions across all levels of a hierarchy satisfy the aggregation constraints, so that the sum of lower-level forecasts exactly equals the forecast at each higher level).
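The base forecasts can be reconciled with HierarchicalReconciliation. The sketch below applies all three methods; the choice of 'Country/Purpose/State' as the middle level and forecast_proportions as the top-down split are illustrative assumptions you can change:

reconcilers = [
    BottomUp(),
    TopDown(method='forecast_proportions'),
    MiddleOut(middle_level='Country/Purpose/State',
              top_down_method='forecast_proportions'),
]
hrec = HierarchicalReconciliation(reconcilers=reconcilers)

Y_rec_df = hrec.reconcile(
    Y_hat_df=Y_hat_df,
    Y_df=fcst.forecast_fitted_values(),
    S_df=S_df,
    tags=tags
)
Y_rec_df

Y_rec_df keeps the base AutoETS and Naive columns and adds one column per model/reconciler pair (e.g. AutoETS/BottomUp), so it can be merged with the test set and evaluated below.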

Now we can evaluate the forecasts:

df = Y_rec_df.merge(Y_test_df, on=['unique_id','ds'], how='left')
results = evaluate(df=df, metrics=[mape], tags=tags)
results.sort_values(['metric', 'level'])

Questions#

What does coherence mean in hierarchical forecasting, and why can unreconciled base forecasts violate it?

Method trade‑offs: When might TopDown outperform BottomUp?

Change h=8 (and hold out the last 8 quarters instead of 4). Which reconcilers degrade least as the horizon increases? Why?

Replace AutoETS with AutoARIMA (or add it to the list of models) and re‑evaluate. Does reconciliation change which method ranks best at each level?

Next let’s examine more automatic, data-driven reconcilers that learn their weights from the forecasts rather than fixing a single direction through the hierarchy.

  • MinTrace: minimises the total variance (the trace of the error covariance matrix) of the reconciled forecasts by optimally weighting the base forecasts from every level.

  • ERM (Empirical Risk Minimization): learns the reconciliation weights by minimising forecast error directly; the regularised variants add an L1 penalty to keep the weights simple. lambda_reg controls the strength of that penalty, so it needs careful tuning.

Unlike the approaches mentioned above, these methods use the forecasts from every level of the hierarchy and combine them to minimise the overall error.
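As a sketch of the idea behind MinTrace (standard MinT notation, not code from the library): the reconciled forecasts are a projection of the base forecasts onto the coherent subspace,

$$\tilde{\mathbf{y}}_h = S\,(S^\top W_h^{-1} S)^{-1} S^\top W_h^{-1}\,\hat{\mathbf{y}}_h$$

where $S$ is the summing matrix, $\hat{\mathbf{y}}_h$ are the base forecasts at horizon $h$, and $W_h$ is the covariance of the base forecast errors. The mint_shrink option estimates $W_h$ with a shrinkage estimator of the in-sample residual covariance.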

mint        = MinTrace(method='mint_shrink')        # MinT with a shrinkage estimate of the error covariance
erm_closed  = ERM(method='closed')                  # ERM, closed-form solution
erm_reg     = ERM(method='reg', lambda_reg=0.1)     # ERM with L1 regularisation
erm_reg_bu  = ERM(method='reg_bu', lambda_reg=0.1)  # ERM with L1 regularisation, bottom-up variant

hrec = HierarchicalReconciliation(reconcilers=[mint, erm_closed, erm_reg, erm_reg_bu])

Y_rec_df = hrec.reconcile(
    Y_hat_df=Y_hat_df,
    Y_df=fcst.forecast_fitted_values(),  # in-sample fitted values (fitted=True above), needed by MinTrace/ERM
    S_df=S_df,
    tags=tags
)

df_eval = Y_rec_df.merge(Y_test_df, on=['unique_id','ds'], how='left')
results = evaluate(df=df_eval, metrics=[mse], tags=tags, benchmark='Naive')
results.sort_values(['metric','level'])

Questions#

Experiment with the lambda_reg values for erm_reg and erm_reg_bu, and see if you can lower the MSE.
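As a starting point for that experiment (a sketch; the candidate lambda_reg values below are arbitrary), you could loop over a small grid, reconcile with each setting, and compare the resulting errors:

# Hypothetical grid of regularisation strengths to try
for lam in [0.01, 0.1, 0.5, 1.0]:
    hrec_lam = HierarchicalReconciliation(
        reconcilers=[ERM(method='reg', lambda_reg=lam),
                     ERM(method='reg_bu', lambda_reg=lam)]
    )
    Y_lam_df = hrec_lam.reconcile(
        Y_hat_df=Y_hat_df,
        Y_df=fcst.forecast_fitted_values(),
        S_df=S_df,
        tags=tags
    )
    df_lam = Y_lam_df.merge(Y_test_df, on=['unique_id', 'ds'], how='left')
    print(f'lambda_reg={lam}')
    print(evaluate(df=df_lam, metrics=[mse], tags=tags))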