Overview
Autoformer is a Transformer-based model designed for long-term time series forecasting. It addresses the limitations of prior Transformer models, which struggle to capture intricate temporal patterns and to scale efficiently to long sequences. Autoformer introduces a decomposition architecture and an Auto-Correlation mechanism that progressively separate trend and seasonal components during forecasting. This design improves its ability to capture complex temporal patterns and yields state-of-the-art accuracy, particularly on long horizons.
Architecture & Components
Autoformer's architecture builds upon the Transformer framework with two primary innovations:
- Deep Decomposition Architecture: Autoformer renovates the Transformer into a deep decomposition architecture. Unlike models that perform decomposition as a separate preprocessing step, Autoformer adaptively separates the series into trend and seasonal components *within* the model, during both training and inference. The trend component ($x_{trend}$) is typically extracted with a moving average (MA) filter of kernel size $k$, and the residual is treated as the seasonal component ($x_{seasonal} = x - x_{trend}$). This inductive bias helps the model learn each component more effectively; a minimal sketch of the decomposition block follows this list.
- Series-wise Auto-Correlation Mechanism: Inspired by stochastic process theory, Autoformer replaces conventional self-attention with an Auto-Correlation mechanism. It discovers period-based dependencies by comparing similar sub-series (e.g., aligning Mondays with other Mondays) rather than comparing every pair of time steps. This reduces computational complexity from quadratic to log-linear ($O(L \log L)$) and aligns better with the periodic nature of many real-world series. Because the series-wise connection inherently preserves sequential information, Autoformer does not need the explicit positional embeddings used by other Transformers. A conceptual sketch of this mechanism appears after the architecture summary below.
- Encoder-Decoder Structure: Autoformer retains the standard Transformer encoder-decoder structure. The encoder processes the input sequence, and the decoder generates the forecast, leveraging the decomposed components and the Auto-Correlation mechanism.
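Because the decomposition block is central to the architecture, here is a minimal sketch of the idea in PyTorch. It is illustrative rather than the official THUML code, and the class and parameter names are placeholders: a moving average of kernel size $k$ extracts the trend, and the residual is kept as the seasonal part.

import torch
import torch.nn as nn

class SeriesDecomposition(nn.Module):
    """Split a series into seasonal and trend parts via a moving average (illustrative sketch)."""
    def __init__(self, kernel_size: int):
        super().__init__()
        self.kernel_size = kernel_size
        # stride-1 average pooling implements the moving-average (MA) filter
        self.moving_avg = nn.AvgPool1d(kernel_size=kernel_size, stride=1)

    def forward(self, x: torch.Tensor):
        # x: (batch, length, channels)
        # pad both ends by repeating the boundary values so the trend keeps the input length
        front = x[:, :1, :].repeat(1, (self.kernel_size - 1) // 2, 1)
        end = x[:, -1:, :].repeat(1, self.kernel_size // 2, 1)
        padded = torch.cat([front, x, end], dim=1)
        trend = self.moving_avg(padded.permute(0, 2, 1)).permute(0, 2, 1)
        seasonal = x - trend
        return seasonal, trend

# Example: decompose a batch of 8 series of length 96 with 7 variables
seasonal, trend = SeriesDecomposition(kernel_size=25)(torch.randn(8, 96, 7))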
Autoformer is a deterministic model, providing a single point forecast rather than a distribution of possible future values.
Conceptual diagram of Autoformer's architecture with decomposition and Auto-Correlation.
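To make the Auto-Correlation idea concrete, the sketch below is a simplification (not the paper's exact formulation or the official code): it estimates autocorrelation in the frequency domain via the FFT, selects the top-k delays as candidate periods, and aggregates time-delayed copies of the series weighted by their correlation scores.

import torch

def auto_correlation_aggregate(values: torch.Tensor, k: int = 3) -> torch.Tensor:
    """Simplified period-based aggregation inspired by Auto-Correlation (illustrative sketch)."""
    # values: (batch, length, channels)
    x = values.permute(0, 2, 1)                                   # (batch, channels, length)
    # Wiener-Khinchin: autocorrelation is the inverse FFT of the power spectrum
    freq = torch.fft.rfft(x, dim=-1)
    acf = torch.fft.irfft(freq * torch.conj(freq), n=x.size(-1), dim=-1)
    # average over batch and channels to choose shared candidate delays (a simplification)
    mean_acf = acf.mean(dim=(0, 1))                               # (length,)
    scores, delays = torch.topk(mean_acf, k)
    weights = torch.softmax(scores, dim=-1)
    # aggregate the top-k time-delayed copies of the series
    out = torch.zeros_like(x)
    for w, delay in zip(weights, delays):
        out = out + w * torch.roll(x, shifts=-int(delay), dims=-1)
    return out.permute(0, 2, 1)

# Example: aggregate a batch of 8 series of length 96 with 7 variables
aggregated = auto_correlation_aggregate(torch.randn(8, 96, 7), k=3)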
When to Use Autoformer
Autoformer is an excellent choice for:
- Long-horizon time series forecasting problems where trends and seasonality unfold over extended periods.
- Structured and periodic data, where its Auto-Correlation mechanism can effectively identify recurring patterns.
- Scenarios requiring robust performance in both clean and noisy environments, as its decomposition mechanism helps filter high-frequency noise.
- When interpretability of trend and seasonal components is desired, as it explicitly models them.
- Practical applications such as energy, traffic, economics, weather, and disease forecasting, where it has been reported as state-of-the-art.
Pros and Cons
Pros
- State-of-the-Art for Long-Term Forecasting: The original paper reports a 38% relative improvement over prior methods across six long-horizon benchmarks.
- Superior Noise Resilience: Its built-in decomposition mechanism effectively filters high-frequency noise, maintaining predictive stability even in perturbed environments.
- Interpretable Decomposition: Explicitly separates and models trend and seasonal components, offering insights into the forecast.
- Efficient: The Auto-Correlation mechanism reduces computational complexity to log-linear ($O(L \log L)$), making it efficient for long sequences.
- No Positional Embedding Needed: Inherently preserves sequential information, simplifying the architecture.
Cons
- Deterministic Output: Provides only a single point forecast, limiting its ability to quantify forecast uncertainty (no probability distribution).
- Requires PyTorch: Official and most common implementations are in PyTorch, limiting direct TensorFlow usage without adaptations.
- Data Requirements: Like most deep learning models, it generally requires sufficient historical data for optimal performance.
Example Implementation
Autoformer is primarily implemented in PyTorch, with the official code provided by THUML. HuggingFace Transformers and NeuralForecast also offer implementations. Below is a conceptual example following the THUML repository's approach, which typically involves running bash scripts for specific datasets; a short NeuralForecast sketch follows at the end of this section.
PyTorch Example (using THUML/Autoformer)
# 1. Clone the official Autoformer repository
# git clone https://github.com/thuml/Autoformer.git
# cd Autoformer
# 2. Install Python 3.6 and PyTorch 1.9.0 (or compatible versions)
# pip install -r requirements.txt # (assuming a requirements.txt exists or install manually)
# 3. Download datasets
# The datasets are typically provided via a Google Drive link in the repository's README.
# Download them and place them in a './dataset' folder in the root of the cloned repository.
# (See the repository README for the exact download instructions.)
# 4. Run a training script for a specific dataset (e.g., ETTm1)
# These scripts are located in the './scripts' directory.
echo "Running Autoformer training script for ETTm1 dataset..."
bash ./scripts/ETT_script/Autoformer_ETTm1.sh
# This script will typically:
# - Set up model parameters (e.g., sequence length, prediction length, number of encoder/decoder layers)
# - Load the ETTm1 dataset
# - Train the Autoformer model
# - Evaluate its performance (MSE, MAE) and save results to './result.txt' or similar.
# Example of what the script might contain (simplified):
# python -u run.py \
# --is_training 1 \
# --model_id ETTm1_96_24 \
# --model Autoformer \
# --data ETTm1 \
# --features M \
# --seq_len 96 \
# --label_len 48 \
# --pred_len 24 \
# --e_layers 2 \
# --d_layers 1 \
# --factor 3 \
# --enc_in 7 \
# --dec_in 7 \
# --c_out 7 \
# --des Exp \
# --itr 1 \
# --train_epochs 10 \
# --batch_size 32 \
# --learning_rate 0.0001 \
# --root_path ./dataset/ETT-small/ \
# --data_path ETTm1.csv \
# --checkpoints ./checkpoints/
echo "Autoformer training script executed. Check './result.txt' or specified output directory for results."
# For inference, you would typically load a trained model checkpoint and use its predict method.
# The repository's 'predict.ipynb' (in Chinese) provides a workflow example.
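NeuralForecast Example
For a higher-level Python workflow, the NeuralForecast library also provides an Autoformer model. The sketch below is minimal and illustrative: it assumes the library's bundled AirPassengersDF sample dataset and its standard long-format columns (unique_id, ds, y), and the hyperparameter values are placeholders rather than recommended settings; argument names may vary across library versions.

from neuralforecast import NeuralForecast
from neuralforecast.models import Autoformer
from neuralforecast.utils import AirPassengersDF  # small monthly sample dataset shipped with the library

# Forecast 12 steps ahead from 24-step input windows (placeholder values)
model = Autoformer(h=12, input_size=24, max_steps=100)

nf = NeuralForecast(models=[model], freq='M')
nf.fit(df=AirPassengersDF)       # expects long-format columns: unique_id, ds, y
forecasts = nf.predict()         # returns a DataFrame with the Autoformer predictions
print(forecasts.head())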