DREAM Benchmarks

Comprehensive performance comparison of DREAM vs LSTM vs Transformer.

Overview

We evaluated DREAM on three audio tasks against two standard baselines:

  • LSTM (2-layer, 256 hidden)
  • Transformer (4-layer, d_model=128)

Test Suite

Test 1: Basic ASR Reconstruction

Task: Reconstruct mel spectrograms from 9 audio files.

Setup:

  • Input: 80 mel bins, 1014 frames
  • Training: 100 epochs
  • Metric: Reconstruction loss (MSE)
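The loss metric and the "Improvement" column in the results can be computed as follows (a minimal sketch; the function names are illustrative, not taken from the benchmark code):

```python
import numpy as np

def mse(pred, target):
    """Mean squared error over all mel bins and frames."""
    return float(np.mean((pred - target) ** 2))

def improvement(initial_loss, final_loss):
    """Relative loss reduction in percent, as reported in the results table."""
    return 100.0 * (initial_loss - final_loss) / initial_loss

# Example using DREAM's reported Test 1 losses:
print(round(improvement(0.9298, 0.0010), 1))  # → 99.9
```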

Results:

Model        Parameters  Initial Loss  Final Loss  Improvement  Time
DREAM        82K         0.9298        0.0010      99.9%        502s
LSTM         893K        0.7889        0.0478      93.9%        9s
Transformer  551K        0.9416        0.0696      92.6%        11s

Training Curves:

Epoch 20:  DREAM=0.024, LSTM=0.210, Transformer=0.190
Epoch 40:  DREAM=0.006, LSTM=0.131, Transformer=0.133
Epoch 60:  DREAM=0.003, LSTM=0.089, Transformer=0.104
Epoch 80:  DREAM=0.002, LSTM=0.063, Transformer=0.084
Epoch 100: DREAM=0.001, LSTM=0.048, Transformer=0.070

Conclusion: DREAM achieves the lowest final loss (99.9% improvement) but requires substantially more training time due to its online adaptation.


Test 2: Speaker Adaptation

Task: Adapt to speaker change mid-sequence.

Setup:

  • Concatenate two different speakers
  • Measure steps to recover baseline loss
  • Target: <50 steps (Spec 7.5)
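The exact recovery criterion is not spelled out here; a plausible sketch of the "adapt steps" measurement, assuming recovery means returning to within a small tolerance of the pre-switch baseline loss (the 1.05 tolerance is an assumption):

```python
def adaptation_steps(losses, baseline, tolerance=1.05):
    """Count post-switch steps until loss returns to within
    `tolerance` x the pre-switch baseline. `losses` starts at the
    switch point; returns len(losses) if the model never recovers."""
    for step, loss in enumerate(losses):
        if loss <= baseline * tolerance:
            return step
    return len(losses)

# If the first post-switch loss is already near baseline, the count is 0,
# which is the "instant adaptation" outcome reported below.
print(adaptation_steps([1.20, 1.50, 1.25], baseline=1.2078))  # → 0
```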

Results:

Model        Baseline Loss  Max Post-Switch  Adapt Steps  Surprise Spike
DREAM        1.2078         1.9657           0            0.119
LSTM         1.0435         1.5807           0            N/A
Transformer  1.1963         1.6963           0            N/A

Conclusion: All models adapt instantly (0 steps), but only DREAM detects the speaker change via a surprise spike.


Test 3: Noise Robustness

Task: Reconstruction with additive white noise.

Setup:

  • SNR levels: 20dB, 10dB, 5dB, 0dB
  • Metric: Loss ratio (10dB / clean)
  • Target: <3× ratio
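Additive white noise at a target SNR can be generated by scaling Gaussian noise to the appropriate power (a self-contained sketch; the benchmark's own noise-injection code may differ):

```python
import numpy as np

def add_white_noise(signal, snr_db, rng=None):
    """Add white Gaussian noise at a target SNR in dB.
    Noise power = signal power / 10^(SNR/10)."""
    if rng is None:
        rng = np.random.default_rng(0)
    sig_power = np.mean(signal ** 2)
    noise_power = sig_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise

sig = np.sin(np.linspace(0, 8 * np.pi, 16000))
noisy = add_white_noise(sig, snr_db=10)
```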

Results:

Model        Clean (20dB)  10dB Loss  Ratio  Surprise Response
DREAM        1.2308        1.3390     1.09×  ❌ No
LSTM         1.0163        1.1052     1.09×  N/A
Transformer  1.2867        1.3757     1.07×  N/A

Surprise Response by SNR (DREAM):

SNR 20dB: Max Surprise = 0.973
SNR 10dB: Max Surprise = 0.987
SNR  5dB: Max Surprise = 0.995
SNR  0dB: Max Surprise = 1.000

Conclusion: DREAM remains stable under noise (1.09× loss ratio); its surprise signal increases with noise level but saturates near 1.0.


Summary

Overall Performance

Test              DREAM  LSTM   Transformer  Target
ASR Improvement   99.9%  93.9%  92.6%        >90%
Adaptation Steps  0      0      0            <50
Noise Ratio       1.09×  1.09×  1.07×        <3×

Key Findings

  1. DREAM achieves the best reconstruction quality (99.9% vs 92.6-93.9%)
  2. Instant adaptation to speaker changes (0 steps)
  3. Stable under noise (1.09× ratio at 10dB SNR)
  4. Fewer parameters (82K vs 551-893K)

Trade-offs

Aspect             DREAM             Baselines
Quality            ✅ Best           Good
Training Speed     ❌ Slower (502s)  ✅ Fast (9-11s)
Parameters         ✅ 82K            ❌ 551-893K
Online Adaptation  ✅ Yes            ❌ No

Running Benchmarks

Quick Start

# Run all benchmarks (15-30 minutes)
uv run python tests/benchmarks/run_all.py

# Individual tests
uv run python tests/benchmarks/test_01_basic_asr.py
uv run python tests/benchmarks/test_02_speaker_adaptation.py
uv run python tests/benchmarks/test_03_noise_robustness.py

# Generate visualizations
uv run python tests/benchmarks/visualize.py

Output Files

After running:

tests/benchmarks/results/
├── results_basic_asr.json
├── results_speaker_adaptation.json
├── results_noise_robustness.json
├── figures/
│   ├── fig1_training_curves.pdf
│   ├── fig2_speaker_adaptation.pdf
│   ├── fig3_noise_robustness.pdf
│   └── benchmark_table.tex
└── BENCHMARK_REPORT.md

Hardware Requirements

Minimum:

  • 8GB RAM
  • CPU (slower)

Recommended:

  • GPU with 4GB+ VRAM
  • 16GB RAM
  • SSD

Estimated Runtime:

  • Test 1 (ASR): ~5-10 min per model
  • Test 2 (Adaptation): ~1 min per model
  • Test 3 (Noise): ~2 min per model
  • Total: ~15-30 minutes

Reproducibility

Environment

# Python
Python 3.10+

# Dependencies
torch>=2.0.0
numpy>=1.24.0
librosa>=0.10.0  # for audio tests

Dataset

10 audio files from an LJ Speech-like corpus:

  • Sample rate: 16kHz
  • Features: Mel spectrogram (80 bins)
  • Duration: ~3-10 seconds each

Hyperparameters

DREAMConfig(
    input_dim=80,
    hidden_dim=256,
    rank=16,
    forgetting_rate=0.005,
    base_plasticity=0.5,
    base_threshold=0.3,
    ltc_tau_sys=5.0,
    ltc_surprise_scale=5.0,
)
