Structural Attention-based Recurrent VAE (SABeR-VAE) for Highway Vehicle Anomaly Detection

This repository contains the code for our paper "Structural Attention-based Recurrent Variational Autoencoder for Highway Vehicle Anomaly Detection," published at AAMAS 2023. For more details, please refer to the project website and arXiv preprint.
Abstract
In autonomous driving, detection of abnormal driving behaviors is essential to ensure the safety of vehicle controllers. Prior works in vehicle anomaly detection have shown that modeling interactions between agents improves detection accuracy, but certain abnormal behaviors where structured road information is paramount are poorly identified, such as wrong-way and off-road driving. We propose a novel unsupervised framework for highway anomaly detection named Structural Attention-based Recurrent VAE (SABeR-VAE), which explicitly uses the structure of the environment to aid anomaly identification. Specifically, we use a vehicle self-attention module to learn the relations among vehicles on a road, and a separate lane-vehicle attention module to model the importance of permissible lanes to aid in trajectory prediction. Conditioned on the attention modules' outputs, a recurrent encoder-decoder architecture with a stochastic Koopman operator-propagated latent space predicts the next states of vehicles. Our model is trained end-to-end to minimize prediction loss on normal vehicle behaviors, and is deployed to detect anomalies in (ab)normal scenarios. By combining the heterogeneous vehicle and lane information, SABeR-VAE and its deterministic variant, SABeR-AE, improve abnormal AUPR by 18% and 25% respectively on the simulated MAAD highway dataset. Furthermore, we show that the learned Koopman operator in SABeR-VAE enforces interpretable structure in the variational latent space. The results of our method indeed show that modeling environmental factors is essential to detecting a diverse set of anomalies in deployment.
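As a reading aid, here is a minimal sketch of the central latent-propagation idea, assuming a PyTorch implementation; the module, parameter names, and shapes below are illustrative assumptions and not the repository's actual code.

```python
import torch
import torch.nn as nn

class KoopmanLatentStep(nn.Module):
    """Toy sketch of a stochastic linear (Koopman-style) latent transition.
    Names and shapes are illustrative assumptions, not the paper's exact model."""
    def __init__(self, latent_dim: int = 2):
        super().__init__()
        # Learned linear operator propagating the latent mean one step forward
        self.K_mu = nn.Linear(latent_dim, latent_dim, bias=False)
        # Separate learned linear map for the next step's log-variance
        self.K_logvar = nn.Linear(latent_dim, latent_dim, bias=False)

    def forward(self, mu: torch.Tensor, logvar: torch.Tensor):
        mu_next = self.K_mu(mu)              # linear propagation of the mean
        logvar_next = self.K_logvar(logvar)  # propagated uncertainty
        # Reparameterization trick: sample the next latent state
        eps = torch.randn_like(mu_next)
        z_next = mu_next + eps * (0.5 * logvar_next).exp()
        return z_next, mu_next, logvar_next
```

In the full model, the recurrent encoder produces the latent mean and variance conditioned on the vehicle-vehicle and lane-vehicle attention outputs, and the decoder predicts the next vehicle states from the propagated latent.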
Getting Started
Installation
- Use a Linux system with a CUDA 11.7 GPU (the project also supports CPU-only execution)
- Install Python 3.8
- Install the required packages and their pinned versions from requirements.txt
Downloading and Processing MAAD Dataset
- Follow the instructions from the againerju/maad_highway GitHub repository to request the access link to the MAAD dataset
- Paste the access link into download_dataset.sh
- Run the following commands to download the MAAD dataset:
chmod +x download_dataset.sh
./download_dataset.sh
- Run convert_maad_data.py with arguments to preprocess the MAAD dataset
- The following command creates an unnormalized train dataset from 2 vehicle trajectories:
python convert_maad_data.py --split train --run_name maad
- The following command creates an unnormalized test dataset from 2 vehicle trajectories:
python convert_maad_data.py --split test --run_name maad
- You can create your own validation split by taking a mixture of normal and abnormal txt files from the original test sub-folder, placing them into a data/maad/original/val sub-folder you create, and running: python convert_maad_data.py --split val --run_name maad
- The pre-processed dataset split will be saved to data/maad/maad/<split>/0/0.npy as a numpy file (see the loading sketch after this list)
- Appending --norm to the above commands will normalize absolute position coordinates to be within 0 and 1
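As a quick sanity check, the saved split can be inspected with numpy. The array's exact shape depends on the preprocessing arguments, so the output shown below is not guaranteed:

```python
import numpy as np

# Path follows the pattern documented above; "train" is one example split.
data = np.load("data/maad/maad/train/0/0.npy")
print(data.shape, data.dtype)  # shape depends on the preprocessing arguments
```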
Running Scripts
Train a SABeR-VAE Model
Run train_vae.py to train a SABeR-VAE model on a dataset.
The following command trains a model on the MAAD dataset train split with GPU:
python train_vae.py --env maad --run_name <model run name> --data_run_name maad --gpu --vae_latent <VAE latent size> --batch_size <batch> --lr <learning rate> --pre_kl_beta <pre-koopman kl regularization weight> --post_kl_beta <post-koopman kl regularization weight> --vva --vva_heads <num VVA heads> --vva_drop <VVA dropout> --vva_out <VVA size> --lva --lva_heads <num LVA heads> --lva_drop <LVA dropout> --lva_out <LVA size> --enc_gru_hid <GRU size> --enc_gru_drop <GRU dropout>
Remove the --gpu flag to train on CPU.
Every time train_vae.py is run with the same model run name, a new sub-folder with an incremented model iteration number will be created at the path pretrained/<model run name>/maad/<model iteration number>/.
- checkpoints/ - Trained model checkpoints will be saved to this sub-folder within <model iteration number>/
- progress.csv - Holds loss logging details (see the plotting sketch after this list)
- model_args.pickle - Stores the training arguments in pickle form
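For example, training progress can be monitored by plotting progress.csv. The run name, iteration number, and column layout below are assumptions; check the CSV header for the fields actually logged:

```python
import pandas as pd
import matplotlib.pyplot as plt

# Adjust the run name and iteration number to match your own training run.
df = pd.read_csv("pretrained/saber_vae/maad/0/progress.csv")
print(df.columns.tolist())  # inspect which losses are actually logged

# Plot the first logged metric against the first column
# (assumed here to be a step/epoch index).
df.plot(x=df.columns[0], y=df.columns[1])
plt.savefig("loss_curve.png")
```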
Get Quantitative Results of SABeR-VAE Model Losses
Run evaluate_vae.py to evaluate a set of models on the same dataset split and save the losses of normal and anomalous points for each model into a CSV file. This is helpful for comparing the gap between normal and abnormal errors across models on a validation split before testing on the test split.
The following command evaluates a set of models on the MAAD dataset split with GPU:
python evaluate_vae.py --eval_name <evaluation name> --env maad --split <val/test> --data_run_name maad --checkpoint <model checkpoint number> --gpu --run_name <model run name 1> <model run name 2> <model run name 3> ...
This script will evaluate the given model checkpoint number for every model iteration under the list of run names in the --run_name flag, and save a CSV file of loss results to eval_results/<evaluation name>.csv. Look for models with a larger gap between the errors of normal and abnormal points.
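A minimal sketch for ranking models by that gap, assuming hypothetical column names ("run_name", "normal_loss", "abnormal_loss"); inspect the CSV header for the actual fields:

```python
import pandas as pd

# Column names here are hypothetical; adapt them to the actual CSV header.
df = pd.read_csv("eval_results/my_eval.csv")
df["gap"] = df["abnormal_loss"] - df["normal_loss"]

# Models with a larger normal/abnormal error gap are better anomaly detectors.
print(df.sort_values("gap", ascending=False)[["run_name", "gap"]])
```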
Get Quantitative Results of AUROC, AUPR, and FPR@95%-TPR and Qualitative Results of the Latent Space and Reconstructions for a SABeR-VAE Model
Run test_vae.py to test the accuracy of a model on a dataset split with AUROC, AUPR, and FPR@95%-TPR metrics (a sketch of how such metrics can be computed from raw scores follows the output list below).
The following command tests a model checkpoint, gathers its metrics, and visualizes its post-Koopman latent space.
python test_vae.py --env maad --split <val/test> --data_run_name maad --gpu --run_name <model run name> --run_num <model iteration number> --checkpoint <model checkpoint number>
Output test results are stored at pretrained/<model run name>/maad/<model iteration number>/figs/maad/<split>/<checkpoint>.
- pred_anomaly_auroc.csv - Holds AUROC by anomaly type
- pred_results.csv - Holds overall dataset split metrics of AUROC, AUPR-Abnormal, AUPR-Normal, and FPR@95%-TPR
- pred_results.pickle - Saves some prediction data
- roc_curve_pred.png - ROC curve for the whole dataset split
- <recon/prop>_<normal/abnormal>_latent_space_pred.png - Latent space visualizations pre- and post-Koopman
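For reference, here is a sketch of how such metrics can be computed from per-point anomaly scores and ground-truth labels with scikit-learn; the score and label arrays are placeholders, not outputs of this repository:

```python
import numpy as np
from sklearn.metrics import average_precision_score, roc_auc_score, roc_curve

# Placeholder inputs: label 1 = abnormal, 0 = normal; higher score = more anomalous.
labels = np.array([0, 0, 1, 1, 0, 1])
scores = np.array([0.1, 0.3, 0.8, 0.7, 0.2, 0.9])

auroc = roc_auc_score(labels, scores)
aupr_abnormal = average_precision_score(labels, scores)
aupr_normal = average_precision_score(1 - labels, -scores)  # normal as positive class

# FPR at the first operating point reaching at least 95% TPR.
fpr, tpr, _ = roc_curve(labels, scores)
fpr_at_95tpr = fpr[np.searchsorted(tpr, 0.95)]
print(auroc, aupr_abnormal, aupr_normal, fpr_at_95tpr)
```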
Once you visualize what the latent space looks like, you can choose a set of points for trajectories you'd like to reconstruct. Add the latent_labels, latent_x, and latent_y arguments to the command to annotate the latent space with the chosen latent coordinates and plot their corresponding trajectories.
python test_vae.py --env maad --split <val/test> --data_run_name maad --gpu --run_name <model run name> --run_num <model iteration number> --checkpoint <model checkpoint number> --latent_labels <GT label 1 (0/1)> <GT label 2> ... --latent_x <x-coord 1> <x-coord 2> ... --latent_y <y-coord 1> <y-coord 2> ...
Providing the additional arguments will create a sub-folder of visualization diagrams at pretrained/<model run name>/maad/<model iteration number>/figs/maad/<split>/<checkpoint>/figs_<figure generation iteration>.
- annotated_latent.png - An annotated post-Koopman latent space
- <latent point number>/ - An incremental sub-folder is created for each trajectory to be plotted
  - trajectory_<latent point number>.png - A plot of ground truth and predicted trajectories
  - attention_<latent point number>.png - Visualizations of VVA and LVA weights
  - koopman_<latent point number>.png - Plots of the Koopman mean and variance matrices
  - loss_curve_<latent point number>.png - Loss curves for the trajectory window for all agents
  - latent_sequence_<latent point number>.png - Sequential propagation of the latent space for the window agents
NOTE: The test script plots a 2D latent space, assuming the original model's latent space is also 2D. If your latent space has more than two dimensions, you should plot a PCA or t-SNE projection of the original latent vectors, as sketched below.
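A minimal sketch of such a projection, assuming the latent vectors have already been collected into an array (the random array below is a placeholder):

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

latents = np.random.randn(500, 64)  # placeholder: your collected latent vectors

# Linear projection to 2D
pca_2d = PCA(n_components=2).fit_transform(latents)
# Nonlinear projection to 2D (slower; sensitive to perplexity)
tsne_2d = TSNE(n_components=2, perplexity=30).fit_transform(latents)

plt.scatter(pca_2d[:, 0], pca_2d[:, 1], s=5)
plt.savefig("latent_pca.png")
```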
Other Scripts
We provide more scripts to train and test other ablation models.
- train_ae.py - Train an unregularized autoencoder model on prediction loss (RAE-Pred, VV-RAE, and SABeR-AE in the paper)
- evaluate_ae.py - Serves the same purpose as evaluate_vae.py but for autoencoder models
- test_ae.py - Serves the same purpose as test_vae.py but for autoencoder models
- train_lstm_vae.py - Train an attention-based LSTM-VAE ablation model on prediction loss (replaces the Koopman propagation in SABeR-VAE with a recurrent decoder)
- test_lstm_vae.py - Serves the same purpose as test_vae.py but for LSTM-VAE ablation models
Commands to Train Models for Paper Results
The following commands were run to train models with the hyperparameters used in the paper.
- python train_ae.py --env maad --run_name rae_pred --data_run_name maad --gpu --vae_latent 2 --num_epochs 500 --batch_size 32 --seq_len 15 --lr 5e-05 --weight_decay 1e-06 --max_veh 2 --enc_gru_hid 64 --enc_gru_layers 1 --enc_gru_drop 0.0
- python train_ae.py --env maad --run_name vv_rae --data_run_name maad --gpu --vae_latent 2 --num_epochs 500 --batch_size 128 --seq_len 15 --lr 0.0005 --weight_decay 1e-06 --vva --max_veh 2 --veh_dist_mask 45 --vva_heads 8 --vva_drop 0.0 --vva_out 32 --enc_gru_hid 32 --enc_gru_layers 1 --enc_gru_drop 0.0
- python train_ae.py --env maad --run_name saber_ae --data_run_name maad --gpu --vae_latent 64 --num_epochs 500 --batch_size 64 --seq_len 15 --lr 5e-05 --weight_decay 1e-06 --vva --max_veh 2 --veh_dist_mask 45 --vva_heads 8 --vva_drop 0.0 --vva_out 64 --lva --lva_heads 8 --lva_drop 0.0 --lva_out 64 --enc_gru_hid 64 --enc_gru_layers 1 --enc_gru_drop 0.0
- python train_vae.py --env maad --run_name saber_vae --data_run_name maad --gpu --vae_latent 2 --num_epochs 500 --batch_size 32 --seq_len 15 --lr 5e-05 --weight_decay 1e-06 --pre_kl_beta 1e-06 --post_kl_beta 1e-06 --vva --max_veh 2 --veh_dist_mask 45 --vva_heads 8 --vva_drop 0.0 --vva_out 32 --lva --lva_heads 8 --lva_drop 0.0 --lva_out 32 --enc_gru_hid 32 --enc_gru_layers 1 --enc_gru_drop 0.0
- python train_lstm_vae.py --env maad --run_name att_lstm_vae --data_run_name maad --gpu --vae_latent 2 --num_epochs 500 --batch_size 32 --seq_len 15 --lr 5e-05 --weight_decay 1e-06 --pre_kl_beta 1e-06 --post_kl_beta 1e-06 --vva --max_veh 2 --veh_dist_mask 45 --vva_heads 8 --vva_drop 0.0 --vva_out 32 --lva --lva_heads 8 --lva_drop 0.0 --lva_out 32 --enc_gru_hid 32 --enc_gru_layers 1 --enc_gru_drop 0.0
Citation
If you find the code or the paper useful for your research, please cite our paper:
@inproceedings{chakraborty2023saber,
title={Structural Attention-based Recurrent Variational Autoencoder for Highway Vehicle Anomaly Detection},
author={Chakraborty, Neeloy and Hasan, Aamir and Liu, Shuijing and Ji, Tianchen and Liang, Weihang and McPherson, D. Livingston and Driggs-Campbell, Katherine},
booktitle={IFAAMAS International Conference on Autonomous Agents and Multiagent Systems (AAMAS)},
year={2023}
}
Contributors
Neeloy Chakraborty
Aamir Hasan
Shuijing Liu
Tianchen Ji
Eric Liang
Dr. D. Livingston McPherson
Professor Katie Driggs-Campbell
Part of the code is based on the following repositories: