Methods of Multimodal Fusion for Stress Detection

We compare methods of multimodal fusion in neural networks on the Wearable Stress and Affect Detection (WESAD) dataset.

This work was done as part of a seminar in the course "Bio-Inspired Artificial Intelligence" during my master's studies.

Using the WESAD dataset for wearable stress and affect detection, we compared different methods of multimodal fusion in neural networks. We present results for the Gated Multimodal Unit (GMU), linear sum, and concatenation fusion, and compare them against an AdaBoost baseline. In our experiment, all neural fusion methods achieved similar performance and substantially outperformed the AdaBoost baseline.
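
As a rough illustration of the gated fusion mechanism, here is a minimal PyTorch-style sketch for two modalities. The layer dimensions, variable names, and the restriction to two modalities are illustrative assumptions and are not taken verbatim from our implementation.

```python
import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    """Bimodal gated fusion: a learned gate z decides, per feature,
    how much each modality contributes to the fused representation."""

    def __init__(self, dim_a, dim_b, dim_out):
        super().__init__()
        self.proj_a = nn.Linear(dim_a, dim_out)        # modality A -> shared space
        self.proj_b = nn.Linear(dim_b, dim_out)        # modality B -> shared space
        self.gate = nn.Linear(dim_a + dim_b, dim_out)  # gate computed from both inputs

    def forward(self, x_a, x_b):
        h_a = torch.tanh(self.proj_a(x_a))
        h_b = torch.tanh(self.proj_b(x_b))
        z = torch.sigmoid(self.gate(torch.cat([x_a, x_b], dim=-1)))
        return z * h_a + (1.0 - z) * h_b  # gated convex combination of both modalities
```

In this sketch, concatenation fusion would correspond to replacing the gated combination with torch.cat([h_a, h_b], dim=-1), and linear-sum fusion with h_a + h_b.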

Overview of the model architecture, using the GMU for multimodal fusion
Mean accuracy and F1-score results using leave-one-subject-out cross-validation; error bars indicate standard deviation.
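
For reference, a leave-one-subject-out evaluation loop can be sketched with scikit-learn's LeaveOneGroupOut splitter, using subject IDs as groups. The function and variable names below (model_factory, X, y, subjects) are placeholders, not our actual pipeline.

```python
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.metrics import accuracy_score, f1_score

def evaluate_loso(model_factory, X, y, subjects):
    """Train a fresh model per fold; each fold holds out one subject entirely."""
    accs, f1s = [], []
    for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups=subjects):
        model = model_factory()
        model.fit(X[train_idx], y[train_idx])
        pred = model.predict(X[test_idx])
        accs.append(accuracy_score(y[test_idx], pred))
        f1s.append(f1_score(y[test_idx], pred, average="macro"))
    # report mean and standard deviation across the held-out subjects
    return np.mean(accs), np.std(accs), np.mean(f1s), np.std(f1s)
```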

To find out whether the GMU behaves differently when random noise is introduced, we carried out a second experiment in which noise was applied to one modality at inference time. Consequently, the performance of all methods dropped, but, surprisingly, the models using the gated fusion module deteriorated most notably, resulting in below-chance accuracy.
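
The perturbation itself is simple: random noise is added to one modality's input at test time while the other modality is left untouched. A small sketch of what we mean, where the use of Gaussian noise, its scale, and the choice of corrupted modality are illustrative assumptions:

```python
import torch

def corrupt_modality(x, noise_std=1.0):
    """Perturb one modality's input with Gaussian noise at inference time."""
    return x + noise_std * torch.randn_like(x)

# Example: corrupt modality B only, keep modality A clean
# logits = model(x_a, corrupt_modality(x_b))
```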

Mean accuracy and F1-score results using leave-one-subject-out cross-validation in the noise condition; error bars indicate standard deviation.

For more details, feel free to look at our seminar report [PDF]. The source code for our evaluation is available for download here.