November, 2022: We decoupled the dataset and hyper-parameters by moving the hyper-parameters from src/run.py and src/traintest.py to egs/{audioset,esc50,speechcommands}/run.sh, so that it is easier to adapt the ...
Please cite our paper(s) if you find this repository useful. The first paper proposes the Audio Spectrogram Transformer, while the second paper describes the training pipeline that we applied to AST ...
Abstract: Audio events have a hierarchical architecture in both time and frequency and can be grouped together to construct more abstract semantic audio classes. In this work, we develop a multiscale ...
Abstract: Transformer-based audio self-supervised learning (SSL) models often operate on spectrogram-based time-frequency representations and apply vision-style Transformer architectures (e.g., ...