Major

Data Science

Research Abstract

With recent advancements in technology, many tasks in fields such as computer vision, natural language processing, and signal processing have been solved using deep learning architectures. In the audio domain, these architectures have been used to learn musical features of songs in order to predict moods, genres, and instruments. In the case of genre classification, deep learning models have been applied to popular, curated datasets, whose songs are explicitly chosen to represent their genres, and have achieved state-of-the-art results. However, these results have not been reproduced on less refined datasets. To this end, we introduce an uncurated dataset containing genre labels and 30-second audio previews for approximately fifteen thousand songs from Spotify. In our work, we focus on automatic genre classification using deep learning on this unrefined data. Specifically, we propose deep architectures that learn hierarchical characteristics of music from raw waveform audio, rather than from preprocessed audio in the form of mel-spectrograms, and apply these models to the Spotify dataset. Our experiments show that deep learning architectures trained on unpolished data can achieve results comparable to previous state-of-the-art music classifiers trained on filtered data.
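To give a rough sense of the raw-waveform approach described above, the sketch below shows a small 1D convolutional network that classifies genre directly from waveform samples rather than mel-spectrograms. It is an illustrative assumption only: the layer sizes, sample rate, and number of genres are placeholders, not the actual architecture or dataset settings used in this work.

```python
# Minimal illustrative sketch (not the authors' exact model): a 1D CNN that
# learns hierarchical features directly from raw waveform audio for genre
# classification. Layer sizes, sample rate, and genre count are assumptions.
import torch
import torch.nn as nn


class RawWaveformGenreCNN(nn.Module):
    def __init__(self, num_genres: int = 10):
        super().__init__()
        # Stacked strided convolutions and pooling learn increasingly
        # abstract features straight from the waveform, replacing a
        # mel-spectrogram front end.
        self.features = nn.Sequential(
            nn.Conv1d(1, 32, kernel_size=80, stride=4), nn.BatchNorm1d(32), nn.ReLU(),
            nn.MaxPool1d(4),
            nn.Conv1d(32, 64, kernel_size=3), nn.BatchNorm1d(64), nn.ReLU(),
            nn.MaxPool1d(4),
            nn.Conv1d(64, 128, kernel_size=3), nn.BatchNorm1d(128), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),  # collapse the time axis to one vector
        )
        self.classifier = nn.Linear(128, num_genres)

    def forward(self, waveform: torch.Tensor) -> torch.Tensor:
        # waveform: (batch, 1, samples), e.g. a mono 30-second clip
        x = self.features(waveform)
        return self.classifier(x.squeeze(-1))


if __name__ == "__main__":
    model = RawWaveformGenreCNN(num_genres=10)
    # Two synthetic 30-second clips at an assumed 16 kHz sample rate.
    clips = torch.randn(2, 1, 16000 * 30)
    print(model(clips).shape)  # torch.Size([2, 10]) -> genre logits
```

In practice, a 30-second Spotify preview would be decoded to a mono waveform tensor and fed to such a model, with the output logits trained against the genre labels.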

Faculty Mentor/Advisor

David Guy Brizan

Apr 26th, 9:10 AM to Apr 26th, 11:59 PM

Deep Neural Network Architectures For Music Genre Classification
