A Comprehensive Review on Advancements and Challenges in Audio Classification Through Deep Learning

Deep learning has transformed audio classification, improving speech recognition, music genre identification, and ambient sound detection. This chapter reviews approaches to the task, covering model architectures, evaluation metrics, and preprocessing techniques. It compares traditional methods with deep learning techniques, which have enhanced performance. The preprocessing techniques discussed include spectrograms, Mel-Frequency Cepstral Coefficients (MFCCs), and the Short-Time Fourier Transform (STFT). The chapter also evaluates hybrid model architectures, training methods, data augmentation, and transfer learning as ways to improve results. It emphasises the importance of interpretability, stable datasets, and real-time processing in overcoming the challenges of audio classification, and it is intended to guide future research in this field.

Year of publication:	2025
Authors:	Verma, Gunjan ; Gocher, Honey ; Verma, Sweety ; Singh, Yudhveer ; Kaushik, Arti ; Goswami, Avinash
Published in:	Human-Centric AI in Digital Transformation and Entrepreneurship. - IGI Global Scientific Publishing, ISBN 9798369380116. - 2025, p. 137-160

Type of publication:	Article
Type of publication (narrower categories):	chapter
Language:	English
Other identifiers:	10.4018/979-8-3693-8009-3.ch007 [DOI]
Source:	Other ZBW resources

Persistent link: https://www.econbiz.de/10015540453