MULTI-PITCH DETECTION AND VOICE ASSIGNMENT FOR A CAPPELLA RECORDINGS OF MULTIPLE SINGERS

Rodrigo Schramm, Andrew McLeod, Mark Steedman, Emmanouil Benetos



This project presents a multi-pitch detection and voice assignment method for audio recordings of a cappella performances by multiple singers. A novel approach is proposed that combines an acoustic model for multi-pitch detection with a music language model for voice separation and assignment. The acoustic model is a spectrogram factorization process based on Probabilistic Latent Component Analysis (PLCA), driven by a 6-dimensional dictionary of pre-learned spectral templates. The voice separation component is based on hidden Markov models that encode musicological assumptions. By integrating the two models, the system detects multiple concurrent pitches in vocal music and assigns each detected pitch to a specific voice type: soprano, alto, tenor, or bass (SATB). This work focuses on four-part compositions, and evaluations on recordings of Bach chorales and barbershop quartets show that the integrated approach achieves an F-measure of over 70% for frame-based multi-pitch detection and over 45% for four-voice assignment.
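To give an intuition for the spectrogram-factorization stage, below is a minimal MATLAB sketch of a PLCA-style update with fixed templates. It is deliberately simplified: the latent space is reduced to pitch only (not the 6-dimensional dictionary used in the paper), and the function and variable names (plca_activations, V, W, H) are illustrative, not taken from the released code.

    % Minimal PLCA-style sketch with a fixed, pre-learned template dictionary.
    % V : nonnegative magnitude spectrogram (frequency bins x frames)
    % W : pre-learned spectral templates, one column per pitch (bins x pitches)
    % H : estimated pitch activations per frame (pitches x frames)
    function H = plca_activations(V, W, nIter)
        [~, T] = size(V);
        P = size(W, 2);
        W = bsxfun(@rdivide, W, max(sum(W, 1), eps));  % templates as P(f|p)
        H = ones(P, T) / P;                            % uniform init of P(p|t)
        for it = 1:nIter
            R = max(W * H, eps);                       % reconstruction P(f|t)
            % EM update: redistribute each observed bin across pitches in
            % proportion to how much they currently explain it, then
            % renormalize the activations within each frame
            H = H .* (W' * (V ./ R));
            H = bsxfun(@rdivide, H, max(sum(H, 1), eps));
        end
    end

In the full system, the resulting pitch activations are not used in isolation: they are passed to the HMM-based music language model, which separates the detected pitches and assigns each one to one of the four SATB voices.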

Audio examples:
- 22_BQ058_part6_mix.wav
- 26_BC060_part10_mix.wav


- ISMIR 2017 (Suzhou, China) paper: 26_Paper.pdf
- source code: musingers_ISMIR2017.zip (MATLAB + Java)
- instructions: run the MATLAB script "run_experiments.m" (see the call sketch after this list)
- contact: rschramm@ufrgs.br
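A minimal call sequence for running the experiments, assuming the archive extracts into a folder named musingers_ISMIR2017 (the folder name and layout are assumptions; only the script name comes from the instructions above):

    % Hypothetical setup for running the released code from MATLAB
    unzip('musingers_ISMIR2017.zip');            % extract the source archive
    addpath(genpath('musingers_ISMIR2017'));     % add the extracted code to the path
    run_experiments;                             % entry-point script named above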