Posted by 燕南天 on 2014-1-2 14:52:00

Music Identification with Weighted Finite-state Transducers



Music identification is the process of matching an audio stream to a particular song. Previous work has relied on hashing, where an exact or almost-exact match between local features of the test and reference recordings is required. In this work we present a new approach to music identification based on finite-state transducers and Gaussian mixture models. We apply an unsupervised training process to learn an inventory of music phone units similar to phonemes in speech. We also learn a unique sequence of music units characterizing each song. We further propose a novel application of transducers for recognition of music phone sequences. Preliminary experiments demonstrate an identification accuracy of 99.5% on a database of over 15,000 songs running faster than real time.
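
To make the idea of "a unique sequence of music units characterizing each song" a little more concrete, here is a toy sketch in Python. It is not the method from the paper: it assumes the audio has already been decoded into unit labels, and it only checks whether a snippet's label sequence appears as a contiguous run inside a reference song. The labels (`mp1`, `mp2`, ...), the song names, and both function names are made up for illustration. A weighted transducer combined with Gaussian mixture models would score inexact matches as well, which this exact lookup does not attempt.

```python
def contains_run(sequence, run):
    """Return True if `run` occurs as a contiguous slice of `sequence`."""
    n = len(run)
    return any(sequence[i:i + n] == run for i in range(len(sequence) - n + 1))


def identify(snippet, songs):
    """Return the ids of all reference songs whose unit sequence contains the snippet."""
    return [song_id for song_id, units in songs.items() if contains_run(units, snippet)]


# Hypothetical reference database: song id -> sequence of music-unit labels.
SONGS = {
    "song_a": ["mp3", "mp7", "mp7", "mp1", "mp4", "mp2"],
    "song_b": ["mp5", "mp1", "mp4", "mp6", "mp3", "mp3"],
}

if __name__ == "__main__":
    # A short snippet decoded from the middle of song_a.
    print(identify(["mp7", "mp1", "mp4"], SONGS))
```

A snippet that is too short (for example just `["mp1", "mp4"]`) matches both songs here, which hints at why longer unit sequences are needed to pin down a single song.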

Index Terms: Music identification, acoustic modeling, finite-state transducers.
Authors: Eugene Weinstein, Pedro Moreno, Google Inc.