How It Works

The Process

The analyzer takes an input WAV file consisting of piano block chords. The audio signal is split into frames using onset detection, so that each frame contains one chord. For each frame, the Harmonic Product Spectrum algorithm is applied to reduce overtones. A chroma vector is then created for each frame: a 12-element vector with one entry per pitch class, describing which pitches are present in the chord. This vector serves as a 12-D feature vector for our Nearest Neighbor Classifier, which finds the closest matching chord name in our training set of 108 manually created chord vectors. Once we have the list of chord names, we apply our modified version of the Krumhansl-Schmuckler Key-Finding algorithm to determine the key of the chord progression. Using the key, we calculate the "distance" between each chord root and the key's tonic. This "distance", together with the quality of the chord (major, minor, etc.) taken from the chord name, gives the proper Roman numeral for each chord.
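
The last two stages of the process can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's code: the templates below are generated binary vectors for three chord qualities (36 chords) rather than the 108 manually created vectors, and every name in it (build_templates, classify_chord, roman_numeral, the "C:maj" naming scheme) is hypothetical. The "distance" is taken to be the number of semitones from the key's tonic up to the chord root, and only major-scale degrees are handled.

```python
import numpy as np

PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Assumed chord qualities: semitone intervals above the root.
QUALITIES = {"maj": (0, 4, 7), "min": (0, 3, 7), "dim": (0, 3, 6)}

# Roman numerals of the major-scale degrees, keyed by semitones above the tonic.
DEGREES = {0: "I", 2: "II", 4: "III", 5: "IV", 7: "V", 9: "VI", 11: "VII"}


def build_templates():
    """Return {chord name: 12-element binary chroma vector} (hypothetical set)."""
    templates = {}
    for root in range(12):
        for quality, intervals in QUALITIES.items():
            vec = np.zeros(12)
            for interval in intervals:
                vec[(root + interval) % 12] = 1.0
            templates[f"{PITCH_CLASSES[root]}:{quality}"] = vec
    return templates


def classify_chord(chroma, templates):
    """Nearest neighbor: name of the template closest in Euclidean distance."""
    chroma = np.asarray(chroma, dtype=float)
    peak = chroma.max()
    if peak > 0:
        chroma = chroma / peak  # scale so the strongest pitch class is 1.0
    return min(templates, key=lambda name: np.linalg.norm(chroma - templates[name]))


def roman_numeral(chord_name, key_root):
    """Map a chord name such as 'G:maj' to a Roman numeral in a major key."""
    root_name, quality = chord_name.split(":")
    distance = (PITCH_CLASSES.index(root_name) - PITCH_CLASSES.index(key_root)) % 12
    numeral = DEGREES.get(distance)
    if numeral is None:
        return "?"  # roots outside the major scale are not handled in this sketch
    if quality == "maj":
        return numeral
    if quality == "min":
        return numeral.lower()
    return numeral.lower() + "°"  # diminished
```

With these definitions, a chroma vector dominated by G, B and D is classified as "G:maj", and roman_numeral("G:maj", "C") gives the dominant of C major.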

Harmonic Product Spectrum

The Harmonic Product Spectrum is a method for reducing overtones in an acoustic audio signal. It works on the magnitude spectrum (the Fourier transform of the original signal): the spectrum is multiplied by a series of "n" downsampled copies of itself. Downsampling by an integer factor moves the matching harmonic down onto the fundamental, so the product reinforces the fundamental and suppresses the overtones. For an acoustic piano, we found that multiplying the original spectrum by three downsampled copies gave the most accurate results.
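
A minimal NumPy sketch of the idea follows. It is an illustration under assumptions, not the project's implementation: the function name, the Hann window, and the choice of downsampling factors 2, 3 and 4 for the "three downsampled copies" are all assumed here.

```python
import numpy as np


def harmonic_product_spectrum(signal, sample_rate, num_downsampled=3):
    """Return (frequencies, HPS values) for one frame of audio.

    Assumption: the downsampled copies use integer factors 2, 3, ...,
    num_downsampled + 1.
    """
    signal = np.asarray(signal, dtype=float)
    windowed = signal * np.hanning(len(signal))  # reduce spectral leakage
    spectrum = np.abs(np.fft.rfft(windowed))     # magnitude spectrum

    hps = spectrum.copy()
    for factor in range(2, num_downsampled + 2):
        decimated = spectrum[::factor]           # keep every factor-th bin
        hps[:len(decimated)] *= decimated

    # Only bins covered by every downsampled copy hold a complete product.
    usable = len(spectrum[::num_downsampled + 1])
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    return freqs[:usable], hps[:usable]
```

For a tone whose second harmonic is louder than its fundamental, the raw spectrum peaks at the harmonic, while the peak of the returned HPS values should sit at the fundamental.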

Krumhansl-Schmuckler Key-Finding Algorithm

We used the Krumhansl-Schmuckler algorithm to determine the key of a chord progression. The algorithm relies on the Krumhansl-Kessler key profiles, which are experimentally measured values representing how well each solfege pitch fits a given key. For each candidate key, it takes that key's profile together with the total duration of each pitch in the audio signal and computes the correlation coefficient between the two, treating them as a set of (key profile value, duration) points. The key with the strongest correlation coefficient is most likely the key of the chord progression. This algorithm works well with melodic signals but tended to fall short on chord progressions with only a few chords. We therefore modified it: we take the final Roman numeral output for the top three candidate keys and apply a scoring method based on the appearance of certain chords and cadences. The key with the highest score is most likely the key.
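
The correlation step can be sketched as follows. This covers only the unmodified algorithm and returns the top three candidates; the chord and cadence scoring described above is not reproduced. The profile values are the published Krumhansl-Kessler ratings, while the function name rank_keys and the input layout (12 durations indexed from C) are assumptions made for the example.

```python
import numpy as np

PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Krumhansl-Kessler key profiles, listed from the tonic upward in semitones.
MAJOR_PROFILE = np.array([6.35, 2.23, 3.48, 2.33, 4.38, 4.09,
                          2.52, 5.19, 2.39, 3.66, 2.29, 2.88])
MINOR_PROFILE = np.array([6.33, 2.68, 3.52, 5.38, 2.60, 3.53,
                          2.54, 4.75, 3.98, 2.69, 3.34, 3.17])


def rank_keys(durations, top_n=3):
    """Rank the 24 major and minor keys by correlation with pitch durations.

    durations: 12 values, the total duration of each pitch class from C to B.
    Returns a list of (correlation, key name), strongest correlation first.
    """
    durations = np.asarray(durations, dtype=float)
    scores = []
    for tonic in range(12):
        for mode, profile in (("major", MAJOR_PROFILE), ("minor", MINOR_PROFILE)):
            # Rotate the profile so its tonic entry lands on this key's tonic.
            rotated = np.roll(profile, tonic)
            r = np.corrcoef(rotated, durations)[0, 1]
            scores.append((float(r), f"{PITCH_CLASSES[tonic]} {mode}"))
    return sorted(scores, reverse=True)[:top_n]
```

As a usage example, the progression C, F, G, C can be summarized by counting one unit of duration per chord tone, giving the vector [3, 0, 1, 0, 2, 1, 0, 3, 0, 1, 0, 1]; passing it to rank_keys returns three candidate keys that a later scoring stage could choose between.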

References

Lee, Kyogu. "Automatic Chord Recognition from Audio Using Enhanced Pitch Class Profile." ICMC. 2006.

Temperley, David. "What's Key for Key? The Krumhansl-Schmuckler Key-Finding Algorithm Reconsidered." Music Perception: An Interdisciplinary Journal, vol. 17, no. 1, 1999, pp. 65–100. JSTOR, www.jstor.org/stable/40285812.