Humming Transcription
• Humming transcription
  – Our front-end for music search
  – Goal: Convert humming into a MIDI note sequence
• Steps involved in humming transcription
  1. Track pitch over the humming input
  2. Identify notes via silence
  3. Find the best MIDI sequence (of integer semitones) via key transposition

Pitch Tracking over the Humming
• Pitch tracking methods
  – ACF-based peak picking
  – ACF-based dynamic programming
• Example:
  – Humming input
  – Identified pitch vector
  – Notes are identified by volume or clarity thresholding
[Figure: four panels plotted against time (sec): waveform of roger.wav, volume, clarity, and computed pitch (semitone)]

MIDI Sequence Identification
• We need to represent each note as an integer semitone (the so-called MIDI number)
  – Perform exhaustive search on pitch shift to identify the best integer semitones
• Example
  1. Notes obtained as medians from pitch vector: [50.56, 49.45, 56.46, 56.36, 58.82, 58.65, 56.70, 54.74, 54.93, 53.85, 53.70, 51.53, 51.73, 49.62]
  2. Best integer notes after key transposition: [50, 50, 57, 57, 59, 59, 57, 55, 55, 54, 54, 52, 52, 50]
[Figure: mean deviation (semitone) versus shift (semitone) from -0.5 to 0.5; best shifted pitch vector and best identified MIDI numbers, pitch (semitone) versus time (sec)]

Humming Transcription
• Humming transcription is our front-end for music search.
• The process for humming transcription
  1. Typical humming
  2. Pitch vector obtained from ACF: [0, 0, 49.5336, 49.4312, 49.4817, 50.0910, 0, 50.5623, ...]
  3. Median notes obtained from pitch vector: [50.5623, 49.4559, 56.4628, 56.3636, 58.8217, 58.6510, 56.7025, 54.7415, 54.9399, 53.8507, 53.7002, 51.5369, 51.7357, 49.6283]
  4. Best integer notes after key transposition: [50, 50, 57, 57, 59, 59, 57, 55, 55, 54, 54, 52, 52, 50]
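
The note-identification and key-transposition steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the system's actual implementation: the function names `notes_from_pitch` and `best_midi_sequence` are hypothetical, a pitch value of 0 is assumed to mark a silent frame (as the example pitch vector suggests), and the shift grid of 0.01 semitone is an arbitrary choice. The sketch is not claimed to reproduce the integer sequence listed above, which may depend on details the slides do not show.

```python
from statistics import median


def notes_from_pitch(pitch):
    """Split a pitch vector at silent frames (assumed to be 0) and
    return the median pitch of each voiced run, one value per note."""
    notes, run = [], []
    for p in pitch:
        if p > 0:
            run.append(p)          # still inside a voiced run
        elif run:
            notes.append(median(run))  # silence closes the current note
            run = []
    if run:                        # input ended while a note was sounding
        notes.append(median(run))
    return notes


def best_midi_sequence(notes, step=0.01):
    """Exhaustively try pitch shifts in [-0.5, 0.5] semitone and keep the
    one whose shifted notes lie closest, on average, to integers."""
    if not notes:
        raise ValueError("notes must not be empty")
    n_steps = round(0.5 / step)
    best = None
    for i in range(-n_steps, n_steps + 1):
        shift = i * step
        shifted = [x + shift for x in notes]
        # Mean absolute deviation from the nearest integer semitone.
        dev = sum(abs(x - round(x)) for x in shifted) / len(shifted)
        if best is None or dev < best[0]:
            best = (dev, shift, [round(x) for x in shifted])
    return best[1], best[2]
```

For example, `best_midi_sequence(notes_from_pitch(pitch_vector))` would return the chosen shift together with the integer MIDI numbers, where `pitch_vector` is any list of per-frame semitone values. Searching only half a semitone in each direction is sufficient because shifting by a whole semitone leaves the deviation from the integer grid unchanged.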