Guitar audio transcription is the process of generating a human-interpretable musical score from guitar audio. The score is presented as guitar tablature, which indicates not only which notes are played but also where they are played on the fretboard. Automatic transcription remains a challenge for polyphonic sounds, and the guitar adds further ambiguity because the same note can often be played at several positions on the fretboard. This thesis presents a portable software architecture for processing guitar audio in real time and producing a set of highly probable transcription solutions. It also presents novel algorithms for polyphonic pitch detection and for generating the confidence values by which transcription solutions are ranked. Transcription solutions are generated for individual signal windows from the output of the polyphonic pitch detection algorithm. Confidence values are assigned to solutions by analyzing signal properties, fingering difficulty, and proximity to the previous highest-confidence solutions.
The rules used to generate confidence values are based on expert knowledge of the instrument. Performance is measured in terms of algorithm accuracy, latency, and throughput. For chords, the correct result is ranked 2.08 on average (with the top rank being 0). The general case of various notes over time yields results that require qualitative analysis; the system is generally very susceptible to noise and has difficulty distinguishing harmonics from true fundamentals. Allowing the user to seed the system with a ground truth significantly improves correct recognition of future states in some cases. The sampling time is 250 ms and the average processing time is 110 ms, giving an average total latency of 360 ms. Throughput is 62.5 sample windows per second. Performance is not processor-bound, enabling high performance on a wide variety of personal computers.
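
The fretboard ambiguity described above can be made concrete with a short sketch. It is an illustration only, not the algorithm developed in this thesis. It assumes standard tuning (E2 A2 D3 G3 B3 E4), a 22-fret instrument, and pitches given as MIDI note numbers; the names `STANDARD_TUNING` and `fret_positions` are hypothetical and chosen for this example.

```python
# Illustrative sketch only: enumerate where a single pitch can be played.
# Assumptions (not from the thesis): standard tuning, 22 frets, MIDI pitches.

# Open-string pitches as MIDI note numbers, from the 6th string (low E2)
# to the 1st string (high E4).
STANDARD_TUNING = [40, 45, 50, 55, 59, 64]


def fret_positions(midi_note, tuning=STANDARD_TUNING, num_frets=22):
    """Return every (string_number, fret) pair that produces midi_note."""
    positions = []
    for index, open_pitch in enumerate(tuning):
        fret = midi_note - open_pitch  # one fret raises pitch by one semitone
        if 0 <= fret <= num_frets:
            string_number = len(tuning) - index  # 6 = lowest string, 1 = highest
            positions.append((string_number, fret))
    return positions


if __name__ == "__main__":
    # E4 (MIDI 64) is playable on five different strings.
    for string_number, fret in fret_positions(64):
        print(f"string {string_number}, fret {fret}")
```

Under these assumptions a single E4 already has five candidate positions, so a chord of several notes has many candidate fingerings. This is why a tablature transcription must rank alternatives rather than report pitches alone.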