TunaFish

"You can tune a piano, but you can't tune a fish"

What's TunaFish?

TunaFish performs non-realtime pitch correction on monophonic .wav files, ideally without introducing audible artifacts.

For example, if you have a monophonic recording of a singer that's off-pitch in places, you can use TunaFish to correct the errors and put the singer back in tune. TunaFish should work "cleanly", so the average listener can't tell it's been altered.

TunaFish is written in Java, so it runs on Windows, Linux and OS X.

What Can't TunaFish Do?

It can't work miracles.
It's not realtime.
It can only work with monophonic music.
It can only handle one file at a time.
It can only work with monophonic .wav files.
It can't be used as a plugin to your music program.

Task List

Currently TunaFish is in the early code stage, so it's not really usable. Here's a task list of things I expect need to be done before TunaFish is usable, and their current status:

Finished

Create a trivial Swing window for TunaFish with pull-down menus.
Load monophonic .wav files
Play back the .wav file
Trivially modify the .wav file (e.g., change the volume or pitch)
Play back the modified .wav file
Save the modified .wav file
Display the waveform in a scrollable window
Perform pitch analysis
Implement retune speed logic
Write cubic spline resampling

In Process

Zoom in and out of the wave
Add a timeline for playback.
Display pitch analysis

Pending

Perform pitch correction
Make the user interface interactive (i.e., the user can select notes to adjust)
Clean up the user interface and make the application usable
Write documentation

Technological Overview

Determining the Pitch

Before TunaFish can correct the pitch, it first has to determine what pitch is being sung at a particular point in time. Common approaches include:

Zero-crossing applies a bandpass filter to remove harmonics above the fundamental, then searches for sign changes (where the wave crosses zero, going from negative to positive or from positive to negative) and measures the distance between those crossings. Because wave shapes are complex and change over time, this method tends to be the most error-prone.
Autocorrelation tries to find the signal period by "sliding" the target wave forward until it comes into phase with another segment of the wave. The distance between the start and the point of alignment is the period of that wave; the frequency is the sample rate divided by that distance in samples.
Fourier Analysis analyzes the wave for harmonic content. This is quite popular, but it requires an understanding of FFT or DFT. One of these days I'll get around to learning FFT, but probably not for this project.

TunaFish uses autocorrelation, mainly because it gives excellent results. It's also trivial to implement.
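
To make the "sliding" idea concrete, here's a rough sketch of autocorrelation pitch detection on a single frame of samples. This is not TunaFish's actual code: the class name, the method name and the minHz/maxHz search range are all made up for illustration, and a real detector would need extra care to avoid octave errors.

```java
final class AutocorrelationSketch {
    /**
     * Estimates the fundamental frequency of one frame of samples, in Hz.
     * Returns -1 if no lag in the search range correlates positively
     * (for example, on silence). All names here are illustrative.
     */
    static double estimatePitchHz(float[] frame, double sampleRate,
                                  double minHz, double maxHz) {
        // A period of N samples corresponds to sampleRate / N Hz,
        // so the highest pitch gives the shortest lag to try.
        int minLag = Math.max(1, (int) Math.floor(sampleRate / maxHz));
        int maxLag = Math.min(frame.length - 1,
                              (int) Math.ceil(sampleRate / minHz));

        int bestLag = -1;
        double bestScore = 0.0;
        for (int lag = minLag; lag <= maxLag; lag++) {
            // "Slide" the wave forward by lag samples and measure
            // how well it lines up with itself.
            int overlap = frame.length - lag;
            double sum = 0.0;
            for (int i = 0; i < overlap; i++) {
                sum += (double) frame[i] * frame[i + lag];
            }
            double score = sum / overlap; // average over the overlap
            if (score > bestScore) {
                bestScore = score;
                bestLag = lag;
            }
        }
        // The best lag is the period in samples; convert it to Hz.
        return bestLag > 0 ? sampleRate / bestLag : -1.0;
    }
}
```

Calling estimatePitchHz on a frame of a couple of thousand samples, with a search range that covers the singer's voice, gives one pitch estimate for that frame. Repeating this on successive frames produces a pitch curve over time.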

Changing the Pitch

Once TunaFish has determined where the pitch should be, it becomes necessary to resynthesize a new wave at the new frequency. Again, there are a number of available approaches, including:

Fourier analysis can be used to determine the harmonic content of the wave and recreate the wave at a new frequency. It has a tendency to introduce "smearing" and "pre-echo" audio artifacts into the sound, which makes it a less than ideal candidate.
PSOLA (Pitch-Synchronous Overlap Add) reuses cycles from the original wave to recreate the wave by spacing the cycles at the new frequency. Pitch is shifted without changing the characteristics of the sound (i.e., the spacing of the formants is preserved), but it introduces a distinct "chorus" effect, which makes it obvious the sound has been processed.
Resampling reads the original samples back at the new frequency. Unlike Fourier analysis and PSOLA, it shifts every frequency in the sound, formants included, instead of moving only the pitch, which at the extreme range can lead to the singer sounding like a chipmunk.

The primary argument for using a more complex method (i.e., Fourier analysis or PSOLA) is to properly handle formant correction. However, because pitches shouldn't be moved by more than a semitone or so, this shouldn't be much of an issue.

Since resampling introduces the fewest audible artifacts, it's the preferred method for changing the pitch.
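
Here's a minimal sketch of what resampling by a pitch ratio looks like. To keep it short it uses linear interpolation between neighbouring samples, which is a simplification (the task list above mentions cubic spline resampling). The class and method names are hypothetical.

```java
final class ResampleSketch {
    /** Converts a shift in semitones to a frequency ratio (12 semitones = 2x). */
    static double semitonesToRatio(double semitones) {
        return Math.pow(2.0, semitones / 12.0);
    }

    /**
     * Reads the input back at a different rate using linear interpolation.
     * ratio > 1 raises the pitch, ratio < 1 lowers it.
     */
    static float[] resample(float[] in, double ratio) {
        if (in.length == 0 || ratio <= 0.0) {
            return new float[0];
        }
        // Number of output samples whose read position stays inside the input.
        int outLen = (int) Math.floor((in.length - 1) / ratio) + 1;
        float[] out = new float[outLen];
        for (int i = 0; i < outLen; i++) {
            double pos = i * ratio;                        // fractional read position
            int i0 = Math.min((int) pos, in.length - 1);   // sample before pos
            int i1 = Math.min(i0 + 1, in.length - 1);      // sample after pos
            double frac = pos - i0;
            out[i] = (float) ((1.0 - frac) * in[i0] + frac * in[i1]);
        }
        return out;
    }
}
```

Note that plain resampling also changes the length of the segment: raising the pitch shortens it and lowering the pitch stretches it. For a shift of a fraction of a semitone the ratio stays very close to 1, e.g. semitonesToRatio(0.5) is roughly 1.029.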

Finding the Target Pitch

A naive approach to pitch correction would be to have TunaFish clamp all out-of-tune pitches to their correct pitch. Unfortunately, this leads to the so-called Cher effect (named after the use of that effect in the song "Believe").
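
For reference, the naive "clamp" amounts to rounding each detected frequency to the nearest note. The sketch below assumes equal temperament with A4 = 440 Hz; that tuning reference and the names used are my assumptions for the example, not something TunaFish is committed to.

```java
final class NearestSemitoneSketch {
    static final double A4_HZ = 440.0; // assumed reference tuning

    private static double log2(double x) {
        return Math.log(x) / Math.log(2.0);
    }

    /** Returns the frequency of the equal-tempered note closest to hz. */
    static double nearestSemitoneHz(double hz) {
        double semitonesFromA4 = 12.0 * log2(hz / A4_HZ);
        long nearest = Math.round(semitonesFromA4);
        return A4_HZ * Math.pow(2.0, nearest / 12.0);
    }

    /** How far hz is from its nearest note, in cents (100 cents = 1 semitone). */
    static double centsOff(double hz) {
        return 1200.0 * log2(hz / nearestSemitoneHz(hz));
    }
}
```

A slightly sharp A at 445 Hz maps back to 440 Hz, and centsOff reports it as a little under 20 cents sharp. Applying that correction instantly on every frame is exactly what produces the robotic stepping sound.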

Unlike an instrument such as a flute or violin, a singer can't just leap directly from one pitch to the next. Instead, the voice "slides" from one pitch to the next.

Further, expressive effects such as vibrato (which intentionally moves the pitch above and below the pitch center) would be lost if TunaFish clamped pitches to set frequencies.

The simplest solution is to implement a retune speed parameter, which determines how quickly TunaFish should pull an errant pitch back to the correct pitch.
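
One possible way to implement a retune speed is to smooth the correction itself instead of applying it all at once. The sketch below is only an assumption about how this could work, not a description of TunaFish's retune logic: the names are hypothetical, and retuneSpeed is taken to be a value between 0 and 1 applied once per analysis frame.

```java
final class RetuneSpeedSketch {
    private static double log2(double x) {
        return Math.log(x) / Math.log(2.0);
    }

    /**
     * @param detectedHz  measured pitch per frame (0 or less means unvoiced)
     * @param targetHz    the "in tune" pitch per frame
     * @param retuneSpeed 0..1 per frame; 1 snaps instantly, small values glide
     * @return the pitch each frame should be shifted to
     */
    static double[] applyRetuneSpeed(double[] detectedHz, double[] targetHz,
                                     double retuneSpeed) {
        double[] out = new double[detectedHz.length];
        double correction = 0.0; // semitones of correction currently applied
        for (int i = 0; i < detectedHz.length; i++) {
            if (detectedHz[i] <= 0.0 || targetHz[i] <= 0.0) {
                out[i] = detectedHz[i]; // unvoiced frame: leave untouched
                continue;
            }
            // Correction needed to land exactly on the target, in semitones.
            double wanted = 12.0 * log2(targetHz[i] / detectedHz[i]);
            // Move only part of the way there on each frame.
            correction += retuneSpeed * (wanted - correction);
            out[i] = detectedHz[i] * Math.pow(2.0, correction / 12.0);
        }
        return out;
    }
}
```

With retuneSpeed set to 1 this degenerates into the hard clamp described above. With a small value the correction follows the average error, so a steady flat or sharp note is pulled into tune gradually, while fast movement around the pitch center (vibrato, slides between notes) mostly passes through.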

Questions?

If you've got any questions about TunaFish, feel free to drop me an email.