TunaFish performs non-realtime pitch
correction on monophonic .wav files, hopefully without introducing audible
artifacts.
For example, if you have a monophonic recording of a singer that's
off-pitch in places, you can use TunaFish to correct the errors and put
the singer back in tune. TunaFish should work "cleanly", so the
average listener can't tell the recording has been altered.
TunaFish is written in Java, so it runs on
Windows, Linux and OS X.
What Can't TunaFish Do?
It can't work miracles.
It's not realtime.
It can only work with monophonic music.
It can only handle one file at a time.
It can only work with monophonic .wav files.
It can't be used as a plugin to your music program.
Currently TunaFish is in the early code stage, so it's not really
usable. Here's a task list of things I expect need to be
done before TunaFish is usable, and their current status:
Create a trivial Swing window for TunaFish with pull-down menus
Load monophonic .wav
Playback the .wav
Trivially modify the .wav (i.e.: change the volume or pitch)
Playback the modified .wav
Save the modified .wav
Display the waveform in a scrollable window
Perform pitch analysis
Implement retune speed logic
Write cubic spline resampling
Zoom in and out of the wave
Add a timeline for playback.
Display pitch analysis
Perform pitch correction
Make the user interface interactive (i.e.: the user can
select notes to adjust)
Clean up the user interface and make the application usable
Before TunaFish can correct the pitch, it first has to determine what pitch is being
sung at a particular point in time. Common approaches include:
Zero crossing uses a bandpass filter to remove harmonics higher than the fundamental, and
then searches for sign changes (where the wave moves up instead of
down, or down instead of up) and measures the distance between
them. Because wave shapes are complex and change over time, this method tends to be the most error-prone.
Autocorrelation attempts to find the signal period by "sliding" the target wave forward until it
comes into phase with another segment of the wave. The distance between
the start and the point of alignment is the period of that wave; the
frequency is the sample rate divided by that distance.
Fourier analysis examines the wave for harmonic content. This is quite popular, but it requires
an understanding of FFT or DFT. One of these days I'll get around to
learning FFT, but probably not for this project.
TunaFish uses autocorrelation, mainly because it gives excellent results. It's also trivial to implement.
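As a rough illustration of the idea (not TunaFish's actual code), autocorrelation can be sketched as below. The class name, the minHz/maxHz search bounds, and the 0.9 peak threshold are all assumptions made for this example. The sketch scores how well a frame matches a shifted copy of itself at each lag, then takes the first strong peak as the period, which helps avoid locking onto a multiple of the true period.

```java
/** Illustrative autocorrelation pitch estimator; not TunaFish's actual code. */
final class AutocorrelationPitch {

    /**
     * Estimates the fundamental of one frame of mono samples.
     * minHz and maxHz (hypothetical parameters) bound the search range.
     * Returns the frequency in Hz, or -1 if nothing periodic was found.
     */
    static double estimateHz(double[] frame, double sampleRate, double minHz, double maxHz) {
        int minLag = Math.max(1, (int) Math.floor(sampleRate / maxHz));
        int maxLag = Math.min(frame.length - 1, (int) Math.ceil(sampleRate / minHz));
        if (minLag >= maxLag) {
            return -1.0;
        }

        // "Slide" the frame against itself and score the match at each lag.
        double[] score = new double[maxLag + 1];
        for (int lag = minLag; lag <= maxLag; lag++) {
            int overlap = frame.length - lag;
            double sum = 0.0;
            for (int i = 0; i < overlap; i++) {
                sum += frame[i] * frame[i + lag];
            }
            score[lag] = sum / overlap; // normalize by the number of overlapping samples
        }

        // Skip the downhill slope left over from the perfect match at lag 0.
        int start = minLag;
        while (start < maxLag && score[start + 1] < score[start]) {
            start++;
        }

        double best = 0.0;
        for (int lag = start; lag <= maxLag; lag++) {
            best = Math.max(best, score[lag]);
        }
        if (best <= 0.0) {
            return -1.0; // silence, or no positive correlation in range
        }

        // Take the first peak that comes close to the best score (0.9 is an assumed threshold).
        for (int lag = start; lag <= maxLag; lag++) {
            if (score[lag] >= 0.9 * best) {
                while (lag < maxLag && score[lag + 1] > score[lag]) {
                    lag++; // climb to the top of this peak
                }
                return sampleRate / lag; // period in samples -> frequency in Hz
            }
        }
        return -1.0;
    }
}
```

A call such as `AutocorrelationPitch.estimateHz(frame, 44100.0, 80.0, 1000.0)` would search roughly the range of a singing voice. The result is only as fine as one sample of lag, so a real detector would probably interpolate around the peak.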
Once TunaFish has determined where the pitch should be, it becomes
necessary to resynthesize a new wave at the new frequency. Again, there
are a number of available approaches, including:
Fourier analysis can be
used to determine the harmonic content of the wave, and recreate the
wave at a new frequency. However, it has a tendency to introduce
"smearing" and "pre-echo" audio artifacts into the sound, which makes
it less than an ideal candidate.
PSOLA (Pitch Synchronous Overlap Add) reuses cycles from the original wave to recreate the wave
by spacing the cycles at the new frequency. Pitch is shifted without
changing the characteristics of the sound (i.e.: the spacing of the formants is preserved), but it introduces a distinct "chorus"
effect, which makes it obvious the sound has been processed.
Resampling plays back the original samples at the new frequency. Unlike Fourier analysis and
PSOLA, it alters all
the frequencies in the sound, instead of only altering the fundamental
frequency, which at the extreme range can lead to the singer sounding
like a chipmunk.
The primary argument for using a more complex method (i.e.: Fourier
analysis or PSOLA) is to properly handle formant correction.
However, because pitches shouldn't be moved by more than a semitone or
so, this shouldn't be much of an issue.
Because resampling introduces the fewest audible artifacts, it's the
preferred method for changing the pitch.
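To make the resampling idea concrete, here is a minimal sketch that reads the original samples at a different rate. It is not TunaFish's code: the class and method names are made up for illustration, and it uses Catmull-Rom cubic interpolation as a simple stand-in for whatever cubic spline the finished program ends up using. For scale, a shift of one semitone corresponds to a ratio of 2^(1/12), or about 1.0595.

```java
/** Illustrative pitch shift by resampling; names and interpolation choice are assumptions. */
final class Resampler {

    /** Frequency ratio for a shift of the given number of semitones (12 semitones = one octave). */
    static double semitoneRatio(double semitones) {
        return Math.pow(2.0, semitones / 12.0);
    }

    /** Catmull-Rom cubic through p1 (t = 0) and p2 (t = 1), using p0 and p3 as neighbours. */
    static double cubic(double p0, double p1, double p2, double p3, double t) {
        double a = -p0 + 3.0 * p1 - 3.0 * p2 + p3;
        double b = 2.0 * p0 - 5.0 * p1 + 4.0 * p2 - p3;
        double c = -p0 + p2;
        return 0.5 * (2.0 * p1 + t * (c + t * (b + t * a)));
    }

    /**
     * Reads the input at "ratio" input samples per output sample.
     * ratio > 1 raises the pitch (and shortens the sound); ratio < 1 lowers it.
     */
    static double[] resample(double[] in, double ratio) {
        if (in.length == 0 || ratio <= 0.0) {
            return new double[0];
        }
        int outLen = (int) Math.floor((in.length - 1) / ratio) + 1;
        double[] out = new double[outLen];
        for (int i = 0; i < outLen; i++) {
            double pos = i * ratio;          // fractional read position in the input
            int i1 = (int) Math.floor(pos);  // sample just before the read position
            double t = pos - i1;             // how far we are between i1 and i1 + 1
            out[i] = cubic(at(in, i1 - 1), at(in, i1), at(in, i1 + 1), at(in, i1 + 2), t);
        }
        return out;
    }

    /** Returns in[i], repeating the first or last sample past the edges. */
    private static double at(double[] in, int i) {
        return in[Math.max(0, Math.min(in.length - 1, i))];
    }
}
```

For example, `Resampler.resample(samples, Resampler.semitoneRatio(-0.5))` would lower a passage by half a semitone. Note that plain resampling also changes the duration of the passage, which a complete implementation would have to account for.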
Finding the Target Pitch
A naive approach to pitch correction would be to have TunaFish clamp all
out-of-tune pitches to their correct pitch. Unfortunately, this leads
to the so-called Cher
effect (named after the use of that effect in the song "Believe").
Unlike a musical instrument like a flute or violin, a singer can't just leap
directly from one pitch to the next. Instead, their voice "slides" from
one pitch to the next.
Further, expressive effects such as
vibrato (which intentionally move the pitch above and below the pitch
center) would be lost if TunaFish clamped pitches to set frequencies.
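The hard clamp described above amounts to rounding each detected pitch to the nearest note. The sketch below shows one way to do that, assuming an equal-tempered chromatic scale tuned to A4 = 440 Hz; the source doesn't say which scale or reference tuning TunaFish targets, so both are assumptions, as are the class and method names.

```java
/** Illustrative hard clamp to the nearest note; scale and tuning are assumptions. */
final class SemitoneSnap {

    static final double A4_HZ = 440.0; // assumed reference tuning

    /** Returns the equal-tempered pitch closest to the given frequency. */
    static double nearestSemitoneHz(double hz) {
        double semitonesFromA4 = 12.0 * Math.log(hz / A4_HZ) / Math.log(2.0);
        long nearest = Math.round(semitonesFromA4);
        return A4_HZ * Math.pow(2.0, nearest / 12.0);
    }

    /** How far the frequency is from its nearest note, in cents (100 cents = 1 semitone). */
    static double centsOff(double hz) {
        return 1200.0 * Math.log(hz / nearestSemitoneHz(hz)) / Math.log(2.0);
    }
}
```

Applied to every analysis frame, this removes all pitch movement between notes, which is exactly why slides and vibrato disappear.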
The simplest solution is to implement a retune speed
parameter, which determines how quickly TunaFish should pull an errant
pitch back to the correct pitch.
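One possible reading of a retune speed is a time constant that controls how fast the correction catches up with the pitch error. The sketch below is only a guess at how that could work, not TunaFish's design: the class name, the per-frame error measured in cents, and the retuneMs and frameMs parameters are all hypothetical. A retune time of zero reproduces the hard clamp, while longer times let quick movements such as vibrato pass through largely intact.

```java
/** Illustrative retune-speed smoothing; parameter names and units are assumptions. */
final class RetuneGlide {

    /**
     * errorCents[n] is the detected pitch minus the target pitch at analysis frame n.
     * retuneMs is the assumed retune time constant (0 = snap immediately).
     * frameMs is the spacing between analysis frames.
     * Returns the pitch error that remains in each frame after correction.
     */
    static double[] apply(double[] errorCents, double retuneMs, double frameMs) {
        // Fraction of the remaining gap closed per frame (one-pole smoothing).
        double alpha = retuneMs <= 0.0 ? 1.0 : 1.0 - Math.exp(-frameMs / retuneMs);
        double[] remaining = new double[errorCents.length];
        double correction = 0.0;
        for (int n = 0; n < errorCents.length; n++) {
            // Move the correction part of the way toward fully cancelling the error.
            correction += alpha * (-errorCents[n] - correction);
            remaining[n] = errorCents[n] + correction;
        }
        return remaining;
    }
}
```

With a steady 50-cent error, 10 ms frames and a 100 ms retune time, the remaining error decays smoothly toward zero instead of vanishing in a single frame.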
If you've got any questions about TunaFish, feel free to drop
me an email.