TunaFish performs non-realtime pitch
correction on monophonic .wav files, hopefully without introducing audible
artifacts.
For example, if you have a monophonic recording of a singer that's
off-pitch in places, you can use TunaFish to correct the errors and put
the singer back in tune. TunaFish should work "cleanly", so the
average listener can't tell it's been altered.
TunaFish is written in Java, so it runs on
Windows, Linux and OS X.
What Can't TunaFish Do?
It can't work miracles.
It's not realtime.
It can only work with monophonic music.
It can only handle one file at a time.
It can only work with monophonic .wav files.
It can't be used as a plugin to your music program.
Task List
Currently TunaFish is in the early code stage, so it's not really
usable. Here's a task list of things I expect need to be
done before TunaFish is usable, and their current status:
Finished
Create a trivial Swing window for TunaFish with pull-down
menus.
Load monophonic .wav files
Play back the .wav file
Trivially modify the .wav file (e.g., change the volume or pitch)
Play back the modified .wav file
Save the modified .wav file
Display the waveform in a scrollable window
Perform pitch analysis
Implement retune speed logic
Write cubic spline resampling
In Progress
Zoom in and out of the wave
Add a timeline for playback.
Display pitch analysis
Pending
Perform pitch correction
Make the user interface interactive (i.e.: the user can
select notes to adjust)
Clean up the user interface and make the application usable
Write documentation
Technological Overview
Determining the Pitch
Before TunaFish can correct the pitch, it first has to determine what pitch is being
sung at a particular point in time. Common approaches include:
Zero-crossing applies a bandpass filter to remove harmonics higher than
the fundamental, and then searches for sign changes (where the wave
crosses from negative to positive, or from positive to negative) and
measures the distance between the crossings. Because wave shapes are
complex and change over time, this method tends to be the most
error-prone.
Autocorrelation tries to find the period of the signal by "sliding" the
target wave forward until it comes into phase with another segment of
the wave. The distance between the start and the point of alignment is
the period of that wave; the frequency is the sample rate divided by
that distance.
Fourier analysis analyzes the wave for harmonic content. This is quite
popular, but it requires an understanding of FFT or DFT. One of these
days I'll get around to learning FFT, but probably not for this project.
TunaFish uses autocorrelation, mainly because it gives excellent results. It's also trivial to implement.
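As a rough illustration of the autocorrelation idea described above, here
is a minimal sketch. It is not TunaFish's actual code: the class name,
method name and parameters are made up for this example, and the 95%
threshold used to favor the shortest matching lag (so the estimate doesn't
land an octave too low) is an assumption of mine.

```java
/** Hypothetical helper, not part of TunaFish: estimates pitch by autocorrelation. */
final class AutocorrelationPitch {

    /**
     * Returns the estimated fundamental frequency in Hz, or -1 if no lag in
     * the search range correlates positively with the signal.
     *
     * @param samples    mono samples on any linear scale
     * @param sampleRate samples per second, e.g. 44100
     * @param minHz      lowest pitch to search for
     * @param maxHz      highest pitch to search for
     */
    static double estimateHz(double[] samples, double sampleRate,
                             double minHz, double maxHz) {
        int minLag = Math.max(1, (int) Math.floor(sampleRate / maxHz));
        int maxLag = Math.min(samples.length / 2, (int) Math.ceil(sampleRate / minHz));

        double[] score = new double[maxLag + 1];
        double best = 0.0;
        for (int lag = minLag; lag <= maxLag; lag++) {
            // Slide the wave forward by `lag` samples and measure how well
            // it lines up with the unshifted wave.
            double sum = 0.0;
            int n = samples.length - lag;
            for (int i = 0; i < n; i++) {
                sum += samples[i] * samples[i + lag];
            }
            score[lag] = sum / n; // average, so longer lags aren't penalized
            best = Math.max(best, score[lag]);
        }
        if (best <= 0.0) {
            return -1.0; // silence, or nothing periodic in range
        }

        // Assumed heuristic: take the shortest lag that scores within 5% of
        // the best one, then climb to the top of that peak.
        double threshold = 0.95 * best;
        for (int lag = minLag; lag <= maxLag; lag++) {
            if (score[lag] >= threshold) {
                while (lag < maxLag && score[lag + 1] > score[lag]) {
                    lag++;
                }
                // The lag is the period in samples, so divide to get Hz.
                return sampleRate / lag;
            }
        }
        return -1.0;
    }
}
```

Because the lag is a whole number of samples, the result is only accurate
to within about one sample of the true period.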
Changing the Pitch
Once TunaFish has determined where the pitch should be, it becomes
necessary to resynthesize a new wave at the new frequency. Again, there
are a number of available approaches, including:
Fourier analysis can be used to determine the harmonic content of the
wave, and recreate the wave at a new frequency. However, it has a
tendency to introduce "smearing" and "pre-echo" audio artifacts into the
sound, which makes it less than an ideal candidate.
PSOLA (Pitch-Synchronous Overlap Add) reuses cycles from the original
wave to recreate the wave by spacing the cycles at the new frequency.
Pitch is shifted without changing the characteristics of the sound
(i.e., the spacing of the formants is preserved), but it introduces a
distinct "chorus" effect, which makes it obvious the sound has been
processed.
Resampling reads the original samples back at the new frequency. Unlike
Fourier analysis and PSOLA, it alters all the frequencies in the sound,
instead of only altering the fundamental frequency, which at the extreme
range can lead to the singer sounding like a chipmunk.
The primary argument for using a more complex method (i.e., Fourier
analysis or PSOLA) is to properly handle formant correction.
However, because pitches shouldn't be moved by more than a semitone or
so, this shouldn't be much of an issue. A semitone is a frequency ratio
of 2^(1/12), or about 1.06, so the formants move by no more than about 6%.
Since resampling introduces the fewest audible artifacts, it's the
preferred method for changing the pitch.
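To make the resampling idea concrete, here is a minimal sketch. It is an
illustration only, not TunaFish's code: the class and method names are
hypothetical, and it uses Catmull-Rom cubic interpolation as a stand-in,
which may not be the same cubic spline listed in the task list. It also
ignores the fact that resampling changes the length of the audio.

```java
/** Hypothetical helper, not part of TunaFish: pitch-shifts by resampling. */
final class CubicResampler {

    /**
     * Reads `in` at `ratio` times the original speed. A ratio above 1 raises
     * the pitch (and shortens the audio); a ratio below 1 lowers it.
     */
    static double[] resample(double[] in, double ratio) {
        if (ratio <= 0.0) {
            throw new IllegalArgumentException("ratio must be positive");
        }
        if (in.length == 0) {
            return new double[0];
        }
        int outLen = (int) Math.floor((in.length - 1) / ratio) + 1;
        double[] out = new double[outLen];
        for (int j = 0; j < outLen; j++) {
            double pos = j * ratio;        // fractional read position in the input
            int i = (int) Math.floor(pos);
            double t = pos - i;            // how far we are between sample i and i + 1
            // Four neighboring samples, clamped at the ends of the buffer.
            double p0 = in[Math.max(i - 1, 0)];
            double p1 = in[i];
            double p2 = in[Math.min(i + 1, in.length - 1)];
            double p3 = in[Math.min(i + 2, in.length - 1)];
            // Catmull-Rom cubic: passes through p1 at t = 0 and p2 at t = 1.
            out[j] = p1 + 0.5 * t * (p2 - p0
                    + t * (2.0 * p0 - 5.0 * p1 + 4.0 * p2 - p3
                    + t * (3.0 * (p1 - p2) + p3 - p0)));
        }
        return out;
    }
}
```

To raise the pitch by one semitone, the ratio would be
Math.pow(2.0, 1.0 / 12.0), which is about 1.0595.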
Finding the Target Pitch
A naive
approach to pitch correction would be to have TunaFish clamp all
out-of-tune pitches to their correct pitch. Unfortunately, this leads
to the so-called Cher
effect (named after the use of that effect in the song
"Believe").
Unlike
a musical instrument like a flute or violin, a singer can't just leap
directly from one pitch to the next. Instead, their voice "slides" from
one pitch to the next.
Further, expressive effects such as
vibrato (which intentionally move the pitch above and below the pitch
center) would be lost if TunaFish clamped pitches to set frequencies.
The simplest solution is to implement a retune speed
parameter, which determines how quickly TunaFish should pull an errant
pitch back to the correct pitch.
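Here is one possible reading of a retune speed, sketched in Java. None of
this is taken from TunaFish: the class, the method names and the 0-to-1
speed scale are hypothetical, and the "correct" pitch is assumed to be the
nearest equal-tempered semitone with A4 = 440 Hz. Each frame's correction
is eased toward the full correction, so a speed of 1 reproduces the hard
clamp and smaller values let slides and vibrato through.

```java
/** Hypothetical helper, not part of TunaFish: eases pitches toward a target. */
final class RetuneSketch {

    /** Nearest equal-tempered pitch in Hz, assuming A4 = 440 Hz and a chromatic scale. */
    static double nearestSemitoneHz(double hz) {
        double semitonesFromA4 = 12.0 * Math.log(hz / 440.0) / Math.log(2.0);
        return 440.0 * Math.pow(2.0, Math.round(semitonesFromA4) / 12.0);
    }

    /**
     * Pulls each detected pitch toward its nearest semitone.
     *
     * @param detectedHz one detected pitch per analysis frame; every value
     *                   is assumed to be a positive frequency in Hz
     * @param speed      0 leaves the pitch alone, 1 clamps it immediately
     * @return the corrected pitch for each frame, in Hz
     */
    static double[] retune(double[] detectedHz, double speed) {
        double[] out = new double[detectedHz.length];
        double correction = 0.0; // correction currently applied, in semitones
        for (int i = 0; i < detectedHz.length; i++) {
            double target = nearestSemitoneHz(detectedHz[i]);
            // How far off this frame is, in semitones (negative means sharp).
            double error = 12.0 * Math.log(target / detectedHz[i]) / Math.log(2.0);
            // Move only part of the way toward the full correction each frame.
            correction += speed * (error - correction);
            out[i] = detectedHz[i] * Math.pow(2.0, correction / 12.0);
        }
        return out;
    }
}
```

For a steady note that is sharp at 450 Hz, a speed of 0.5 moves the first
frame only part of the way to 440 Hz, and later frames settle closer to it.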
Questions?
If you've got any questions about TunaFish, feel free to drop
me an email.