Implementation of the TDStretch class
The SoundTouch class member function putSamples(const SAMPLETYPE *samples, uint nSamples) is implemented as follows. As analyzed in the previous article, rate is a ratio: a value greater than 1 speeds playback up, and a value less than 1 slows it down.
......
#ifndef PREVENT_CLICK_AT_RATE_CROSSOVER
else if (rate <= 1.0f)
{
    // transpose the rate down, output the transposed sound to tempo changer buffer
    assert(output == pTDStretch);
    pRateTransposer->putSamples(samples, nSamples);
    pTDStretch->moveSamples(*pRateTransposer);
}
else
#endif
{
    // evaluate the tempo changer, then transpose the rate up,
    assert(output == pRateTransposer);
    pTDStretch->putSamples(samples, nSamples);
    pRateTransposer->moveSamples(*pTDStretch);
}
......
In the first branch, pRateTransposer->putSamples(samples, nSamples) resamples the sound using linear interpolation, and pTDStretch->moveSamples(*pRateTransposer) then moves the result into pTDStretch, which is an instance of the TDStretch class. The TDStretch class is defined as follows:
/// Class that does the time-stretch (tempo change) effect for the processed
/// sound.
class TDStretch : public FIFOProcessor
{
protected:
    int channels;
    int sampleReq;
    float tempo;
    SAMPLETYPE *pMidBuffer;
    SAMPLETYPE *pRefMidBuffer;
    SAMPLETYPE *pRefMidBufferUnaligned;
    int overlapLength;
    int seekLength;
    int seekWindowLength;
    int overlapDividerBits;
    int slopingDivider;
    float nominalSkip;
    float skipFract;
    FIFOSampleBuffer outputBuffer;
    FIFOSampleBuffer inputBuffer;
    BOOL bQuickSeek;
    // int outDebt;
    // BOOL bMidBufferDirty;
    int sampleRate;
    int sequenceMs;
    int seekWindowMs;
    int overlapMs;
    BOOL bAutoSeqSetting;
    BOOL bAutoSeekSetting;
    void acceptNewOverlapLength(int newOverlapLength);
    virtual void clearCrossCorrState();
    void calculateOverlapLength(int overlapMs);
    virtual LONG_SAMPLETYPE calcCrossCorrStereo(const SAMPLETYPE *mixingPos, const SAMPLETYPE *compare) const;
    virtual LONG_SAMPLETYPE calcCrossCorrMono(const SAMPLETYPE *mixingPos, const SAMPLETYPE *compare) const;
    virtual int seekBestOverlapPositionStereo(const SAMPLETYPE *refPos);
    virtual int seekBestOverlapPositionStereoQuick(const SAMPLETYPE *refPos);
    virtual int seekBestOverlapPositionMono(const SAMPLETYPE *refPos);
    virtual int seekBestOverlapPositionMonoQuick(const SAMPLETYPE *refPos);
    int seekBestOverlapPosition(const SAMPLETYPE *refPos);
    virtual void overlapStereo(SAMPLETYPE *output, const SAMPLETYPE *input) const;
    virtual void overlapMono(SAMPLETYPE *output, const SAMPLETYPE *input) const;
    void clearMidBuffer();
    void overlap(SAMPLETYPE *output, const SAMPLETYPE *input, uint ovlPos) const;
    void precalcCorrReferenceMono();
    void precalcCorrReferenceStereo();
    void calcSeqParameters();
    /// Changes the tempo of the given sound samples.
    /// Returns amount of samples returned in the "output" buffer.
    /// The maximum amount of samples that can be returned at a time is set by
    /// the 'set_returnBuffer_size' function.
    void processSamples();
public:
    TDStretch();
    virtual ~TDStretch();
    /// Operator 'new' is overloaded so that it automatically creates a suitable instance
    /// depending on if we've a MMX/SSE/etc-capable CPU available or not.
    static void *operator new(size_t s);
    /// Use this function instead of "new" operator to create a new instance of this class.
    /// This function automatically chooses a correct feature set depending on if the CPU
    /// supports MMX/SSE/etc extensions.
    static TDStretch *newInstance();
    /// Returns the output buffer object
    FIFOSamplePipe *getOutput() { return &outputBuffer; };
    /// Returns the input buffer object
    FIFOSamplePipe *getInput() { return &inputBuffer; };
    /// Sets new target tempo. Normal tempo = 'SCALE', smaller values represent slower
    /// tempo, larger faster tempo.
    void setTempo(float newTempo);
    /// Returns nonzero if there aren't any samples available for outputting.
    virtual void clear();
    /// Clears the input buffer
    void clearInput();
    /// Sets the number of channels, 1 = mono, 2 = stereo
    void setChannels(int numChannels);
    /// Enables/disables the quick position seeking algorithm. Zero to disable,
    /// nonzero to enable
    void enableQuickSeek(BOOL enable);
    /// Returns nonzero if the quick seeking algorithm is enabled.
    BOOL isQuickSeekEnabled() const;
    /// Sets routine control parameters. These controls are certain time constants
    /// defining how the sound is stretched to the desired duration.
    //
    /// 'sampleRate' = sample rate of the sound
    /// 'sequenceMS' = one processing sequence length in milliseconds
    /// 'seekwindowMS' = seeking window length for scanning the best overlapping
    ///      position
    /// 'overlapMS' = overlapping length
    void setParameters(int sampleRate,        ///< Samplerate of sound being processed (Hz)
                       int sequenceMS = -1,   ///< Single processing sequence length (ms)
                       int seekwindowMS = -1, ///< Offset seeking window length (ms)
                       int overlapMS = -1     ///< Sequence overlapping length (ms)
                       );
    /// Get routine control parameters, see setParameters() function.
    /// Any of the parameters to this function can be NULL, in such case corresponding parameter
    /// value isn't returned.
    void getParameters(int *pSampleRate, int *pSequenceMs, int *pSeekWindowMs, int *pOverlapMs) const;
    /// Adds 'numsamples' pcs of samples from the 'samples' memory position into
    /// the input of the object.
    virtual void putSamples(
        const SAMPLETYPE *samples, ///< Input sample data
        uint numSamples            ///< Number of samples in 'samples' so that one sample
                                   ///< contains both channels if stereo
        );
};
Inheritance relationship between the TDStretch class and its base classes:
FIFOSamplePipe -> FIFOProcessor -> TDStretch
Let's first look at its constructor.
TDStretch::TDStretch() : FIFOProcessor(&outputBuffer)
{
    bQuickSeek = FALSE;
    channels = 2;
    pMidBuffer = NULL;
    pRefMidBufferUnaligned = NULL;
    overlapLength = 0;
    bAutoSeqSetting = TRUE;
    bAutoSeekSetting = TRUE;
    // outDebt = 0;
    skipFract = 0;
    tempo = 1.0f;
    setParameters(44100, DEFAULT_SEQUENCE_MS, DEFAULT_SEEKWINDOW_MS, DEFAULT_OVERLAP_MS);
    setTempo(1.0f);
    clear();
}
The constructor initializes some parameters.
Next, let's look at the member function setParameters(), implemented in the source file TDStretch.cpp:
// Sets routine control parameters. These controls are certain time constants
// defining how the sound is stretched to the desired duration.
//
// 'sampleRate' = sample rate of the sound
// 'sequenceMS' = one processing sequence length in milliseconds (default = 82 ms)
// 'seekwindowMS' = seeking window length for scanning the best overlapping
//      position (default = 28 ms)
// 'overlapMS' = overlapping length (default = 12 ms)
void TDStretch::setParameters(int aSampleRate, int aSequenceMS,
                              int aSeekWindowMS, int aOverlapMS)
{
    // accept only positive parameter values - if zero or negative, use old values instead
    if (aSampleRate > 0) this->sampleRate = aSampleRate;
    if (aOverlapMS > 0) this->overlapMs = aOverlapMS;
    if (aSequenceMS > 0)
    {
        this->sequenceMs = aSequenceMS;
        bAutoSeqSetting = FALSE;
    }
    else if (aSequenceMS == 0)
    {
        // if zero, use automatic setting
        bAutoSeqSetting = TRUE;
    }
    if (aSeekWindowMS > 0)
    {
        this->seekWindowMs = aSeekWindowMS;
        bAutoSeekSetting = FALSE;
    }
    else if (aSeekWindowMS == 0)
    {
        // if zero, use automatic setting
        bAutoSeekSetting = TRUE;
    }
    calcSeqParameters();
    calculateOverlapLength(overlapMs);
    // set tempo to recalculate 'sampleReq'
    setTempo(tempo);
}
The main parameters are calculated by the following three member functions:
calcSeqParameters();
calculateOverlapLength(overlapMs);
// set tempo to recalculate 'sampleReq'
setTempo(tempo);
From the implementation of these member functions we can see that calcSeqParameters() computes seekWindowLength and seekLength. Both come from the same simple conversion of a duration in milliseconds into a number of samples: length = (sampleRate * ms) / 1000.
/// Calculates processing sequence length according to tempo setting
void TDStretch::calcSeqParameters()
{
    // Adjust tempo param according to tempo, so that variating processing sequence length is used
    // at various tempo settings, between the given low...top limits
    #define AUTOSEQ_TEMPO_LOW   0.5     // auto setting low tempo range (-50%)
    #define AUTOSEQ_TEMPO_TOP   2.0     // auto setting top tempo range (+100%)
    // sequence-ms setting values at above low & top tempo
    #define AUTOSEQ_AT_MIN      125.0
    #define AUTOSEQ_AT_MAX      50.0
    #define AUTOSEQ_K           ((AUTOSEQ_AT_MAX - AUTOSEQ_AT_MIN) / (AUTOSEQ_TEMPO_TOP - AUTOSEQ_TEMPO_LOW))
    #define AUTOSEQ_C           (AUTOSEQ_AT_MIN - (AUTOSEQ_K) * (AUTOSEQ_TEMPO_LOW))
    // seek-window-ms setting values at above low & top tempo
    #define AUTOSEEK_AT_MIN     25.0
    #define AUTOSEEK_AT_MAX     15.0
    #define AUTOSEEK_K          ((AUTOSEEK_AT_MAX - AUTOSEEK_AT_MIN) / (AUTOSEQ_TEMPO_TOP - AUTOSEQ_TEMPO_LOW))
    #define AUTOSEEK_C          (AUTOSEEK_AT_MIN - (AUTOSEEK_K) * (AUTOSEQ_TEMPO_LOW))
    #define CHECK_LIMITS(x, mi, ma) (((x) < (mi)) ? (mi) : (((x) > (ma)) ? (ma) : (x)))
    double seq, seek;
    if (bAutoSeqSetting)
    {
        seq = AUTOSEQ_C + AUTOSEQ_K * tempo;
        seq = CHECK_LIMITS(seq, AUTOSEQ_AT_MAX, AUTOSEQ_AT_MIN);
        sequenceMs = (int)(seq + 0.5);
    }
    if (bAutoSeekSetting)
    {
        seek = AUTOSEEK_C + AUTOSEEK_K * tempo;
        seek = CHECK_LIMITS(seek, AUTOSEEK_AT_MAX, AUTOSEEK_AT_MIN);
        seekWindowMs = (int)(seek + 0.5);
    }
    // Update seek window lengths
    seekWindowLength = (sampleRate * sequenceMs) / 1000;
    if (seekWindowLength < 2 * overlapLength)
    {
        seekWindowLength = 2 * overlapLength;
    }
    seekLength = (sampleRate * seekWindowMs) / 1000;
}
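To see what the macros in calcSeqParameters() work out to, here is a minimal stand-alone sketch. The helper names autoSettingMs and msToSamples are hypothetical and not part of SoundTouch; the sketch only reproduces the linear mapping from tempo to milliseconds, the clamping, and the millisecond-to-sample conversion, under the assumption that the tempo range is fixed at 0.5 to 2.0 as in the listing.

```cpp
#include <cassert>

// Hypothetical helper (not SoundTouch API): maps 'tempo' linearly onto a
// millisecond setting and clamps the result to the range [atTop, atLow].
//   atLow = value used at tempo 0.5 (the larger value, e.g. 125 ms)
//   atTop = value used at tempo 2.0 (the smaller value, e.g. 50 ms)
int autoSettingMs(double tempo, double atLow, double atTop)
{
    const double tempoLow = 0.5;
    const double tempoTop = 2.0;
    assert(atTop <= atLow);

    const double k = (atTop - atLow) / (tempoTop - tempoLow);  // slope
    const double c = atLow - k * tempoLow;                     // intercept
    double v = c + k * tempo;
    if (v < atTop) v = atTop;   // clamp, like CHECK_LIMITS in the listing
    if (v > atLow) v = atLow;
    return (int)(v + 0.5);      // round to the nearest millisecond
}

// Hypothetical helper: milliseconds to a sample count, using the same
// truncating integer arithmetic as the listing.
int msToSamples(int sampleRate, int ms)
{
    return (sampleRate * ms) / 1000;
}
```

At tempo 1.0, autoSettingMs(1.0, 125.0, 50.0) gives a 100 ms sequence and autoSettingMs(1.0, 25.0, 15.0) gives a 22 ms seek window. At 44100 Hz, msToSamples turns these into 4410 and 970 samples.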
The member function calculateOverlapLength() converts the overlap duration into a length in samples:
/// Calculates overlapInMsec period length in samples.
void TDStretch::calculateOverlapLength(int overlapInMsec)
{
    int newOvl;
    assert(overlapInMsec >= 0);
    newOvl = (sampleRate * overlapInMsec) / 1000;
    if (newOvl < 16) newOvl = 16;
    // must be divisible by 8
    newOvl -= newOvl % 8;
    acceptNewOverlapLength(newOvl);
}
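The rounding rules in calculateOverlapLength() are easy to check in isolation. The sketch below copies only the arithmetic into a free function; the name overlapSamples is hypothetical, and the sample rates and durations in the note are example inputs rather than values taken from SoundTouch.

```cpp
#include <cassert>

// Hypothetical helper (not SoundTouch API): overlap duration in milliseconds
// to a sample count that is at least 16 and a multiple of 8.
int overlapSamples(int sampleRate, int overlapMs)
{
    assert(overlapMs >= 0);
    int n = (sampleRate * overlapMs) / 1000;  // truncating conversion
    if (n < 16) n = 16;                       // enforce a minimum length
    return n - n % 8;                         // round down to a multiple of 8
}
```

For example, 12 ms at 44100 Hz is 529 samples after truncation and 528 after rounding down to a multiple of 8, while 1 ms at 8000 Hz is raised from 8 to the minimum of 16.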
The member function acceptNewOverlapLength() allocates the memory needed for the overlapping part:
/// Set new overlap length parameter & reallocate RefMidBuffer if necessary.
void TDStretch::acceptNewOverlapLength(int newOverlapLength)
{
    int prevOvl;
    assert(newOverlapLength >= 0);
    prevOvl = overlapLength;
    overlapLength = newOverlapLength;
    if (overlapLength > prevOvl)
    {
        delete[] pMidBuffer;
        delete[] pRefMidBufferUnaligned;
        pMidBuffer = new SAMPLETYPE[overlapLength * 2];
        clearMidBuffer();
        pRefMidBufferUnaligned = new SAMPLETYPE[2 * overlapLength + 16 / sizeof(SAMPLETYPE)];
        // ensure that 'pRefMidBuffer' is aligned to 16 byte boundary for efficiency
        pRefMidBuffer = (SAMPLETYPE *)((((ulong)pRefMidBufferUnaligned) + 15) & (ulong)-16);
    }
}
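The alignment trick on the last line of acceptNewOverlapLength() is worth isolating: add 15 to the address, then clear the low four bits. The sketch below shows the same idea with std::uintptr_t. The names alignAddress16 and alignPointer16 are hypothetical, and using std::uintptr_t instead of the ulong cast from the listing is my own substitution, chosen because it is wide enough to hold a pointer on every platform that provides it.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: rounds an address up to the next multiple of 16.
std::uintptr_t alignAddress16(std::uintptr_t addr)
{
    return (addr + 15) & ~static_cast<std::uintptr_t>(15);
}

// Hypothetical helper: the same operation applied to a float pointer. The
// caller must have over-allocated by at least 16 bytes so that the aligned
// pointer still leaves enough room inside the allocation.
float *alignPointer16(float *p)
{
    assert(p != nullptr);
    return reinterpret_cast<float *>(
        alignAddress16(reinterpret_cast<std::uintptr_t>(p)));
}
```

This is why the listing allocates 16 / sizeof(SAMPLETYPE) extra samples: the aligned pointer can sit up to 15 bytes past the start of the raw allocation, and the unaligned pointer has to be kept around for delete[].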
The member function setTempo() sets a new time-stretch factor:
// Sets new target tempo. Normal tempo = 'SCALE', smaller values represent slower
// tempo, larger faster tempo.
void TDStretch::setTempo(float newTempo)
{
    int intskip;
    tempo = newTempo;
    // Calculate new sequence duration
    calcSeqParameters();
    // Calculate ideal skip length (according to tempo value)
    nominalSkip = tempo * (seekWindowLength - overlapLength);
    intskip = (int)(nominalSkip + 0.5f);
    // Calculate how many samples are needed in the 'inputBuffer' to
    // process another batch of samples
    // sampleReq = max(intskip + overlapLength, seekWindowLength) + seekLength / 2;
    sampleReq = max(intskip + overlapLength, seekWindowLength) + seekLength;
}
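A quick way to get a feel for sampleReq is to plug numbers into the last three statements of setTempo(). The function samplesRequired below is a hypothetical stand-alone copy of that arithmetic. The lengths used in the note (4410, 352, and 970 samples) are assumed example values for 44100 Hz, and they are held fixed across tempos for illustration even though calcSeqParameters() would normally change them.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper (not SoundTouch API): how many input samples must be
// buffered before one more sequence can be processed.
int samplesRequired(float tempo, int seekWindowLength, int overlapLength,
                    int seekLength)
{
    assert(seekWindowLength >= overlapLength);
    // Ideal distance to advance in the input per processed sequence
    const float nominalSkip = tempo * (seekWindowLength - overlapLength);
    const int intSkip = (int)(nominalSkip + 0.5f);
    // Enough for the skip plus overlap, or one whole sequence, whichever is
    // larger, plus room for the best-offset search
    return std::max(intSkip + overlapLength, seekWindowLength) + seekLength;
}
```

With these lengths, tempo 1.0 needs 5380 samples and tempo 2.0 needs 9438. At tempo 0.5 the skip is shorter than a sequence, so the requirement falls back to seekWindowLength + seekLength = 5380.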
So far we have only noted down the parameters used by the time stretcher. Now let's look at what these parameters actually mean.
The SOLA (synchronized overlap-add) algorithm is generally used for audio time scaling, as shown in the figure.
The algorithm is roughly as follows:
Take a block of a certain size from the beginning of the original sound data. Suppose seven samples are taken and put in a new buffer, as shown in the figure. Then nine samples are taken from the data that follows in the original and superimposed on the previous seven samples. Assuming the overlap is 2 samples, (7-2)/9 = 0.555, which means the sound duration is reduced by about 44.5% compared with the original. Note that the time interval between samples (the sampling frequency) has not changed, so the frequency (pitch) of the sound has not changed either. The reason for overlapping part of the data is to suppress the noise or unnatural sound that would otherwise be caused by the discontinuities where non-adjacent segments are joined. Reading the figure alongside the three TDStretch member functions above makes the meaning of the initialized parameters easier to understand, and the pitch-shifting process becomes clearer as well. It is consistent with the conditional in the putSamples member function of the SoundTouch class: the only question is whether to time-stretch first and then resample, or to resample first and then time-stretch.
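To make the overlap-add idea concrete, here is a minimal sketch. It is not SoundTouch code: stretchOla and its parameters are hypothetical names, and it deliberately leaves out the best-offset search, so it is plain overlap-add rather than full SOLA. Each pass emits seqLen - overlap output samples while the read position advances by tempo * (seqLen - overlap) input samples, which is what changes the duration without changing the sample rate.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified overlap-add time stretch (no best-offset search).
// tempo > 1 shortens the sound, tempo < 1 lengthens it.
std::vector<float> stretchOla(const std::vector<float> &in, double tempo,
                              int seqLen, int overlap)
{
    assert(tempo > 0.0 && overlap > 0 && seqLen > 2 * overlap);
    std::vector<float> out;
    std::vector<float> tail(overlap, 0.0f);  // end of the previous sequence
    const double skip = tempo * (seqLen - overlap);
    double pos = 0.0;                        // read position in the input

    while ((std::size_t)pos + seqLen <= in.size())
    {
        const float *seq = &in[(std::size_t)pos];
        // 1. Cross-fade the saved tail into the start of this sequence.
        //    The tail starts out as silence, so the first pass fades in.
        for (int i = 0; i < overlap; i++)
            out.push_back((seq[i] * i + tail[i] * (overlap - i)) / overlap);
        // 2. Copy the middle of the sequence unchanged.
        for (int i = overlap; i < seqLen - overlap; i++)
            out.push_back(seq[i]);
        // 3. Save the end of the sequence for the next cross-fade.
        for (int i = 0; i < overlap; i++)
            tail[i] = seq[seqLen - overlap + i];
        // 4. Advance in the input by the tempo-scaled skip.
        pos += skip;
    }
    return out;
}
```

With 1000 input samples, seqLen = 100 and overlap = 10, a tempo of 2.0 yields 540 output samples and a tempo of 0.5 yields 1890. Because the read position is chosen without looking at the waveform, the cross-faded segments may be out of phase, and that is the problem the correlation search is there to solve.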
The concrete SOLA procedure is expressed clearly by the processSamples member function of the TDStretch class. It first copies a sequence to the output, then finds the best overlap position by computing a normalized cross-correlation. This search is done by the member function seekBestOverlapPosition(const SAMPLETYPE *refPos), which checks whether the sound is mono or stereo and calls the corresponding TDStretch::seekBestOverlapPositionXXXX(const SAMPLETYPE *refPos). There are floating-point and fixed-point versions; the mono floating-point version is used as the example here:
int TDStretch::seekBestOverlapPositionMono(const SAMPLETYPE *refPos)
{
    int bestOffs;
    double bestCorr, corr;
    int tempOffset;
    const SAMPLETYPE *compare;
    // Slopes the amplitude of the 'midBuffer' samples
    precalcCorrReferenceMono();
    bestCorr = FLT_MIN;
    bestOffs = 0;
    // Scans for the best correlation value by testing each possible position
    // over the permitted range.
    for (tempOffset = 0; tempOffset < seekLength; tempOffset++)
    {
        compare = refPos + tempOffset;
        // Calculates correlation value for the mixing position corresponding
        // to 'tempOffset'
        corr = (double)calcCrossCorrMono(pRefMidBuffer, compare);
        // heuristic rule to slightly favor values close to mid of the range
        double tmp = (double)(2 * tempOffset - seekLength) / seekLength;
        corr = ((corr + 0.1) * (1.0 - 0.25 * tmp * tmp));
        // Checks for the highest correlation value
        if (corr > bestCorr)
        {
            bestCorr = corr;
            bestOffs = tempOffset;
        }
    }
    // clear cross correlation routine state if necessary (is so e.g. in MMX routines).
    clearCrossCorrState();
    return bestOffs;
}
The member function seekBestOverlapPositionMono calls the member function calcCrossCorrMono():
double TDStretch::calcCrossCorrMono(const float *mixingPos, const float *compare) const
{
    double corr;
    double norm;
    int i;
    corr = norm = 0;
    for (i = 1; i < overlapLength; i++)
    {
        corr += mixingPos[i] * compare[i];
        norm += mixingPos[i] * mixingPos[i];
    }
    if (norm < 1e-9) norm = 1.0; // to avoid div by zero
    return corr / sqrt(norm);
}
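The search and the correlation are easier to follow together on a toy signal. The sketch below is a simplified stand-in, not the SoundTouch code: crossCorr and bestOffset are hypothetical names, the loop starts at index 0 rather than 1, the "favor the middle" heuristic is left out, and the running best is seeded from offset 0 instead of FLT_MIN. FLT_MIN is the smallest positive float, not the most negative one, so in the listing an offset whose weighted correlation is zero or negative can never replace the initial bestOffs = 0.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Correlation of 'ref' against 'cmp' over n samples, normalized by the
// energy of 'ref' only (the same normalization as in the listing).
double crossCorr(const float *ref, const float *cmp, int n)
{
    double corr = 0.0, norm = 0.0;
    for (int i = 0; i < n; i++)
    {
        corr += ref[i] * cmp[i];
        norm += ref[i] * ref[i];
    }
    if (norm < 1e-9) norm = 1.0;  // avoid division by zero
    return corr / std::sqrt(norm);
}

// Returns the offset in [0, seekLength) at which 'ref' matches 'signal' best.
int bestOffset(const std::vector<float> &signal, const std::vector<float> &ref,
               int seekLength)
{
    const int n = (int)ref.size();
    assert(n > 0 && seekLength > 0);
    assert((int)signal.size() >= seekLength + n - 1);

    int best = 0;
    double bestCorr = crossCorr(ref.data(), signal.data(), n);
    for (int offs = 1; offs < seekLength; offs++)
    {
        const double c = crossCorr(ref.data(), signal.data() + offs, n);
        if (c > bestCorr)
        {
            bestCorr = c;
            best = offs;
        }
    }
    return best;
}
```

If the pattern {1, 2, 3} is hidden at index 5 of an otherwise silent signal, bestOffset finds offset 5, because that is the only window where every product contributes.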
The usual formula for cross-correlation is corr = x(n) * h(-n), where * denotes convolution. The code takes a slightly different form, which I will analyze in detail another time. Finally, the next sequence is copied to the overlap position. The amplitude of the overlapping part is computed by the TDStretch member function overlap, which, as the code below shows, dispatches to a mono or a stereo member function to process the audio stream. The mono version is used as the example because it is easier to understand. The stereo version is essentially the same, except that the loop index advances by two and each iteration also processes one sample for the other (left or right) channel.
// Overlaps samples in 'midBuffer' with the samples in 'pInputBuffer' at position
// of 'ovlPos'.
inline void TDStretch::overlap(SAMPLETYPE *pOutput, const SAMPLETYPE *pInput, uint ovlPos) const
{
    if (channels == 2)
    {
        // stereo sound
        overlapStereo(pOutput, pInput + 2 * ovlPos);
    } else {
        // mono sound.
        overlapMono(pOutput, pInput + ovlPos);
    }
}
The member function overlapMono is implemented as follows:
// Overlaps samples in 'midBuffer' with the samples in 'pInput'
void TDStretch::overlapMono(SAMPLETYPE *pOutput, const SAMPLETYPE *pInput) const
{
    int i, itemp;
    for (i = 0; i < overlapLength; i++)
    {
        itemp = overlapLength - i;
        pOutput[i] = (pInput[i] * i + pMidBuffer[i] * itemp) / overlapLength; // >> overlapDividerBits;
    }
}
pMidBuffer is overlapped with pInput, and the overlap length is overlapLength.
Note that the core of the algorithm is just one line of code: pOutput[i] = (pInput[i] * i + pMidBuffer[i] * itemp) / overlapLength; Let a = i, b = itemp, k = overlapLength, x = pInput[i], y = pMidBuffer[i], and z = pOutput[i]. Taking x and y as the two inputs of the system and z as the output, this line can be replaced by the following two lines of pseudocode:
a + b = k;
ax + by = kz;
This looks familiar, but I can't name it offhand. I'm writing it down here and will work out what this algorithm is called later.
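For what it is worth, the two pseudocode lines say that z = (a/k)x + (b/k)y with weights that always sum to one, which is a weighted average of the two inputs. As i runs from 0 to overlapLength, the weight slides linearly from y to x, and this is commonly called a linear cross-fade. A minimal sketch, with the hypothetical name crossFade and std::vector in place of the raw buffers:

```cpp
#include <cassert>
#include <vector>

// Linear cross-fade: the output starts at 'fadeOut' (the role of pMidBuffer)
// and moves towards 'fadeIn' (the role of pInput) over the buffer length.
std::vector<float> crossFade(const std::vector<float> &fadeOut,
                             const std::vector<float> &fadeIn)
{
    assert(fadeOut.size() == fadeIn.size() && !fadeOut.empty());
    const int k = (int)fadeOut.size();
    std::vector<float> z(k);
    for (int i = 0; i < k; i++)
    {
        const int a = i;      // weight of the incoming signal
        const int b = k - i;  // weight of the outgoing signal, a + b == k
        z[i] = (fadeIn[i] * a + fadeOut[i] * b) / k;
    }
    return z;
}
```

Fading from a constant 1 to a constant 0 over four samples gives 1, 0.75, 0.5, 0.25, and fading between two identical signals leaves the signal unchanged, which is why the join does not click when the overlapped segments are well aligned.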
This article is from the CSDN blog. Please credit the source when reproducing it: http://blog.csdn.net/suhetao/archive/2010/09/04/5863477.aspx