TimeStretch


AviSynth+
Up-to-date documentation: https://avisynthplus.readthedocs.io


TimeStretch allows changing the sound tempo, pitch and playback rate parameters independently from each other, i.e.:

tempo adjusts speed while maintaining the original pitch.
pitch adjusts pitch while maintaining the original tempo.
rate adjusts the playback rate, which affects both tempo and pitch at the same time.

You can use these parameters in any combination – for example, 104% tempo with 95% pitch.


Syntax and Parameters

TimeStretch(clip clip [, float tempo, float rate, float pitch,
      int sequence, int seekwindow, int overlap, bool quickseek, int aa ])

TimeStretch(clip clip [, int tempo_n, int tempo_d, int rate_n, int rate_d, int pitch_n, int pitch_d,
      int sequence, int seekwindow, int overlap, bool quickseek, int aa ])

clip  clip =
Source clip. In classic AviSynth, audio is always converted to Float.
In AviSynth+ (AVS+), no conversion is performed; only Float audio is accepted.
Starting from v2.61 multichannel audio is supported.
Prior to v2.61: If clip.AudioChannels=2, special processing is used to preserve stereo imaging. Otherwise, channels are processed independently. Independent processing works well for unrelated audio tracks, but not very well for surround sound; pending the release of AviSynth v2.61, try TimeStretchPlugin instead. [1] [2]
Tempo, Rate and Pitch
float  tempo = 100.0
Changes speed while maintaining the original pitch.
If tempo=200, the audio will play twice (200%) as fast; if tempo=50, the audio will play half (50%) as fast.
The effect is also known as time-stretching.
float  rate = 100.0
Changes speed while allowing pitch to rise or fall, like the traditional analog vari-speed effect.
If rate=200, the audio will play twice (200%) as fast; if rate=50, the audio will play half (50%) as fast.
Rate control is implemented purely by sample rate transposing.[3]
If rate is adjusted by itself, no time-stretching or pitch-shifting is performed, and the Advanced Parameters will have no effect.
float  pitch = 100.0
Changes pitch while maintaining the original speed (within a small tolerance; see Notes below).
If pitch=200, the audio will sound an octave higher; if pitch=50, the audio will sound an octave lower.
The effect is also known as pitch-shifting.
tempo, rate and pitch can all be adjusted independently, in which case their effects are combined.
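As a minimal sketch of combining two of these parameters, the script below applies the 104% tempo with 95% pitch combination mentioned in the introduction. The test source (a blank video clip dubbed with a generated sine tone) is an assumption made only so that the script is self-contained; any clip with Float audio could take its place.

```avisynth
# Stand-in source (assumption): 10 s of 440 Hz sine, which Tone generates as
# Float audio, dubbed onto a blank 24 fps video clip of the same duration.
video = BlankClip(length=240, width=640, height=480, fps=24)
audio = Tone(length=10.0, frequency=440.0, samplerate=48000, channels=2, type="Sine")
AudioDub(video, audio)

# 104% tempo shortens the audio to roughly 10 / 1.04 seconds;
# 95% pitch lowers the tone from 440 Hz to about 418 Hz.
# The video track is not touched by TimeStretch.
TimeStretch(tempo=104.0, pitch=95.0)
```

Because only the audio is changed, the audio track ends up shorter than the video here; in a real script the video would normally be adjusted to match.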
Tempo_n, Tempo_d, Rate_n, Rate_d, Pitch_n and Pitch_d
When more accuracy is needed, use the rational pair parameters tempo_n, tempo_d, rate_n, rate_d, pitch_n and pitch_d instead. All of them are integers with a default value of 1, and each pair is a plain ratio rather than a percentage (for example, pitch_n=2, pitch_d=1 corresponds to pitch=200). Internally, tempo is calculated as double(tempo_n/tempo_d) (rate and pitch likewise) before further processing. Seeking should be sample exact.
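A hedged sketch of the rational form, using the NTSC Film to PAL speed-up as the ratio: 25 / (24000/1001) = 25025/24000, which reduces to 1001/960. The generated test source and the pairing with AssumeFPS are assumptions for illustration, not part of the filter itself.

```avisynth
# Stand-in source (assumption): 240 frames at 24000/1001 fps (about 10.01 s)
# with a sine tone of matching length as Float audio.
video = BlankClip(length=240, fps=24000, fps_denominator=1001)
audio = Tone(length=10.01, frequency=440.0, samplerate=48000, channels=2)
AudioDub(video, audio)

# Speed up the video only; sync_audio=false leaves the audio untouched.
AssumeFPS(25, 1, sync_audio=false)

# Speed up the audio by the same ratio without changing pitch.
# Percentage form for comparison: tempo=100.0*25.0/(24000.0/1001.0)
TimeStretch(tempo_n=1001, tempo_d=960)
```

With these numbers the video lasts 240/25 = 9.6 s and the audio is stretched to 10.01 × 960/1001 = 9.6 s, so the two tracks should stay aligned apart from the rounding described in the Notes.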
Advanced Parameters
The time-stretch algorithm has a few parameters that can be tuned to optimize sound quality for certain applications. The current default parameters have been chosen by iterative if-then analysis (read: "trial and error") to obtain the best subjective sound quality in pop/rock music processing, but in applications that process other kinds of sound, the default parameter set may give a sub-optimal result.
These parameters affect the time-stretch algorithm as follows:
int  sequence = 100 *
This is the length of a single processing sequence in milliseconds, which determines how the original sound is chopped in the time-stretch algorithm. Larger values mean fewer, and longer, sequences are used. In general,
  • a larger sequence value sounds better with a lower tempo and/or pitch;
  • a smaller sequence value sounds better with a higher tempo and/or pitch.
int  seekwindow = 22 *
The length, in milliseconds, of the window in which the algorithm searches for the best possible overlap location. For larger seekwindow values, the possibility of finding a better mixing position increases, but an overly large seekwindow may cause drifting (a disturbing artifact where audio pitch seems unsteady) because neighboring sequences may be chosen at more uneven intervals.
int  overlap = 8 *
The overlap length in milliseconds. When the sound sequences are mixed back together to form a continuous sound stream again, overlap defines how much of the ends of the consecutive sequences will be overlapped. This shouldn't be a critical parameter. If you reduce the sequence by a large amount, you might wish to try a smaller overlap.
bool  quickseek = false *
Setting quickseek=true enables a 'quick' mode of the time-stretch routine that substantially speeds up the algorithm but may degrade the sound quality.
  • Try quickseek=false if you hear artifacts like warbling, clicking etc.
int  aa = 64 *
Controls the number of taps the anti-alias filter uses. Set to 0 to disable the filter. Must be a multiple of 4.
This table summarizes how these parameters can be adjusted for different applications:

| Parameter | Default value | If larger... | If smaller... | Music | Speech | CPU burden |
|---|---|---|---|---|---|---|
| sequence | Relatively large, chosen for slowing music tempo. | Usually better for slowing tempo. You might need less overlap. | Accelerates "echoing" artifact when slowing down the tempo. | Default value usually good. | A smaller value might be better. | Smaller value increases CPU burden. |
| seekwindow | Relatively large, chosen for slowing music tempo. | Eases finding a good mixing position, but may cause "drifting" artifact. | Makes finding a good mixing position more difficult. | Default usually good, unless "drifting" is a problem. | Default usually good. | Larger value increases CPU burden. |
| overlap | Relatively large, chosen to suit above parameters. | | | | | Larger value increases CPU burden. |

* sequence and seekwindow have default values of 100 and 22. However, they are adjusted automatically when the calculated tempo differs from the default value (100), which happens whenever tempo or pitch in your script is set to something other than 100. The adjustment happens in TDStretch::calcSeqParameters().
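To illustrate how the advanced parameters are passed, here is a sketch that follows the table's hint that speech may benefit from a smaller sequence. The specific values (40, 15, 8) are illustrative starting points chosen for this example, not recommended settings, and the sine tone is only a stand-in for a real speech track.

```avisynth
# Stand-in source (assumption); replace with a clip carrying Float speech audio.
video = BlankClip(length=250, fps=25)
audio = Tone(length=10.0, frequency=440.0, samplerate=48000, channels=2)
AudioDub(video, audio)

# Speed up by 30% with a shorter processing sequence and seek window.
# Setting sequence and seekwindow explicitly replaces their default values;
# quickseek stays false to favor quality over speed.
TimeStretch(tempo=130.0, sequence=40, seekwindow=15, overlap=8, quickseek=false)
```

If artifacts such as echoing or drifting appear, adjust one parameter at a time and compare by ear, since the best values depend on the material.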

Notes

  • Since tempo, rate and pitch are floating-point values, but sample rates are integers, rounding effects in calculations are unavoidable; the resulting audio track duration may be off by up to several tens of milliseconds (less than one video frame) per hour.

Pitch is also rounded for the same reason, but the amount is so small that the effect is inaudible: according to Wikipedia, the just-noticeable pitch difference is 0.1%–0.6%, while the rounding error is about 0.002%.
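One way to inspect the duration rounding described above is to compare the expected and actual audio durations on screen. The following is a rough sketch under stated assumptions: the test source is generated, and the variable names (src, out, expected, actual) are chosen only for this example.

```avisynth
# Stand-in source (assumption): 10 s of Float sine audio on a blank 25 fps clip.
src = AudioDub(BlankClip(length=250, fps=25), Tone(length=10.0))

tempo = 104.0
out = src.TimeStretch(tempo=tempo)

# Expected duration in seconds: source duration divided by the tempo factor.
expected = src.AudioLengthF() / src.AudioRate() * 100.0 / tempo
# Actual duration in seconds, as reported by the stretched clip.
actual = out.AudioLengthF() / out.AudioRate()

out.Subtitle("expected " + String(expected, "%.4f") + " s, actual " + String(actual, "%.4f") + " s")
```

Any difference between the two numbers shown is the accumulated rounding for that clip length.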

Examples

  • Raise pitch one octave, without changing speed:
TimeStretch(pitch=200.0)
# TimeStretch(pitch_n=2, pitch_d=1) # more accurate processing
  • Raise pitch one semi-tone, without changing speed:
delta_pitch=1.0 ## (semitones)
TimeStretch(pitch=100.0*pow(2.0, delta_pitch/12.0)) 
  • Raise playback tempo from NTSC Film speed (23.976 fps) to PAL speed (25 fps) without changing pitch:
TimeStretch(tempo=100.0*25.0/(24000.0/1001.0))
  • Increase speed to 105%, allowing pitch to rise:
TimeStretch(rate=105)

...which is equivalent to:

ar=AudioRate 
AssumeSampleRate(Round(ar*1.05))
ResampleAudio(ar)
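The semitone example above can also be wrapped in a small user-defined function. This is only a sketch: PitchShiftSemitones is a hypothetical helper name introduced here, not a built-in filter, and the test source is generated so the script is self-contained.

```avisynth
# Hypothetical helper: shift pitch by a number of semitones without changing speed.
# One semitone corresponds to a frequency factor of 2^(1/12).
function PitchShiftSemitones(clip c, float semitones)
{
    return c.TimeStretch(pitch=100.0 * Pow(2.0, semitones / 12.0))
}

# Stand-in source (assumption): Float sine audio on a blank clip.
AudioDub(BlankClip(length=250, fps=25), Tone(length=10.0, frequency=440.0))

# Lower the pitch by two semitones; pass 12.0 for a full octave up.
PitchShiftSemitones(-2.0)
```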

Credits

TimeStretch uses the SoundTouch Audio Processing Library

Copyright © Olli Parviainen
SoundTouch home page: surina.net/soundtouch

Changelog

v2.61 Updated SoundTouch library to 1.9.2. Fixes multichannel issues.
Add TimeStretch overload with rational pair arguments.
v2.57 Expose soundtouch parameters
v2.55 Initial Release (based on SoundTouch library 1.4.0?)