TimeStretch

From Avisynth wiki

Revision as of 23:40, 21 June 2016

Change the audio speed and/or pitch:

tempo adjusts speed while maintaining pitch.
rate adjusts speed while allowing pitch to rise or fall.
pitch adjusts pitch while maintaining speed.

You can use these parameters in any combination – for example, 104% tempo with 95% pitch.

Syntax and Parameters

TimeStretch(clip clip [, float tempo, float rate, float pitch,
      int sequence, int seekwindow, int overlap, bool quickseek, int aa ])

TimeStretch(clip clip [, int tempo_n, int tempo_d, int rate_n, int rate_d, int pitch_n, int pitch_d,
      int sequence, int seekwindow, int overlap, bool quickseek, int aa ])

clip  clip =
Source clip. Incoming audio sample type is automatically converted to Float.
If clip.AudioChannels=2, special processing is used to preserve stereo imaging. Otherwise, channels are processed independently. Independent processing works well for unrelated audio tracks but poorly for surround sound; on versions earlier than AviSynth v2.61, try TimeStretchPlugin instead. [1] [2]
Tempo, Rate and Pitch
float  tempo = 100.0
Changes speed while maintaining the original pitch.
If tempo=200, the audio will play twice (200%) as fast; if tempo=50, the audio will play half (50%) as fast.
The effect is also known as time-stretching.
float  rate = 100.0
Changes speed while allowing pitch to rise or fall, like the traditional analog vari-speed effect.
If rate=200, the audio will play twice (200%) as fast; if rate=50, the audio will play half (50%) as fast.
Rate control is implemented purely by sample rate transposing.[3]
If rate is adjusted by itself, no time-stretching or pitch-shifting is performed, and the Advanced Parameters will have no effect.
float  pitch = 100.0
Changes pitch while maintaining the original speed (within a small tolerance; see Notes below).
If pitch=200, the audio will sound an octave higher; if pitch=50, the audio will sound an octave lower.
The effect is also known as pitch-shifting.
Tempo, rate and pitch can all be adjusted independently, in which case their effects are combined.
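The interaction of the three controls can be sketched numerically. The Python snippet below is only an illustration: it assumes the percentages multiply (tempo and rate both scale speed, rate and pitch both scale pitch), which this page does not state explicitly, and the function name net_effect is hypothetical rather than part of AviSynth.

```python
def net_effect(tempo=100.0, rate=100.0, pitch=100.0):
    """Return (speed_percent, pitch_percent) for a combination of settings.

    Assumption (not confirmed by this page): the factors multiply.
    tempo and rate both change speed; rate and pitch both change pitch.
    """
    speed = tempo * rate / 100.0
    net_pitch = rate * pitch / 100.0
    return speed, net_pitch


# 104% tempo with 95% pitch: faster playback, lower pitch.
print(net_effect(tempo=104, pitch=95))
# rate on its own moves speed and pitch together.
print(net_effect(rate=105))
```

Under that assumption, the output duration is the input duration divided by speed/100.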
Tempo_n, Tempo_d, Rate_n, Rate_d, Pitch_n and Pitch_d
When more accuracy is needed, use the rational pair parameters Tempo_n, Tempo_d, Rate_n, Rate_d, Pitch_n and Pitch_d instead. All of them are integers with a default value of 1. Internally, Tempo is calculated as double(Tempo_n/Tempo_d) (Rate and Pitch likewise) before further processing. Seeking should be sample exact.
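A rational pair for a frame-rate conversion can be worked out with exact fractions. The Python sketch below assumes the pair expresses a plain ratio in which 1/1 leaves the audio unchanged (as the defaults suggest); the helper name speed_ratio is hypothetical.

```python
from fractions import Fraction


def speed_ratio(src_fps, dst_fps):
    """Exact dst/src speed ratio in lowest terms, as (numerator, denominator)."""
    r = Fraction(dst_fps) / Fraction(src_fps)
    return r.numerator, r.denominator


# NTSC film rate is exactly 24000/1001 fps; the target is PAL at 25 fps.
ntsc_film = Fraction(24000, 1001)
n, d = speed_ratio(ntsc_film, 25)
print(n, d)            # candidate values for tempo_n and tempo_d
print(100.0 * n / d)   # the same change expressed as a float percentage
```

If the overload accepts named arguments as the syntax line suggests, these values would be passed as TimeStretch(tempo_n=1001, tempo_d=960).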
Advanced Parameters
The time-stretch algorithm has a few parameters that can be tuned to optimize sound quality for certain applications. The current default parameters have been chosen by iterative if-then analysis (read: "trial and error") to obtain the best subjective sound quality in pop/rock music processing, but in applications that process different kinds of sound, the default parameter set may give a sub-optimal result.
int  sequence = 82 *
This is the length of a single processing sequence in milliseconds, which determines how the original sound is chopped in the time-stretch algorithm. Larger values mean fewer, and longer, sequences are used. In general,
  • a larger sequence value sounds better with a lower tempo and/or pitch;
  • a smaller sequence value sounds better with a higher tempo and/or pitch.
int  seekwindow = 28 *
The length in milliseconds for the algorithm that searches for the best possible overlap location. For larger seekwindow values, the possibility of finding a better mixing position increases, but an overly large seekwindow may cause drifting (a disturbing artifact where audio pitch seems unsteady) because neighboring sequences may be chosen at more uneven intervals.
int  overlap = 12 *
The overlap length in milliseconds. When the sound sequences are mixed back together to form a continuous sound stream again, overlap defines how much of the ends of the consecutive sequences will be overlapped. This shouldn't be a critical parameter. If you reduce the sequence by a large amount, you might wish to try a smaller overlap.
bool  quickseek = true? *
The time-stretch routine has a 'quick' mode that substantially speeds up the algorithm but may degrade the sound quality.
  • Try quickseek=false if you hear artifacts like warbling, clicking etc.
int  aa = ? *
Controls the number of taps the anti-alias filter uses. Set to 0 to disable the filter. Must be a multiple of 4.
This table summarizes how these parameters can be adjusted for different applications:
Parameter | Default value | If larger... | If smaller... | Music | Speech | CPU burden
sequence | Relatively large, chosen for slowing music tempo. | Usually better for slowing tempo. You might need less overlap. | Accelerates "echoing" artifact when slowing down the tempo. | Default value usually good. | A smaller value might be better. | Smaller value increases CPU burden.
seekwindow | Relatively large, chosen for slowing music tempo. | Eases finding a good mixing position, but may cause "drifting" artifact. | Makes finding a good mixing position more difficult. | Default usually good, unless "drifting" is a problem. | Default usually good. | Larger value increases CPU burden.
overlap | Relatively large, chosen to suit above parameters. | | | | | Larger value increases CPU burden.
* TODO: default values were under investigation at the time this documentation was updated.


Notes

  • Since tempo, rate and pitch are floating-point values, but sample rates are integers, rounding effects in calculations are unavoidable; the resulting audio track duration may be off by up to several tens of milliseconds (less than one video frame) per hour.

Pitch is also rounded for the same reason, but the amount is so small that the effect is inaudible: according to Wikipedia, the just-noticeable pitch difference is 0.1%–0.6%, while the rounding error is about 0.002%.
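To get a feel for these magnitudes, the Python sketch below assumes the error comes from rounding a sample rate to the nearest integer. This is a simplification for illustration only; the actual rounding inside the filter may differ, and the function name is hypothetical.

```python
def worst_case_rounding(sample_rate):
    """Worst-case relative error when a rate near sample_rate is rounded
    to the nearest integer (off by at most half a sample per second)."""
    return 0.5 / sample_rate


for sr in (44100, 48000):
    err = worst_case_rounding(sr)
    # Accumulated timing drift over one hour, in milliseconds.
    drift_ms_per_hour = err * 3600 * 1000
    print(sr, f"{err * 100:.4f}%", f"{drift_ms_per_hour:.1f} ms/hour")
```

Under this assumption the relative error is on the order of a thousandth of a percent, far below the 0.1% lower bound quoted for the just-noticeable pitch difference, and the drift is a few tens of milliseconds per hour.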

Examples

  • Raise pitch one octave, without changing speed:
TimeStretch(pitch=200) 
  • Raise pitch one semitone, without changing speed:
delta_pitch=1.0 ## (semitones)
TimeStretch(pitch=100.0*pow(2.0, delta_pitch/12.0)) 
  • Raise playback tempo from NTSC Film speed (23.976 fps) to PAL speed (25 fps) without changing pitch:
TimeStretch(tempo=100.0*25.0/(24000.0/1001.0))
  • Increase speed to 105%, allowing pitch to rise:
TimeStretch(rate=105)

...which is equivalent to:

ar=AudioRate 
AssumeSampleRate(Round(ar*1.05))
ResampleAudio(ar)
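The percentage values used in the examples above can be checked outside AviSynth. The following Python sketch only mirrors the arithmetic of the script expressions; the helper names are hypothetical and nothing here calls AviSynth itself.

```python
def semitones_to_percent(semitones):
    """Pitch percentage for a shift of the given number of semitones
    (12 semitones per octave, each octave doubling the frequency)."""
    return 100.0 * 2.0 ** (semitones / 12.0)


def fps_change_to_percent(src_fps, dst_fps):
    """Tempo percentage needed to go from src_fps to dst_fps."""
    return 100.0 * dst_fps / src_fps


print(round(semitones_to_percent(1), 3))    # one semitone up
print(round(semitones_to_percent(12), 3))   # one octave up
print(round(fps_change_to_percent(24000 / 1001, 25), 3))  # NTSC film to PAL
```

A negative argument to semitones_to_percent gives the percentage for lowering the pitch by the same interval.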

Credits

TimeStretch uses the SoundTouch Audio Processing Library

Copyright © Olli Parviainen
SoundTouch home page: surina.net/soundtouch

Changelog

v2.61 Updated SoundTouch library to 1.9.2. Fixes multichannel issues.
      Added TimeStretch overload with rational pair arguments.
v2.57 Expose soundtouch parameters
v2.55 Initial Release (based on SoundTouch library 1.4.0?)