MT

From Avisynth wiki
Abstract
Author tsp
Version 0.7
Download MT 0.7 (includes modified Avisynth 2.5.7.5)
Category Meta-Filters
Requirements
  • None
License GPL v2
Discussion

Note: Not maintained anymore; use Avisynth 2.6 builds instead.

MT 0.7

Abstract

MT is a filter that enables other filters to be run multithreaded. This should hopefully speed up processing on hyperthreaded/multicore processors or multiprocessor systems.

Always remember to judge the result by looking at the speed improvement - not the CPU utilization.

Technical info

MT is a filter that splits a frame into smaller fragments that are processed in individual threads, allowing full utilization of multiprocessor or hyper-threading-enabled computers. I tested it on my old Abit BP6 with 2x Celeron 400 MHz and it increased the speed by 40%. Note that if you are already getting 100% CPU utilization when processing Avisynth scripts (e.g. if you're encoding to DivX/XviD), you don't need to use this filter.

The filter works like this Avisynth function:

function PseudoMT(clip c,string filter)
{
a=eval("c.crop(0,0,c.width/2,c.height)."+filter)
b=eval("c.crop(c.width/2,0,c.width/2,c.height)."+filter)
stackhorizontal(a,b)
}


The only difference is that a and b are executed in parallel and it is possible to split the frame into more than 2 pieces. If the filter works with the above script, it should work with MT if the filter is thread safe. Dust does not work with the above script, so if you want to use iiP, use another denoiser or get Steady to fix the bug.
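The split-and-stitch idea above can also be sketched outside Avisynth. The following C++ is only an illustration under stated assumptions: `Frame`, `StripFilter`, `process_in_strips` and `invert_columns` are hypothetical names, not part of MT or the Avisynth API, and the real filter may divide the frame differently.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <thread>
#include <vector>

// A frame stored as 8-bit samples in row-major order (hypothetical type,
// not an Avisynth structure).
struct Frame {
    int width;
    int height;
    std::vector<std::uint8_t> pixels;  // width * height samples
};

// Applies a filter to the columns [x0, x1) of a frame, in place.
using StripFilter = std::function<void(Frame&, int x0, int x1)>;

// Splits the frame into `pieces` column ranges and filters each range in its
// own thread, analogous to crop + filter + stackhorizontal in PseudoMT.
void process_in_strips(Frame& frame, int pieces, const StripFilter& filter) {
    assert(pieces >= 1 && pieces <= frame.width);
    std::vector<std::thread> workers;
    for (int i = 0; i < pieces; ++i) {
        int x0 = frame.width * i / pieces;
        int x1 = frame.width * (i + 1) / pieces;
        // Each thread writes a disjoint set of columns, so no locking is needed.
        workers.emplace_back([&frame, &filter, x0, x1] { filter(frame, x0, x1); });
    }
    for (std::thread& w : workers) w.join();
}

// Example per-pixel filter: inverts every sample in the given columns.
void invert_columns(Frame& frame, int x0, int x1) {
    for (int y = 0; y < frame.height; ++y)
        for (int x = x0; x < x1; ++x) {
            std::uint8_t& p = frame.pixels[y * frame.width + x];
            p = static_cast<std::uint8_t>(255 - p);
        }
}
```

A per-pixel filter such as `invert_columns` gives the same result whether it runs on the whole frame or on the pieces, which is the property a filter needs in order to work with this kind of splitting.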

Limitations

The filter to be run must accept only one input clip, which is taken from the special variable last. The filter should also not rely on the content of the whole frame (as smart deinterlacers do); otherwise there is a risk that only part of the frame will be processed. Finally, the filter should be thread safe. Most filters are thread safe, but some will produce a wrong result or crash.

Installation

Copy mt.dll into the Avisynth plugin directory and copy the included avisynth.dll into your System32 directory (SysWOW64 on 64-bit Windows) or where avisynth.dll is located. Remember to back up the old avisynth.dll (rename it or something) if you don't have version 2.6 installed.

From version 0.7 two other filters are included too:

  • MTi creates two threads and lets each thread process one field, combining them like this Avisynth function:
function PseudoMTi(clip c,string filter)
{
a=eval("c.AssumeFieldBased().SeparateFields.selecteven()."+filter)
b=eval("c.AssumeFieldBased().SeparateFields.selectodd()."+filter)
interleave(a,b).weave()
}
As in the previous example, a and b are executed in parallel. Note that only two threads are created so it will only use two (virtual) cores.
  • MTsource is used to run source filters multithreaded. It works like this:
function PseudoMTsource(string filter)
{
SetMTmode(2)
eval(filter)
SetMtmode(0)
}
Unlike the two other filters, MTsource is a temporal filter that fetches frames ahead of time and stores them in the cache for fast retrieval.
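The field handling that PseudoMTi describes (separate the fields, filter each one, weave them back) can be sketched in C++ as follows. This is a rough model, not MTi's implementation: rows are plain vectors of ints, and `select_field`, `weave` and `process_fields` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

using Row = std::vector<int>;
using Rows = std::vector<Row>;

// Returns every second row, starting at `first` (0 = even field, 1 = odd field).
Rows select_field(const Rows& frame, std::size_t first) {
    Rows field;
    for (std::size_t y = first; y < frame.size(); y += 2) field.push_back(frame[y]);
    return field;
}

// Interleaves two fields of equal height back into one frame.
Rows weave(const Rows& even, const Rows& odd) {
    assert(even.size() == odd.size());
    Rows frame;
    for (std::size_t i = 0; i < even.size(); ++i) {
        frame.push_back(even[i]);
        frame.push_back(odd[i]);
    }
    return frame;
}

// Filters the two fields in two threads, then weaves the results.
Rows process_fields(const Rows& frame, const std::function<void(Rows&)>& filter) {
    assert(frame.size() % 2 == 0);
    Rows even = select_field(frame, 0);
    Rows odd = select_field(frame, 1);
    std::thread t_even([&] { filter(even); });
    std::thread t_odd([&] { filter(odd); });
    t_even.join();
    t_odd.join();
    return weave(even, odd);
}
```

Because each thread owns one field, the filter only ever sees half the frame height, which is why a filter without interlaced support can still be applied this way.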

Syntax

MT

MT(clip clip, string filter, int threads, int overlap, bool splitvertical)
clip  clip = last
input clip
string  filter = (no default)
filter to run multithreaded. Note that the filter must not change both the frame height and width (but colorspace is okay) and that only 1 input clip is allowed. It can be any built-in filter, Avisynth defined filter or external plugin filter as long as the restrictions are observed.
int  threads = 2
number of threads to run. Set this to the number of threads your computer is able to run concurrently.
int  overlap = 0
number of pixels to add at the top and bottom border or left and right border. Increase this if you see artifacts where the frame is split.
bool  splitvertical = false
if true the frames are cut vertically (and the filter is allowed to change the height) else it is cut horizontally (and the filter is allowed to change the width).
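To make the threads and overlap parameters more concrete, here is a small C++ sketch of how a frame dimension could be divided into fragments with extra border pixels. It is an assumption-laden illustration: `Strip` and `make_strips` are hypothetical, and how MT actually distributes remainders or clamps the overlap at the frame edge is not documented here.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical description of one fragment along the split axis.
struct Strip {
    int core_begin;    // first row/column this fragment is responsible for
    int core_end;      // one past the last row/column it is responsible for
    int padded_begin;  // core_begin minus overlap, clamped to the frame
    int padded_end;    // core_end plus overlap, clamped to the frame
};

// Divides `size` rows or columns into `threads` fragments, each widened by
// `overlap` pixels on both sides where the frame allows it.
std::vector<Strip> make_strips(int size, int threads, int overlap) {
    assert(size > 0 && threads >= 1 && threads <= size && overlap >= 0);
    std::vector<Strip> strips;
    for (int i = 0; i < threads; ++i) {
        Strip s;
        s.core_begin = size * i / threads;
        s.core_end = size * (i + 1) / threads;
        s.padded_begin = std::max(0, s.core_begin - overlap);
        s.padded_end = std::min(size, s.core_end + overlap);
        strips.push_back(s);
    }
    return strips;
}
```

In this model the filter runs on the padded range and only the core range is kept, so a filter that reads a few neighbouring pixels sees real image data at the seam instead of a hard edge.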

MTi

MTi(clip clip, string filter)
clip  clip = last
input clip. Must be mod2 height for RGB and YUY2 color-spaces and mod4 height for YV12 colorspace
string  filter = (no default)
filter to run multithreaded. Note that the filter is allowed to change both width and height at the same time but only 1 input clip is allowed. It can be any built-in filter, Avisynth defined filter or external plugin filter as long as the restrictions are observed.

MTsource

MTsource(string filter, int delta, int threads, int max_fetch)
string  filter = (no default)
source filter to run multithreaded. Currently only internal and external source filters are supported (like DirectShowSource, AviSource, MPEG2Source). You can use an Avisynth defined filter or a non-source filter but it might crash or produce frame corruption.
int  delta = 1
this is how many frames there are between each frame request, so if you are only going to read every second frame set it to 2 or if you are reading the frames backwards set it to -1.
More complex frame access patterns like SelectEvery(10,3,6,7) are not supported (but might work anyway, since the requested frames will be in the cache; some memory will just be wasted on cached frames that are never requested).
int  threads = 2
number of threads to run. Set this to the number of threads your computer is able to run concurrently.
int  max_fetch = 30
This is the maximum number of frames ahead of the currently requested frame that MTsource will fetch. Setting it too low will leave the threads idle most of the time, and setting it too high will waste memory.
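The interaction between delta and max_fetch can be sketched as a small C++ function that lists the frames a prefetcher might request. This is a guess at the behaviour, not MTsource's code: `frames_to_prefetch` is a hypothetical name, and it assumes max_fetch counts prefetch requests (it could equally be a distance in frames).

```cpp
#include <cassert>
#include <vector>

// Lists the frame numbers a prefetcher might request after frame `current`,
// stepping by `delta` and stopping at either end of the clip.
std::vector<int> frames_to_prefetch(int current, int delta, int max_fetch, int num_frames) {
    assert(delta != 0 && max_fetch >= 0 && num_frames > 0);
    std::vector<int> frames;
    for (int i = 1; i <= max_fetch; ++i) {
        int n = current + i * delta;
        if (n < 0 || n >= num_frames) break;  // ran off either end of the clip
        frames.push_back(n);
    }
    return frames;
}
```

With delta=2 the odd frames are never fetched, and with delta=-1 the list runs backwards, which matches the reverse() and selectevery() use cases described for the parameter.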


Examples

Ordinary blur:

MT("blur(1)",2,2)

A user-defined function also works (this one uses variableblur):

MT("unsharpen(2,0.7)",2,2)

function unsharpen(clip c,float variance,float k)
{
blr=binomialBlur(c,vary=variance,varc=2,Y=3,U=2,V=2)
return yv12lutxy(blr,c,"y x - "+string(k)+" * y +",y=3,u=2,v=2)
}

This one will not produce the intended result but shows how to use the triple quotes:

MT(""" subtitle("Doh") """,4,0)

Example of MTi

MTi("fft3dfilter()")

produces nearly the same result as

MT("fft3dfilter(interlaced=true)",threads=2)

but for filters that don't natively support interlaced content, it can be easier to use MTi()

Example of MTsource()

ir=MTSource(""" imagereader("c:\test.png") """,delta=1,threads=2,max_fetch=10)
as=MTSource(""" avisource("c:\test.avi") """,delta=-1) #delta negative due to reverse()
ms=MTSource(""" MPEG2Source("c:\test.d2v") """,delta=9) #delta is 9 due to selectevery(9,1)
stackhorizontal(ir.trim(0,100),as.reverse().trim(0,100),ms.selectevery(9,1).trim(0,100))


Changes in MT 2.5.7.5

Abstract

The modified Avisynth (Avisynth MT 2.5.7.5) contains the new functions SetMTMode() and GetMTMode() and is required by MT.dll. Install it by overwriting avisynth.dll in your System32 directory (SysWOW64 on 64-bit Windows). Remember to back up your current avisynth.dll before installing the new one.

Technical info

These functions enable Avisynth to use more than one thread when processing filters. This is useful if you have more than one CPU/core or hyper-threading. This feature is still experimental.

Syntax

GetMTMode

GetMTMode(bool threads)
bool  threads = false
if true GetMTMode returns the number of threads used else the current mode is returned (see below).

SetMTmode

SetMTmode(int mode,int threads)
Place this on the first line of the Avisynth script to enable temporal multithreading (that is, more than one frame is processed at the same time). Use it later in the script to change the mode for the filters below it.
int  mode = 2
there are 6 modes:
  • Mode 1 is the fastest but only works with a few filters
  • Mode 2 should work with most filters but uses more memory
  • Mode 3 should work with some of the filters that don't work with mode 2 but is slower
  • Mode 4 is a combination of mode 2 and 3 and should work with even more filters but is both slower and uses more memory
  • Mode 5 is slowest (slower than not using SetMTMode) but should work with all filters that don't require linear frameserving (that is, the frames come in order (frame 0,1,2 ... last)).
  • Mode 6 is a modified mode 5 that might be slightly faster
A more detailed explanation of the modes 1 and 2 can be read here: MT modes explained
int  threads = 0
number of threads to use. Set to 0 to set it to the number of processors available. It is not possible to change the number of threads other than in the first SetMTMode.

Example

SetMTMode(2,0) #enables multithreading in mode 2 with the number of threads equal to the number of available processors
LoadPlugin("...\LoadPluginEX.dll") #needed to load Avisynth 2.0 plugins
LoadPlugin("...\DustV5.dll") #Loads Pixiedust
import("limitedsharpen.avs")
src=AVIsource("test.avi")
SetMTMode(5) #change the mode to 5 for the lines below
src=src.converttoyuy2().PixieDust()#Pixiedust needs mode 5 to function.
SetMTMode(2) #change the mode back to 2
src.LimitedSharpen() #because LimitedSharpen works well with mode 2
subtitle("Number of threads used: "+string(GetMTMode(true))+" Current MT Mode: "+string(GetMTMode())) #display mode and number of threads in use

How to develop thread-safe filters

Filter construction and destruction is single threaded. Only calls to GetFrame are multithreaded. No linear frame order is assured (unless MT is used instead of SetMTMode) and for each mode there are different restrictions:

  • Mode 1: all access to class variables, global variables and static variables must be thread-safe by using appropriate locking (Enter/LeaveCriticalSection etc, no locking needed for read-only variables) because more than 1 thread may access a class instance at a time.
  • Mode 2: access to class variables doesn't have to be thread-safe because there is only 1 instance of the class per thread. All global/static variable access must be thread-safe. Because each class instance only processes every other frame, an internal cache (that is, a cache inside the filter) won't work well. I have created PClipLocalStorage to share a pointer between different filter instances.
  • Mode 3: Only 1 thread is allowed to execute code from the filter at the same time. When child->GetFrame is called, another thread can enter the filter and execute code. That means that class variables/global variables/static variables shouldn't be assigned any values before the last child->GetFrame has been called. Instead, local function variables should be used like this:
PVideoFrame __stdcall AdjustFocusV::GetFrame(int n, IScriptEnvironment* env) {
  // Assigned to a local variable so this will work in mode 3
  PVideoFrame frame = child->GetFrame(n, env);
  env->MakeWritable(&frame);
  if (!line) line = new uc[frame->GetRowSize()+32];
  uc* linea = (uc*)(((int)line+15) & -16); // Align 16
  uc* buf = frame->GetWritePtr();
  int pitch = frame->GetPitch();
  int row_size = vi.RowSize();
  int height = vi.height;
  memcpy(linea, buf, row_size); // First row - map centre as upper
  if ((pitch >= ((row_size+7) & -8)) && (env->GetCPUFlags() & CPUF_MMX))
  {
    AFV_MMX(linea, buf, height, pitch, row_size, amount);
  }
  else
  {
    AFV_C(linea, buf, height, pitch, row_size, amount);
  }
  return frame;
}
But not like this:
PVideoFrame TemporalSoften::GetFrame(int n, IScriptEnvironment* env) {
  __int64 i64_thresholds = 0x1000010000100001i64;
  int radius = (kernel-1) / 2;
  int c = 0;
  // Just skip if silly settings
  if ((!luma_threshold) && (!chroma_threshold) || (!radius))
    return child->GetFrame(n, env);
  for (int p = 0; p < 16; p++) planeDisabled[p] = false;
  for (p = n-radius; p <= n+radius; p++)
  {
    frames[p+radius-n] = child->GetFrame(min(vi.num_frames-1, max(p, 0)), env);
    // GetFrame assigned to class variable frames. This wouldn't work with mode 3
    // because the next thread that enters this GetFrame will overwrite the result
    // from the last thread
  }
  // do stuff
}
When using mode 3 there is no need for thread-safe access to class variables. Because there is only 1 instance of the class that processes all frames, internal caches will work much better. The drawback is that only 1 thread can execute the filter at a time, so if it's the only slow filter in the script the speed increase won't be that big.
  • Mode 4: a combination of mode 2 and 3, so it's okay to assign class variables before the last child->GetFrame has been called because there is a class instance per thread, but the problem with internal caches is the same as mode 2
  • Mode 5: No restrictions.
  • Mode 6: A slightly modified version of mode 5 that might be a little faster.
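The mode 1 requirement (lock every access to shared class variables, keep per-call state in locals) can be illustrated with portable C++. This sketch swaps the Win32 Enter/LeaveCriticalSection calls for `std::mutex`, and `CountingFilter` and `hammer` are hypothetical names that do not derive from any Avisynth class.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical filter-like class; not derived from any Avisynth base class.
class CountingFilter {
public:
    // Stand-in for GetFrame: may be called from several threads at once.
    int GetFrame(int n) {
        int result = n * 2;  // local variable: private to the calling thread
        {
            std::lock_guard<std::mutex> lock(mutex_);  // guards the member below
            ++frames_served_;
        }
        return result;
    }
    int frames_served() {
        std::lock_guard<std::mutex> lock(mutex_);
        return frames_served_;
    }
private:
    std::mutex mutex_;
    int frames_served_ = 0;  // shared class variable, written by every call
};

// Calls GetFrame `calls_per_thread` times from each of `threads` threads and
// returns the total number of frames served so far.
int hammer(CountingFilter& filter, int threads, int calls_per_thread) {
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t)
        workers.emplace_back([&filter, calls_per_thread] {
            for (int n = 0; n < calls_per_thread; ++n) filter.GetFrame(n);
        });
    for (std::thread& w : workers) w.join();
    return filter.frames_served();
}
```

Without the lock the counter could lose increments when two threads update it at once. The lock is held only around the shared member, so the per-frame work on local variables still runs in parallel.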

PClipLocalStorage

Here is an example of how PClipLocalStorage can be used to share a cache between multiple instances (which are created in modes 2 and 4):

class Cache
{
public:
 //These functions should be thread-safe. The simplest way is to use
 //a critical section like this
 PVideoFrame GetCachedFrame(int framenumber)
 {
  EnterCriticalSection(&cs);
  //Code
  //...
  LeaveCriticalSection(&cs);
  return retval;
 }
 void SetCachedFrame(PVideoFrame frame);
private:
 CRITICAL_SECTION cs;
};

class Sample : public GenericVideoFilter{
public:
 Sample(PClip _child, IScriptEnvironment* env);
 ~Sample();
 PVideoFrame __stdcall GetFrame(int n, IScriptEnvironment* env);
protected: 
 PClipLocalStorage cls;
 Cache* FrameCache;
};

Sample::Sample(PClip _child, IScriptEnvironment* env)
: GenericVideoFilter(_child), cls(env)
{
 //if the cache has not been created yet GetValue will return 0
 if(cls->GetValue()==0) {
  //create the cache and save the address in the PClipLocalStorage
  FrameCache = new Cache();
  cls->SetValue(static_cast<void*>(FrameCache));
 }
 //the cache has been created so assign the address to FrameCache
 else {
  FrameCache = static_cast<Cache*>(cls->GetValue());
 }
}

Sample::~Sample()
{
//only delete FrameCache if it has not been deleted yet.
if(cls->GetValue()!=0)  {
 delete FrameCache;
 cls->SetValue(0);//Signal that the cache is deleted
 }
}
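
The same sharing pattern can be modelled in portable C++ without the Avisynth headers. This is a sketch under assumptions, not the PClipLocalStorage API: `SharedSlot` is a hypothetical stand-in for the shared pointer slot, frames are plain ints, and `std::shared_ptr` replaces the manual delete so the cache lives until the last instance is gone.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <mutex>

// Thread-safe frame cache; frames are represented by ints for brevity.
class FrameCache {
public:
    void Set(int n, int frame) {
        std::lock_guard<std::mutex> lock(mutex_);
        frames_[n] = frame;
    }
    // Returns true and fills `frame` if frame n is cached.
    bool Get(int n, int& frame) {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = frames_.find(n);
        if (it == frames_.end()) return false;
        frame = it->second;
        return true;
    }
private:
    std::mutex mutex_;
    std::map<int, int> frames_;
};

// Hypothetical stand-in for the shared pointer slot that PClipLocalStorage provides.
struct SharedSlot {
    std::shared_ptr<FrameCache> cache;
};

class SampleInstance {
public:
    explicit SampleInstance(SharedSlot& slot) {
        // Filter construction is single threaded, so the slot needs no lock here.
        if (!slot.cache) slot.cache = std::make_shared<FrameCache>();
        cache_ = slot.cache;
    }
    FrameCache& cache() { return *cache_; }
private:
    std::shared_ptr<FrameCache> cache_;  // keeps the cache alive for this instance
};
```

The first instance creates the cache and every later instance picks up the same object, so a frame stored by one per-thread instance can be read by another.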

Changelog

  • 0.1 - first release.
  • 0.2 - Should be more thread safe.
  • 0.21 - forgot to comment out a Sleep(0)
  • 0.25 - Added the splitvertical option
  • 0.3 - More stable(and slower)
  • 0.4 - Includes a custom version of Avisynth 2.56 beta that should speed things up
  • 0.41 - Minor speed increase
  • 0.5 - Requires the included modified Avisynth 2.5.6 or Avisynth 2.6
  • 0.6 - Bugfix: height can be changed with splitvertical=true without crashing.
Also includes modified Avisynth MT 2.5.7.3
  • 0.7 - two new filters: MTi, MTsource and Avisynth MT 2.5.7.5
  • Avisynth MT 2.5.7.3 - two new functions SetMTMode, GetMTMode

Links

  • Please read this page and the support page below before asking for help.