MT
Latest revision as of 13:35, 23 April 2017
Abstract | |
---|---|
Author | tsp |
Version | 0.7 |
Download | MT 0.7 (includes modified Avisynth 2.5.7.5): http://www.avisynth.org/tsp/MT_07.zip |
Category | Meta-Filters |
Requirements | None |
License | GPL v2 |
Note: Not maintained anymore; use Avisynth 2.6 builds instead.
MT 0.7
Abstract
MT is a filter that enables other filters to be run multithreaded. This should hopefully speed up processing on hyperthreaded/multicore processors or multiprocessor systems.
Always remember to judge the result by looking at the speed improvement - not the CPU utilization.
Technical info
MT is a filter that splits a frame into smaller fragments that are processed in individual threads, allowing full utilization of multiprocessor or hyper-threading enabled computers. I tested it on my old Abit BP6 with 2x Celeron 400 MHz and it increased the speed by 40%. Note that if you are already getting 100% CPU utilization when processing Avisynth scripts (e.g. if you're encoding to DivX/XviD) you don't need to use this filter.
The filter works like this Avisynth function:
    function PseudoMT(clip c, string filter)
    {
        # left and right halves of the input clip c
        a = eval("c.crop(0,0,c.width/2,c.height)." + filter)
        b = eval("c.crop(c.width/2,0,c.width/2,c.height)." + filter)
        stackhorizontal(a, b)
    }
The only difference is that a and b are executed in parallel and it is possible to split the frame into more than 2 pieces. If the filter works with the above script, it should work with MT if the filter is thread safe. Dust does not work with the above script, so if you want to use iiP, use another denoiser or get Steady to fix the bug.
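To make the "a and b are executed in parallel" part concrete, here is a minimal C++ sketch of the same idea outside Avisynth. It is not MT's actual code: the `Plane` struct, the `ColumnFilter` signature and the `invertColumns` filter are made-up stand-ins, and `std::thread` is used only because it is portable.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical stand-in for a video frame: 8-bit samples stored row by row.
struct Plane {
    int width;
    int height;
    std::vector<std::uint8_t> data;  // width * height samples
};

// Hypothetical filter signature: modifies columns [x0, x1) of the plane in place.
using ColumnFilter = std::function<void(Plane&, int x0, int x1)>;

// Runs the filter on the left and right halves at the same time, which is
// the role a and b play in PseudoMT. No stacking step is needed here
// because both threads write into disjoint columns of the same buffer.
void processInHalves(Plane& p, const ColumnFilter& filter) {
    const int mid = p.width / 2;
    std::thread left([&] { filter(p, 0, mid); });
    std::thread right([&] { filter(p, mid, p.width); });
    left.join();
    right.join();
}

// Example filter: inverts every sample in the given column range.
void invertColumns(Plane& p, int x0, int x1) {
    for (int y = 0; y < p.height; ++y) {
        for (int x = x0; x < x1; ++x) {
            std::uint8_t& s = p.data[static_cast<std::size_t>(y) * p.width + x];
            s = static_cast<std::uint8_t>(255 - s);
        }
    }
}
```

A filter like `invertColumns` only looks at the pixel it is writing, so splitting is harmless. A filter that reads neighbouring pixels would see a hard edge at the split, which is what the overlap parameter is for.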
Limitations
The filter to be run must accept only one input clip and that is taken from the special variable last. Also the filter should not rely on the content of the whole frame (like smart deinterlacers) else there is a risk that only part of the frame will be processed. The filter should also be thread safe. Most filters are thread safe but some will produce a wrong result or crash.
Installation
Copy mt.dll into the Avisynth plugin directory and copy the included avisynth.dll into your System32 directory (SysWOW64 on 64-bit Windows) or where avisynth.dll is located. Remember to back up the old avisynth.dll (rename it or something) if you don't have version 2.6 installed.
From version 0.7 two other filters are included too:
- MTi creates two threads and lets each thread process one field, combining them like this Avisynth function:
    function PseudoMTi(clip c, string filter)
    {
        a = eval("c.AssumeFieldBased().SeparateFields.selecteven()." + filter)
        b = eval("c.AssumeFieldBased().SeparateFields.selectodd()." + filter)
        interleave(a, b).weave()
    }
- As in the previous example, a and b are executed in parallel. Note that only two threads are created so it will only use two (virtual) cores.
- MTsource is used to run source filters multithreaded. It works like this:
    function PseudoMTsource(string filter)
    {
        SetMTMode(2)
        eval(filter)
        SetMTMode(0)
    }
- Unlike the two other filters, MTsource is a temporal filter that fetches frames ahead of time and stores them in the cache for fast retrieval.
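The field handling that PseudoMTi describes (separate the fields, filter each one, weave them back) can be sketched in plain C++ like this. The `Image` type and both helper functions are hypothetical and only show the row bookkeeping; in MTi each field would be handed to its own thread between the two steps.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using Row = std::vector<std::uint8_t>;
using Image = std::vector<Row>;  // rows from top to bottom

// Returns the rows with the given parity (0 = even rows, 1 = odd rows).
Image separateField(const Image& frame, int parity) {
    Image field;
    for (std::size_t y = static_cast<std::size_t>(parity); y < frame.size(); y += 2) {
        field.push_back(frame[y]);
    }
    return field;
}

// Interleaves two fields back into one frame: even, odd, even, odd, ...
// Assumes both fields have the same number of rows (even frame height).
Image weave(const Image& even, const Image& odd) {
    Image frame;
    for (std::size_t i = 0; i < even.size(); ++i) {
        frame.push_back(even[i]);
        frame.push_back(odd[i]);
    }
    return frame;
}
```

Separating and weaving without any filtering in between gives back the original frame, which is a handy sanity check.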
Syntax
MT
- MT(clip clip, string filter, int threads, int overlap, bool splitvertical)
- clip clip = last
- input clip
- string filter = (no default)
- filter to run multithreaded. Note that the filter must not change both the frame height and width (but colorspace is okay) and that only 1 input clip is allowed. It can be any built-in filter, Avisynth defined filter or external plugin filter as long as the restrictions are observed.
- int threads = 2
- number of threads to run. Set this to the number of threads your computer is able to run concurrently.
- int overlap = 0
- number of pixels to add at the top and bottom border or left and right border. Increase this if you see artifacts where the frame is split.
- bool splitvertical = false
- if true the frames are cut vertically (and the filter is allowed to change the height) else it is cut horizontally (and the filter is allowed to change the width).
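How the threads and overlap parameters could interact is easiest to see with numbers. The sketch below is only an assumption about the bookkeeping, not MT's real splitting code: `Strip` and `splitWithOverlap` are invented names, and `size` stands for the frame extent along the split direction.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical description of one fragment of the frame.
struct Strip {
    int begin;      // first row (or column) this strip is responsible for
    int end;        // one past the last row it is responsible for
    int readBegin;  // begin extended by the overlap, clamped to the frame
    int readEnd;    // end extended by the overlap, clamped to the frame
};

// Cuts [0, size) into `threads` strips and pads each one by `overlap`
// pixels on both sides so the filter sees some context across the cut.
std::vector<Strip> splitWithOverlap(int size, int threads, int overlap) {
    std::vector<Strip> strips;
    for (int i = 0; i < threads; ++i) {
        const int begin = size * i / threads;
        const int end = size * (i + 1) / threads;
        strips.push_back({begin, end,
                          std::max(0, begin - overlap),
                          std::min(size, end + overlap)});
    }
    return strips;
}
```

With a size of 480, two threads and an overlap of 2, the first strip would read rows 0 to 241 and the second rows 238 to 479, while each keeps only its own 240 rows in the output.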
MTi
- MTi(clip clip, string filter)
- clip clip = last
- input clip. Must be mod2 height for RGB and YUY2 colorspaces and mod4 height for the YV12 colorspace.
- string filter = (no default)
- filter to run multithreaded. Note that the filter is allowed to change both width and height at the same time but only 1 input clip is allowed. It can be any built-in filter, Avisynth defined filter or external plugin filter as long as the restrictions are observed.
MTsource
- MTsource(string filter, int delta, int threads, int max_fetch)
- string filter = (no default)
- source filter to run multithreaded. Currently only internal and external source filters are supported (like DirectShowSource, AviSource, MPEG2Source). You can use an Avisynth defined filter or a non-source filter but it might crash or produce frame corruption.
- int delta = 1
- this is how many frames there are between each frame request, so if you are only going to read every second frame set it to 2 or if you are reading the frames backwards set it to -1.
- More complex frame access patterns like SelectEvery(10,3,6,7) are not supported (but might work anyway if the requested frames are in the cache; there will just be some memory wasted on non-requested frames in the cache).
- int threads = 2
- number of threads to run. Set this to the number of threads your computer is able to run concurrently.
- int max_fetch = 30
- This is the maximum number of frames ahead of the currently requested frame that MTsource will fetch. Setting it too low will leave the threads idle most of the time and setting it too high will waste memory.
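As a rough illustration of how delta and max_fetch relate, the following C++ sketch lists the frames a read-ahead cache might request after frame n was asked for. This is a guess at the access pattern based on the parameter descriptions above, not MTsource's implementation; `framesToPrefetch` and its parameter names are hypothetical, and max_fetch is interpreted here as a count of frames.

```cpp
#include <vector>

// Frames a prefetcher might request after frame n was asked for,
// stepping by delta and stopping after maxFetch frames or at either
// end of the clip, whichever comes first.
std::vector<int> framesToPrefetch(int n, int delta, int maxFetch, int numFrames) {
    std::vector<int> frames;
    if (delta == 0) {
        return frames;  // no direction to read ahead in
    }
    for (int i = 1; i <= maxFetch; ++i) {
        const int f = n + i * delta;
        if (f < 0 || f >= numFrames) {
            break;  // ran off the start or the end of the clip
        }
        frames.push_back(f);
    }
    return frames;
}
```

With delta=2 the list after frame 0 would be 2, 4, 6, ...; with delta=-1 it counts down towards frame 0, matching the reverse() case.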
Examples
Ordinary blur:
MT("blur(1)",2,2)
A user defined function also works (this one uses variableblur):
    MT("unsharpen(2,0.7)",2,2)

    function unsharpen(clip c, float variance, float k)
    {
        blr = binomialBlur(c, vary=variance, varc=2, Y=3, U=2, V=2)
        return yv12lutxy(blr, c, "y x - " + string(k) + " * y +", y=3, u=2, v=2)
    }
This one will not produce the intended result but shows how to use the triple quotes:
MT(""" subtitle("Doh") """,4,0)
Example of MTi:
MTi("fft3dfilter()")
produces nearly the same result as
MT("fft3dfilter(interlaced=true)",threads=2)
but for filters that don't natively support interlaced content, it can be easier to use MTi()
Example of MTsource():
    ir=MTSource(""" imagereader("c:\test.png") """,delta=1,threads=2,max_fetch=10)
    as=MTSource(""" avisource("c:\test.avi") """,delta=-1) #delta negative due to reverse()
    ms=MTSource(""" MPEG2Source("c:\test.d2v") """,delta=9) #delta is 9 due to selectevery(9,1)
    stackhorizontal(ir.trim(0,100),as.reverse().trim(0,100),ms.selectevery(9,1).trim(0,100))
Changes in MT 2.5.7.5
Abstract
The modified Avisynth included with MT (Avisynth MT 2.5.7.5) contains the new functions SetMTMode() and GetMTMode() and is needed by MT.dll. Install it by overwriting avisynth.dll in your System32 directory (SysWOW64 on 64-bit Windows). Remember to back up your current avisynth.dll before installing the new one.
Technical info
These functions enable Avisynth to use more than one thread when processing filters. This is useful if you have more than one CPU/core or hyper-threading. This feature is still experimental.
Syntax
GetMTMode
- GetMTMode(bool threads)
- bool threads = false
- if true GetMTMode returns the number of threads used else the current mode is returned (see below).
SetMTMode
- SetMTMode(int mode, int threads)
- Place this on the first line of the Avisynth file to enable temporal multithreading (that is, more than one frame is processed at the same time). Use it later in the script to change the mode for the filters below it.
- int mode = 2
- there are 6 modes:
- Mode 1 is the fastest but only works with a few filters
- Mode 2 should work with most filters but uses more memory
- Mode 3 should work with some of the filters that don't work with mode 2 but is slower
- Mode 4 is a combination of mode 2 and 3 and should work with even more filters but is both slower and uses more memory
- Mode 5 is the slowest (slower than not using SetMTMode) but should work with all filters that don't require linear frameserving (that is, frames being requested in order: frame 0, 1, 2 ... last).
- Mode 6 is a modified mode 5 that might be slightly faster
- A more detailed explanation of modes 1 and 2 can be read here: MT modes explained
- int threads = 0
- number of threads to use. Set to 0 to set it to the number of processors available. It is not possible to change the number of threads other than in the first SetMTMode.
Example
    SetMTMode(2,0) #enables multithreading in mode 2 with as many threads as there are available processors
    LoadPlugin("...\LoadPluginEX.dll") #needed to load Avisynth 2.0 plugins
    LoadPlugin("...\DustV5.dll") #Loads Pixiedust
    import("limitedsharpen.avs")
    src=AVIsource("test.avi")
    SetMTMode(5) #change the mode to 5 for the lines below
    src=src.converttoyuy2().PixieDust() #Pixiedust needs mode 5 to function.
    SetMTMode(2) #change the mode back to 2
    src.LimitedSharpen() #because LimitedSharpen works well with mode 2
    subtitle("Number of threads used: "+string(GetMTMode(true))+" Current MT Mode: "+string(GetMTMode())) #display mode and number of threads in use
How to develop thread-safe filters
Filter construction and destruction is single threaded. Only calls to GetFrame are multithreaded. No linear frame order is assured (unless MT is used instead of SetMTMode) and for each mode there are different restrictions:
- Mode 1: all access to class variables, global variables and static variables must be thread-safe by using appropriate locking (Enter/LeaveCriticalSection etc, no locking needed for read-only variables) because more than 1 thread may access a class instance at a time.
- Mode 2: access to class variables doesn't have to be thread-safe because there is only 1 instance of the class per thread. All global/static variable access must be thread-safe. Because each class instance only processes every other frame, an internal cache (that is, a cache inside the filter) won't work well. I have created PClipLocalStorage to share a pointer between different filter instances.
- Mode 3: Only 1 thread is allowed to execute code from the filter at the same time. When child->GetFrame is called another thread can enter the filter and execute code. That means that class variables/global variables/static variables shouldn't be assigned any values before the last child->GetFrame has been called. Instead local function variables should be used like this:
    PVideoFrame __stdcall AdjustFocusV::GetFrame(int n, IScriptEnvironment* env)
    {
        PVideoFrame frame = child->GetFrame(n, env); //Assigned to a local variable so this will work in mode 3
        env->MakeWritable(&frame);
        if (!line)
            line = new uc[frame->GetRowSize()+32];
        uc* linea = (uc*)(((int)line+15) & -16); // Align 16
        uc* buf = frame->GetWritePtr();
        int pitch = frame->GetPitch();
        int row_size = vi.RowSize();
        int height = vi.height;
        memcpy(linea, buf, row_size); // First row - map centre as upper
        if ((pitch >= ((row_size+7) & -8)) && (env->GetCPUFlags() & CPUF_MMX)) {
            AFV_MMX(linea, buf, height, pitch, row_size, amount);
        } else {
            AFV_C(linea, buf, height, pitch, row_size, amount);
        }
        return frame;
    }
- But not like this:
    PVideoFrame TemporalSoften::GetFrame(int n, IScriptEnvironment* env)
    {
        __int64 i64_thresholds = 0x1000010000100001i64;
        int radius = (kernel-1) / 2;
        int c = 0;
        // Just skip if silly settings
        if ((!luma_threshold) && (!chroma_threshold) || (!radius))
            return child->GetFrame(n, env);
        for (int p = 0; p < 16; p++)
            planeDisabled[p] = false;
        for (p = n-radius; p <= n+radius; p++) {
            frames[p+radius-n] = child->GetFrame(min(vi.num_frames-1, max(p,0)), env);
            //GetFrame assigned to class variable frames. This wouldn't work with Mode 3
            //because the next thread that enters this getframe will overwrite the result
            //from the last thread
        }
        //do stuff
    }
- When using mode 3 there is no need for thread-safe access to class variables. Because there is only 1 instance of the class that processes all frames, internal caches will work much better. The downside is that only 1 thread can execute the filter at a time, so if it's the only slow filter in the script the speed increase won't be that big.
- Mode 4: a combination of mode 2 and 3, so it's okay to assign class variables before the last child->GetFrame has been called because there is a class instance per thread, but the problem with internal caches is the same as in mode 2.
- Mode 5: No restrictions.
- Mode 6: A slightly modified version of mode 5 that might be a little faster.
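The mode 1 rule (lock every access to shared class state, keep per-call data in locals) can be tried out without Avisynth. The sketch below is a portable analogue under stated assumptions: `CountingFilter` and `serveFromManyThreads` are invented for illustration, `getFrame` only mimics the shape of a real GetFrame, and `std::mutex` is swapped in for the Enter/LeaveCriticalSection calls mentioned above.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Stand-in for a filter whose single instance is entered by several
// threads at once (the mode 1 situation).
class CountingFilter {
public:
    // Analogue of GetFrame: per-call data lives in locals, shared state
    // is only touched while holding the mutex.
    int getFrame(int n) {
        int result = n * 2;  // local variable: private to the calling thread
        {
            std::lock_guard<std::mutex> lock(mutex_);
            ++framesServed_;  // member variable: shared, so it is locked
        }
        return result;
    }

    int framesServed() {
        std::lock_guard<std::mutex> lock(mutex_);
        return framesServed_;
    }

private:
    std::mutex mutex_;
    int framesServed_ = 0;
};

// Calls getFrame from `threads` threads, `perThread` times each, and
// returns how many calls the filter counted.
int serveFromManyThreads(CountingFilter& filter, int threads, int perThread) {
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&filter, perThread] {
            for (int n = 0; n < perThread; ++n) {
                filter.getFrame(n);
            }
        });
    }
    for (std::thread& th : pool) {
        th.join();
    }
    return filter.framesServed();
}
```

Without the lock the counter would be a data race and could come out short. Under mode 2 the same class would need no lock for `framesServed_`, since every thread gets its own instance, but the count would then be split across instances.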
PClipLocalStorage
Here is an example of how PClipLocalStorage can be used to share a cache between multiple instances (that are created with mode=2,4):
    class Cache
    {
    public:
        //These functions should be thread-safe. The simplest way is to use a
        //critical section like this
        PVideoFrame GetCachedFrame(int framenumber)
        {
            PVideoFrame retval; //filled in by the code below
            EnterCriticalSection(&cs);
            //Code
            //...
            LeaveCriticalSection(&cs);
            return retval;
        }
        void SetCachedFrame(PVideoFrame frame);
    private:
        CRITICAL_SECTION cs;
    };

    class Sample : public GenericVideoFilter
    {
    public:
        Sample(PClip _child, IScriptEnvironment* env);
        ~Sample();
        PVideoFrame __stdcall GetFrame(int n, IScriptEnvironment* env);
    protected:
        PClipLocalStorage cls;
        Cache* FrameCache;
    };

    Sample::Sample(PClip _child, IScriptEnvironment* env) : GenericVideoFilter(_child), cls(env)
    {
        //if the cache has not been created yet GetValue will return 0
        if (cls->GetValue() == 0) {
            //create the cache and save the address in the PClipLocalStorage
            FrameCache = new Cache();
            cls->SetValue(static_cast<void*>(FrameCache));
        }
        //The cache has been created so assign the address to FrameCache
        else {
            FrameCache = static_cast<Cache*>(cls->GetValue());
        }
    }

    Sample::~Sample()
    {
        //only delete FrameCache if it has not been deleted yet.
        if (cls->GetValue() != 0) {
            delete FrameCache;
            cls->SetValue(0); //Signal that the cache is deleted
        }
    }
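The "first instance creates the cache, later instances reuse it" pattern can also be sketched with standard C++ only. This is not how PClipLocalStorage is implemented; `SharedSlot` and `FrameCache` below are hypothetical, and reference counting via `std::shared_ptr` is swapped in for the manual delete so that the cache goes away only when its last user does.

```cpp
#include <map>
#include <memory>
#include <mutex>

// Hypothetical cache: frame number -> stand-in for frame data.
struct FrameCache {
    std::map<int, int> frames;
};

// Hypothetical stand-in for the shared slot: hands every caller the same
// cache for as long as at least one owner keeps it alive.
class SharedSlot {
public:
    std::shared_ptr<FrameCache> acquire() {
        std::lock_guard<std::mutex> lock(mutex_);
        std::shared_ptr<FrameCache> cache = slot_.lock();
        if (!cache) {
            // First instance (or all previous owners are gone): create it.
            cache = std::make_shared<FrameCache>();
            slot_ = cache;
        }
        return cache;  // later instances get the existing cache
    }

private:
    std::mutex mutex_;
    std::weak_ptr<FrameCache> slot_;
};
```

Each filter instance would call `acquire()` in its constructor and simply drop the pointer in its destructor. Access to the cache contents themselves would still need locking, as in the Cache class above.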
Changelog
- 0.1 - first release.
- 0.2 - Should be more thread safe.
- 0.21 - forgot to comment out a Sleep(0)
- 0.25 - Added the splitvertical option
- 0.3 - More stable (and slower)
- 0.4 - Includes a custom version of Avisynth 2.56 beta that should speed things up
- 0.41 - Minor speed increase
- 0.5 - Requires the included modified Avisynth 2.5.6 or Avisynth 2.6
- 0.6 - Bugfix: height can be changed with splitvertical=true without crashing.
- Also includes modified Avisynth MT 2.5.7.3
- 0.7 - two new filters: MTi, MTsource and Avisynth MT 2.5.7.5
- Avisynth MT 2.5.7.3 - two new functions SetMTMode, GetMTMode
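Links
- tsp's filter page: http://www.avisynth.nl/users/tsp/
- official thread on doom9: http://forum.doom9.org/showthread.php?t=94996
- Please read this page and the support page below before asking for help.
- MT support page: http://web.archive.org/web/20130308191718/http://avisynth.org/mediawiki/MT_support_page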