MaskTools2

From Avisynth wiki

Revision as of 15:18, 18 March 2017

see also MaskTools

Abstract
Author tp7, Manao, mg262, Kurosu
Version 2.0b1
Download

Category Support filters
License MIT but binaries are GPLv2
Discussion Doom9 Thread

MaskTools2 v2.2.x

This is a fork of tp7's MaskTools2 plugin. It supports high bit depth under AviSynth+ (10-16 bits and float), extends existing features and adds new ones, contains a bugfix, and adds AVX and AVX2 optimizations for some filters. This branch is called 2.2.x, as opposed to 2.0.* for the previous MaskTools2. At the moment the "mt_polish" functionality is not available on XP builds.

Difference to MaskTools2 b1

  • project moved to Visual Studio 2015 Update 3.
    Requires VS2015 Update 3 redistributables
  • Fix: mt_merge (and probably other multi-clip filters) could produce corrupted output under specific circumstances, because it used video frame pointers that had already been released from memory
  • no special function names for high bit depth filters
  • filters automatically register their MT mode as MT_NICE_FILTER under AviSynth+
  • Avisynth+ high bit depth support (incl. planar RGB, but color spaces with alpha plane are not yet supported)

All filters now support 10, 12, 14 and 16 bits and float. Note that parameters like threshold are not scaled automatically to the current bit depth.

  • YV411 (8 bit 4:1:1) support
  • mt_merge accepts 4:2:2 clips when luma=true (8-16 bit)
  • mt_merge accepts 4:1:1 clips when luma=true
  • mt_merge to discard U and V automatically when input is greyscale
  • some filters got AVX (float) and AVX2 (integer) support:
    • mt_merge: 8-16 bit: AVX2, float:AVX
    • mt_logic: 8-16 bit: AVX2, float:AVX
    • mt_edge: 10-16 bit and 32 bit float: SSE2/SSE4 optimization
    • mt_edge: 32 bit float AVX
  • mt_polish to recognize new constants and scaling operator, and some other operators introduced in earlier versions.
  • new: mt_lutxyza. Accepts four clips. 4th variable name is 'a' (besides x, y and z)
  • new: mt_luts: weight expressions as an addon for the main expression(s) (martin53's idea)
    • wexpr
    • ywExpr, uwExpr, vwExpr

If the relevant parameter strings exist, the weighting expression is evaluated for each pair of source and neighbourhood pixel values (by lut or in realtime, depending on the bit depth and the "realtime" parameter). The usual lut result is then premultiplied by this weight factor before it is accumulated.
Weights are float values. Weight luts are x,y (2D) luts, as in the base working mode, where x is the base pixel and y is the current pixel from the neighbourhood defined in "pixels".
When the weighting expression is "1", the result is the same as in the basic weightless mode. For modes "average" and "std" the weights are summed up and the result is sum(value_i*weight_i)/sum(weight_i). When all weights are equal to 1.0 the expression yields the plain average, sum(value_i)/n. The same logic applies to min/max/median/etc.: the "old" lut values are premultiplied by the weights before accumulation.
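The weighted accumulation for the "average" mode can be illustrated with a small Python sketch. It only demonstrates the formula sum(value_i*weight_i)/sum(weight_i); the function name `weighted_average` and its arguments are hypothetical and not part of the plugin, and raising an error on a zero weight sum is a choice of this sketch, not documented plugin behaviour.

```python
def weighted_average(values, weights):
    """Return sum(value_i * weight_i) / sum(weight_i)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        # Assumption of this sketch: a zero weight sum is treated as an error.
        raise ValueError("sum of weights must not be zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical per-neighbour lut results.
lut_values = [10.0, 20.0, 60.0]

# All weights equal to 1.0: the plain average sum(value_i) / n.
print(weighted_average(lut_values, [1.0, 1.0, 1.0]))  # 30.0
# Non-uniform weights pull the result toward the heavier samples.
print(weighted_average(lut_values, [1.0, 1.0, 2.0]))  # 37.5
```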

  • expression syntax supporting bit depth independent expressions
    • bit-depth aware scale operators
      Note: the name/character of operator may change in the future, e.g. #B becomes @B to avoid compatibility problems.
      • operator #B scales from 8 bit to current bit depth using bit-shifts
        Use this for YUV. "235 #B" always results in max luma
      • operator #F scales from 8 bit to current bit depth using full range stretch
        "255 #F" results in maximum pixel value of current bit depth.
        Calculation: x/255*65535 for a 8->16 bit sample (rgb)
    • hints for non-8 bit based constants:
      Added configuration keywords i8, i10, i12, i14, i16 and f32 to tell the expression evaluator the bit depth of the values that are to be scaled by the #B and #F operators (see "sbitdepth"). By default #B and #F scale from 8 bit to the bit depth of the clip.


i8 .. i16 and f32 set the default conversion base to 8..16 bits or float, respectively.
These keywords can appear anywhere in the expression, but only the last occurrence is effective, and it applies to the whole expression.
Examples

 8 bit video, no modifier: "x y - 256 #B *" evaluates as "x y - 256 *"
 10 bit video, no modifier: "x y - 256 #B *" evaluates as "x y - 1024 *"
 10 bit video: "i16 x y - 65536 #B *" evaluates as "x y - 1024 *"
 8 bit video: "i10 x y - 512 #B *" evaluates as "x y - 128 *"                  
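The arithmetic behind these examples can be reproduced with a short Python sketch. It is an illustration for integer bit depths only, assuming #B is a plain shift by the bit depth difference and #F is a stretch between the two maximum values, as described above; the helper names `scale_b` and `scale_f` are hypothetical, and the float (f32) case is not covered.

```python
def scale_b(value, source_bits, target_bits):
    """Bit-shift scaling, as described for the #B operator (integer depths only)."""
    shift = target_bits - source_bits
    return value << shift if shift >= 0 else value >> -shift


def scale_f(value, source_bits, target_bits):
    """Full range stretch, as described for the #F operator (integer depths only)."""
    source_max = (1 << source_bits) - 1
    target_max = (1 << target_bits) - 1
    return value / source_max * target_max


print(scale_b(256, 8, 10))     # 1024  (10 bit video, no modifier)
print(scale_b(65536, 16, 10))  # 1024  (10 bit video, i16)
print(scale_b(512, 10, 8))     # 128   (8 bit video, i10)
print(scale_f(255, 8, 16))     # 65535.0
```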


    • new pre-defined, bit depth aware constants
      • bitdepth: automatic silent parameter of the lut expression
      • range_half --> autoscaled 128 or 0.5 for float
      • range_max --> 255/1023/4095/16383/65535 or 1.0 for float
      • range_size --> 256/1024...65536
      • ymin, ymax, cmin, cmax --> 16/235 and 16/240 autoscaled.
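As a rough illustration, the values of these constants for integer bit depths can be derived as follows. This is a sketch based only on the descriptions in this page (limited range values shifted left by bitdepth-8); the function name `predefined_constants` is hypothetical, and the float values (0.5 and 1.0) are deliberately left out.

```python
def predefined_constants(bitdepth):
    """Return the bit depth aware constants for an integer bit depth (8..16)."""
    shift = bitdepth - 8
    return {
        "bitdepth": bitdepth,
        "range_half": 128 << shift,
        "range_max": (1 << bitdepth) - 1,
        "range_size": 1 << bitdepth,
        "ymin": 16 << shift,
        "ymax": 235 << shift,
        "cmin": 16 << shift,
        "cmax": 240 << shift,
    }


for depth in (8, 10, 16):
    print(depth, predefined_constants(depth))
```

For 10 bits this gives range_max = 1023, ymin = 64 and ymax = 940; for 16 bits ymax - ymin is 56064.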


Example #1 (bit depth dependent, all constants are treated as-is):

 expr8_luma = "x 16 - 219 / 255 *"
 expr10_luma = "x 64 - 876 / 1023 *"
 expr16_luma = "x 4096 - 56064 / 65535 *"


Example #2 (new, with auto-scale operators)

 expr_luma =  "x 16 #B - 219 #B / 255 #F *"
 expr_chroma =  "x 16 #B - 224 #B / 255 #F *"


Example #3 (new, with constants)

 expr_luma = "x ymin - ymax ymin - / range_max *"
 expr_chroma = "x cmin - cmax cmin - / range_max *"
  • new expression syntax: auto scale modifiers for float clips (test, may change):

Keyword at the beginning of the expression:

    • clamp_f_i8, clamp_f_i10, clamp_f_i12, clamp_f_i14 or clamp_f_i16 for scaling and clamping
    • clamp_f_f32 or clamp_f: for clamping the result to 0..1


Input values 'x', 'y', 'z' and 'a' are autoscaled by 255.0, 1023.0, ... 65535.0 before the expression evaluation, so the working range is similar to native 8, 10, ... 16 bits. The predefined constants 'range_max', etc. will behave for 8, 10,..16 bits accordingly.
The result is automatically scaled back to 0..1 _and_ is clamped to that range. When using clamp_f_f32 (or clamp_f) the scale factor is 1.0 (so there is no scaling), but the final clamping will be done anyway. No integer rounding occurs.

 expr = "x y - range_half +"  # good for 8..32 bits but float is not clamped
 expr = "clamp_f x y - range_half +"  # good for 8..32 bits and float clamped to 0..1
 expr = "x y - 128 + "  # good for 8 bits
 expr = "clamp_f_i8 x y - 128 +" # good for 8 bits and float, float will be clamped to 0..1
 expr = "clamp_f_i8 x y - range_half +" # good for 8..32 bits, float will be clamped to 0..1
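The scale, evaluate, scale back and clamp sequence for float clips can be sketched in Python. This is a model of the behaviour described above, not the plugin's code: `eval_with_clamp_f` is a hypothetical helper, and the expression is stood in for by an ordinary Python function.

```python
def eval_with_clamp_f(expr_fn, pixels, scale=255.0):
    """Emulate a clamp_f_iN modifier on float pixels in 0..1.

    expr_fn: Python callable standing in for the expression.
    scale:   255.0 for clamp_f_i8, 1023.0 for clamp_f_i10, ...,
             1.0 for clamp_f / clamp_f_f32 (no scaling, only clamping).
    """
    scaled_inputs = [p * scale for p in pixels]   # autoscale x, y, ...
    result = expr_fn(*scaled_inputs) / scale      # scale back to 0..1
    return min(1.0, max(0.0, result))             # clamp, no integer rounding


def diff_plus_128(x, y):
    """Stand-in for "clamp_f_i8 x y - 128 +"."""
    return x - y + 128


print(eval_with_clamp_f(diff_plus_128, [1.0, 0.0]))  # 1.0 (clamped from 383/255)
print(eval_with_clamp_f(diff_plus_128, [0.0, 1.0]))  # 0.0 (clamped from -127/255)
```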
  • parameter "stacked" (default false) for filters with stacked format support

Stacked support is not intentional, but since tp7 did it, I did not remove the feature. Filters currently without stacked support will never have it.

  • parameter "realtime" for lut-type filters, slower but at least works on those bit depths where LUT tables would occupy too much memory.

For bit depth limits where realtime = true is set as the default working mode, see table below.
The realtime=true default can be overridden: one can experiment and force realtime=false even for a 16 bit lutxy (8 GBytes lut table!, x64 only) or for an 8 bit lutxyza (4 GBytes lut table).
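The table sizes quoted above follow from simple arithmetic, sketched here in Python. The sketch assumes one stored output sample per input combination, 1 byte per sample at 8 bits and 2 bytes at 10-16 bits; that storage layout is an assumption that happens to reproduce the quoted figures, and `lut_size_bytes` is a hypothetical helper.

```python
def lut_size_bytes(bitdepth, num_inputs):
    """Size of a full lookup table: one output sample per input combination."""
    # Assumption: 8 bit samples take 1 byte, 10-16 bit samples take 2 bytes.
    bytes_per_sample = 1 if bitdepth == 8 else 2
    entries = (1 << bitdepth) ** num_inputs
    return entries * bytes_per_sample


GIB = 1 << 30
MIB = 1 << 20
print(lut_size_bytes(16, 2) / GIB)  # 8.0   (16 bit lutxy)
print(lut_size_bytes(8, 4) / GIB)   # 4.0   (8 bit lutxyza)
print(lut_size_bytes(8, 3) / MIB)   # 16.0  (8 bit lutxyz)
```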

  • Feature matrix
                  8 bit | 10-16 bit | float | stacked | realtime
     mt_invert      X         X         X        -
     mt_binarize    X         X         X        X
     mt_inflate     X         X         X        X
     mt_deflate     X         X         X        X
     mt_inpand      X         X         X        X
     mt_expand      X         X         X        X
     mt_lut         X         X         X        X      when float
     mt_lutxy       X         X         X        -      when bits>=14
     mt_lutxyz      X         X         X        -      when bits>=10
     mt_lutxyza     X         X         X        -      always
     mt_luts        X         X         X        -      when bits>=14 
     mt_lutf        X         X         X        -      when bits>=14   
     mt_lutsx       X         X         X        -      when bits>=10
     mt_lutspa      X         X         X        -
     mt_merge       X         X         X        X
     mt_logic       X         X         X        X
     mt_convolution X         X         X        -
     mt_mappedblur  X         X         X        -
     mt_gradient    X         X         X        -
     mt_makediff    X         X         X        X
     mt_average     X         X         X        X
     mt_adddiff     X         X         X        X
     mt_clamp       X         X         X        X
     mt_motion      X         X         X        -
     mt_edge        X         X         X        -
     mt_hysteresis  X         X         X        -
     mt_infix/mt_polish: available only on non-XP builds

MaskTools2 b1

This is a fork of Manao's MaskTools2 plugin. It mostly contains performance improvements, bugfixes and some little things that make the plugin more "mature". This branch will be called b* as opposed to a* like the original MaskTools2.

Difference to MaskTools2 a48

  • Works correctly with AviSynth 2.6 Alpha 4/5 and RC 1 (including MT). Doesn't work with previous alphas.
  • Much cleaner and easier to understand codebase; the source code is now licensed under MIT.
  • No MMX-optimized versions. MMX is too old to support and is always slower than SSE2.
  • all luts: faster LUT calculation, faster startup, reduced memory footprint if the same LUT is used for multiple planes or some planes aren't processed. For example mt_lutxyz(c1, c2, c3, "x y + z -") will use only 16MBs of memory instead of 48MBs.
  • mt_lutspa: does not depend on source clip performance as it doesn't get requested at all (unless mode 2 is used). Always much faster (5-o9k times).
  • all filters: faster modes 2, 4 and 5 (copy), negative (memset) modes.
  • mt_hysteresis: 3-4 times better performance.
  • all luts: performance in mode 3 with an empty LUT is now identical to mode 2. Thus mt_lut(chroma="128") is a bit faster than Grayscale() instead of being much slower.
  • mt_merge: luma=true now supports YV24.
  • mt_luts: correct value is used as x. More info here.
  • sobel/roberts/laplace modes of mt_edge: better performance when SSSE3 is available.
  • mt_edge("cartoon"): 10 times faster when SSE2 is available.
  • all asm-optimized filters: same performance on any resolution up to mod-1. Original MaskTools2 used unoptimized version for any non-mod8 clips.

All filters were tested on a Core i7 860. Performance might be a bit different on other CPUs.

Download

MaskTools v2.2.4 (x86/x64)


Runtime dependencies:


MaskTools b1 comes in three variations:


Runtime dependencies:

  • vcredist_x86.exe is required for MaskTools2-x86
  • vcredist_x64.exe is required for MaskTools2-x64


Filters

MaskTools2 contains a set of filters designed to create, manipulate and use masks. Masks, in video processing, are a way to give a relative importance to each pixel. You can, for example, create a mask that selects only the green parts of the video, and then replace those parts with another video. To give the most control over the handling of masks, the filters rely on the fact that the luma and chroma planes can be uncorrelated. That means that a single video will always be treated by the filters as 3 independent planes. The same applies to masks, which means that a mask clip in fact contains 3 masks, one for each plane.

The filters have a set of common parameters that mainly concern what processing to do on each plane. All filters only work with planar colorspaces (Y8, YV12, YV16 and YV24; AviSynth 2.5.8 only supports YV12!).
Beginning with v2.2.4 YUV and planar RGB 10-16 bit and 32 bit float color spaces are supported when using AviSynth+ (r2294-).

Here is an exhaustive list of the filters contained in MaskTools2 (see developer's page here for more information)

Filter           | Description | Color format

Masks creation
mt_edge          | Creates edge masks. | Y8, YV12, YV16, YV24
mt_motion        | Creates motion masks. | Y8, YV12, YV16, YV24

Masks operation
mt_invert        | Inverts masks. | Y8, YV12, YV16, YV24
mt_binarize      | Transforms soft masks into hard masks. | Y8, YV12, YV16, YV24
mt_logic         | Combines masks using logic operators. | Y8, YV12, YV16, YV24
mt_hysteresis    | Combines masks, making the first one grow into the second. | Y8, YV12, YV16, YV24

Masks merging
mt_merge         | Merges two clips according to a mask. | Y8, YV12, YV16, YV24

Morphologic operators
mt_expand        | Expands the mask / the video. | Y8, YV12, YV16, YV24
mt_inpand        | Inpands the mask / the video. | Y8, YV12, YV16, YV24
mt_inflate       | Inflates the mask / the video. | Y8, YV12, YV16, YV24
mt_deflate       | Deflates the mask / the video. | Y8, YV12, YV16, YV24

LUT operators
mt_lut           | Applies an expression to all the pixels of a mask / video. | Y8, YV12, YV16, YV24
mt_lutxy         | Applies an expression to all the pixels of two masks / videos. | Y8, YV12, YV16, YV24
mt_lutxyz        | Applies an expression to all the pixels of three masks / videos. | Y8, YV12, YV16, YV24
mt_lutxyza       | Applies an expression to all the pixels of four masks / videos. (v2.2.4-) | Y8, YV12, YV16, YV24 (+high bit depth)
mt_lutf          | Creates a uniform picture from the collection of computation on pixels of two clips. | Y8, YV12, YV16, YV24
mt_luts          | Applies an expression taking neighbouring pixels into account. | Y8, YV12, YV16, YV24
mt_lutsx         | Applies an expression taking neighbouring pixels into account, in a different way. | Y8, YV12, YV16, YV24
mt_lutspa        | Computes the value of a pixel according to its spatial position. | Y8, YV12, YV16, YV24

Support operators
mt_makediff      | Subtracts two clips. | Y8, YV12, YV16, YV24
mt_adddiff       | Adds back a difference of two clips. | Y8, YV12, YV16, YV24
mt_clamp         | Clamps a clip between two other clips. | Y8, YV12, YV16, YV24
mt_average       | Averages two clips. | Y8, YV12, YV16, YV24

Convolutions
mt_convolution   | Applies a separable convolution on the picture. | Y8, YV12, YV16, YV24
mt_mappedblur    | Applies a special 3x3 convolution on the picture. | Y8, YV12, YV16, YV24

Helpers
mt_square        | Creates a string describing a square. | Y8, YV12, YV16, YV24
mt_rectangle     | Creates a string describing a rectangle. | Y8, YV12, YV16, YV24
mt_freerectangle | Creates a string describing a rectangle. | Y8, YV12, YV16, YV24
mt_diamond       | Creates a string describing a diamond. | Y8, YV12, YV16, YV24
mt_losange       | Creates a string describing a lozenge. | Y8, YV12, YV16, YV24
mt_freelosange   | Creates a string describing a lozenge. | Y8, YV12, YV16, YV24
mt_circle        | Creates a string describing a circle. | Y8, YV12, YV16, YV24
mt_ellipse       | Creates a string describing an ellipse. | Y8, YV12, YV16, YV24
mt_freeellipse   | Creates a string describing an ellipse. | Y8, YV12, YV16, YV24
mt_polish        | Creates a reverse polish expression from an infix one. | -
mt_infix         | Creates an infix expression from a reverse polish one. | -


Common parameters

As said previously, all the filters - except the helpers - share a common set of parameters. These parameters are used to tell what processing to do on each plane / channel, and what area of the video to process.

int  Y = 3
int  U = 1
int  V = 1
These three values describe the actual processing mode that is to be used on each plane / channel. Here is how the modes are coded :
  • x = -255..0 : all the pixels of the plane will be set to -x.
  • x = 1 : the plane will not be processed. That means the content of the plane after the filter is pure garbage.
  • x = 2 : the plane of the first input clip will be copied.
  • x = 3 : the plane will be processed with the processing the filter is designed to do.
  • x = 4 (when applicable) : the plane of the second input clip will be copied.
  • x = 5 (when applicable) : the plane of the third input clip will be copied.
As you can see, the default parameters are chosen to process only the luma and not to care about the chroma. That is because most video processing doesn't touch the chroma when handling 4:2:0.
string  chroma = ""
When defined, the value contained in this string will overwrite the U & V processing modes.
This is a nice addition proposed by mg262 that makes the filter more user friendly. Allowed values for chroma are:
  • "process" : set u = v = 3.
  • "copy" or "copy first" : set u = v = 2.
  • "copy second" : set u = v = 4.
  • "copy third" : set u = v = 5.
  • "xxx", where xxx is a number : set u = v = -xxx.
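The mapping from the chroma string to the U and V modes can be summarised in a few lines of Python. This is only a model of the list above; `chroma_to_modes` is a hypothetical name, and the sketch does not try to reproduce how the plugin handles unknown or differently cased strings.

```python
def chroma_to_modes(chroma):
    """Map a chroma string to the (U, V) processing modes listed above."""
    named = {
        "process": 3,
        "copy": 2,
        "copy first": 2,
        "copy second": 4,
        "copy third": 5,
    }
    if chroma in named:
        mode = named[chroma]
    else:
        # "xxx", where xxx is a number: u = v = -xxx.
        # int() raises ValueError for anything that is not a number.
        mode = -int(chroma)
    return mode, mode


print(chroma_to_modes("process"))      # (3, 3)
print(chroma_to_modes("copy second"))  # (4, 4)
print(chroma_to_modes("128"))          # (-128, -128)
```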
int  offX = 0
int  offY = 0
offx and offy are the top left coordinates of the box where the actual processing shall occur. Everything outside that box will be garbage.
int  w = -1
int  h = -1
w and h are the width and height of the processed box. -1 means that the box extends to the lower right corner of the video.
This also means that default settings are meant to process the whole picture.
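How the four box parameters combine can be shown with a small Python sketch. It implements only what is stated above (-1 extends the box to the lower right corner); `resolve_box` is a hypothetical helper, and clipping of an oversized box to the frame is not modelled.

```python
def resolve_box(width, height, offx=0, offy=0, w=-1, h=-1):
    """Return (left, top, box_width, box_height) of the processed area."""
    box_w = width - offx if w == -1 else w
    box_h = height - offy if h == -1 else h
    return offx, offy, box_w, box_h


print(resolve_box(640, 480))                   # (0, 0, 640, 480), whole picture
print(resolve_box(640, 480, offx=16, offy=8))  # (16, 8, 624, 472)
print(resolve_box(640, 480, 16, 8, 100, 50))   # (16, 8, 100, 50)
```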


Reverse polish notation

A lot of filters accept custom functions defined by an expression written in reverse polish notation. You may not be accustomed to this notation, so here are a few pointers :

  • The basic concept behind the notation is to write the operator / function after its arguments. Hence, "x + y" in infix notation becomes "x y +" in reverse polish, and "(3 + 5) * x" becomes "3 5 + x *".
  • As the last example shows, the great asset of the notation is that it doesn't need parentheses. The expression that would have been enclosed in parentheses ( "3 + 5" ) is correctly computed, because the expression is read from left to right, and when the "+" is encountered its two operands are unmistakably known.
  • The supported operators are : "+", "-", "*", "/", "%" (modulo) and "^" (power)
  • The supported functions are : "sin", "cos", "tan", "asin", "acos", "atan", "exp", "log", "abs", "round", "clip", "min", "max", "ceil", "floor", "trunc".
  • Making the assumption that a positive float is "true", and a negative one is "false", we can also define boolean operators : "&", "|", "&!" (and not), "°" (xor), "@" (xor).
  • We can create boolean values with the following comparison operators : "<", ">", "<=", ">=", "!=", "==", "=" (same as "==").
  • Binary operators. Since all intermediate values are internally doubles, the parameters are first converted to 64 bit integers (unsigned or signed), and after the bit operation the result is converted back to double. Working with binary data is therefore neither fast nor does it have real 64 bit integer precision.
    • Unsigned: "&u" (and), "|u" (or), "°u" (xor), "@u" (xor), "~u" (negate), "<<" or "<<u" (shift left), ">>" or ">>u" (shift right).
    • Signed: "&s" (and), "|s" (or), "°s" (xor), "@s" (xor), "~s" (negate), "<<s" (shift left), ">>s" (shift right).
  • autoscale operators (v2.2.4-)
    • scale from sbitdepth to bitdepth using bit-shift method: "@B" (#B until v2.2.4, may change to a meaningful char mnemonic)
    • scale from sbitdepth to bitdepth using full scale stretch method: "@F" (#F until v2.2.4)
  • The variables "x", "y", "z" and "a" (when applicable) contain the value of the pixel. It is an integer from 0 to (2^bitdepth)-1 (e.g. 0..255 for 8 bits). For float the range is generally 0..1.0.
  • The constant "pi" can be used.
  • The constant "bitdepth" (8-16, 32) for the input bitdepth (v2.2.4-)
  • The constant "sbitdepth" (8-16, 32) as the bitdepth of constants to scale (v2.2.4-)
  • Other predefined constants for bit-depth dependent values (v2.2.4-)
    • "range_half": 128 for 8 bits, 2^(bitdepth-1) in general, 0.5 for 32 bit float
    • "range_max": 255 for 8 bits, (2^bitdepth)-1 in general, 1.0 for 32 bit float
    • "range_size": 256 for 8 bits, (2^bitdepth) in general, 1.0 for 32 bit float
    • "ymin", "ymax": luma min and max value. 16 and 235 for 8 bits, shifted left by (bitdepth-8) in general
    • "cmin", "cmax": chroma min and max value. 16 and 240 for 8 bits, shifted left by (bitdepth-8) in general
  • Finally, there's a ternary operator : "?", which acts like a "if .. then .. else .."
  • All the computations are made in 64 bit doubles, and the final result is rounded to the nearest integer, in the range [0..255], [0..1023], .. [0..1.0] etc. depending on the clip's bitdepth.
  • Throughout the whole documentation, you'll be able to find plenty of examples.
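To make the stack-based reading of such expressions concrete, here is a minimal Python sketch of a reverse polish evaluator covering a small subset of the operators above. It is not the plugin's evaluator: `eval_rpn` and `to_pixel` are hypothetical names, the encoding of comparison results as 1.0 / -1.0, the operand order of "?" (condition, then-value, else-value) and the rounding of exact halves are assumptions of this sketch.

```python
import math

BINARY = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / b,
    "%": math.fmod,
    "^": math.pow,
    "min": min,
    "max": max,
    # Assumption: comparisons yield a positive value for true, negative for false.
    ">": lambda a, b: 1.0 if a > b else -1.0,
    "<": lambda a, b: 1.0 if a < b else -1.0,
}
UNARY = {"abs": abs, "sin": math.sin, "cos": math.cos, "exp": math.exp}


def eval_rpn(expr, **variables):
    """Evaluate a reverse polish expression; tokens are separated by spaces."""
    stack = []
    for token in expr.split():
        if token in BINARY:
            b = stack.pop()          # second operand is on top of the stack
            a = stack.pop()
            stack.append(BINARY[token](a, b))
        elif token in UNARY:
            stack.append(UNARY[token](stack.pop()))
        elif token == "?":
            if_false = stack.pop()
            if_true = stack.pop()
            condition = stack.pop()
            stack.append(if_true if condition > 0 else if_false)
        elif token == "pi":
            stack.append(math.pi)
        elif token in variables:
            stack.append(float(variables[token]))
        else:
            stack.append(float(token))   # numeric constant
    if len(stack) != 1:
        raise ValueError("malformed expression")
    return stack[0]


def to_pixel(value, bitdepth=8):
    """Round to the nearest integer and clamp to an integer bit depth's range."""
    rounded = math.floor(value + 0.5)    # assumption: halves round up
    return int(min((1 << bitdepth) - 1, max(0, rounded)))


print(eval_rpn("3 5 + x *", x=2))                   # 16.0
print(to_pixel(eval_rpn("x y +", x=100, y=200)))    # 255 (clamped)
print(eval_rpn("x 128 > 255 0 ?", x=200))           # 255.0
```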

Changelog

See the Github repository for v2.2.x, or MaskTools2 b1 GitHub commit log for tp7's changes; see this page for older changes.

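External Links

  • GitHub (https://github.com/pinterf/masktools/tree/16bit) - Source code repository for MaskTools2 2.2.x.
  • GitHub (http://github.com/tp7/masktools) - Source code repository for MaskTools2 b1.
  • Doom9 Forum (http://forum.doom9.org/showthread.php?t=98985) - Original MaskTools2 discussion thread.
  • Doom9 Forum (https://forum.doom9.org/showthread.php?t=174333) - v2.2.x discussion thread.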

Guides:



