Expr

Revision as of 10:27, 2 July 2018

AVS+
This feature is specific to AviSynthPlus. It is not supported in other AviSynth versions.

Applies a mathematical function, defined by an expression string, on the pixels of the source clip(s). A different expression may be set for each color channel. Users of MaskTools may be familiar with this concept.

Expr generates assembly code that normally uses two 128-bit (SSE2) or 256-bit (AVX2) registers ("lanes"), thus processing 8 (SSE2) or 16 (AVX2) pixels per internal cycle.
The experimental parameter optSingleMode=true makes the internal compiler generate instructions for only one register (4 pixels with SSE2, 8 with AVX2). The parameter was introduced to test the speed of x86 code using one working register. Very complex expressions can use too many XMM/YMM registers, which are then "swapped" to memory slots, and that can be slow. Using optSingleMode=true may result in fewer registers being used, with no need to swap them to memory slots.

bool optSSE2 = (auto)

If false, disable SSE2.

Enables or disables SSE2 code generation when in non-AVX2 mode. Setting optSSE2=false and optAVX2=false forces expression processing in a slow interpreted way (C language).

bool clamp_float = false

If true, clamps 32-bit float pixels to their valid ranges: 0..1 for luma and for RGB color spaces, and -0.5..0.5 for the YUV chroma (U and V) channels.

Default false: as usual, 32-bit float pixels are not clamped.

Ignored when scale_inputs scales 32-bit float pixels.
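
The clamping ranges above can be modelled with a few lines of Python. This is only an illustrative sketch of the stated ranges, not the filter's actual code; the function name clamp_float_pixel and the is_chroma flag are made up for the example.

```python
def clamp_float_pixel(value, is_chroma=False):
    """Clamp a 32-bit float pixel to the ranges given for clamp_float=true."""
    # Luma and RGB channels: 0.0 .. 1.0; YUV chroma (U, V): -0.5 .. 0.5
    lo, hi = (-0.5, 0.5) if is_chroma else (0.0, 1.0)
    return min(max(value, lo), hi)

print(clamp_float_pixel(1.25))                  # 1.0
print(clamp_float_pixel(-0.7, is_chroma=True))  # -0.5
```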

string scale_inputs = "none"

Autoscales any input bit depth to 8-16 bits for internal expression use. The conversion method is either full range (stretch) or limited YUV range (like a bit shift). The feature is similar to the one in masktools2 v2.2.15.

The primary reason for this feature is to allow "easy" reuse of expressions originally written and optimized for 8 bits.

"int" : scales limited range videos, only integer formats (8-16bits) to 8 (or bit depth specified by 'i8'..'i16')
"intf": scales full range videos, only integer formats (8-16bits) to 8 (or bit depth specified by 'i8'..'i16')
"float" or "floatf" : only scales 32 bit float format to 8 bit range (or bit depth specified by 'i8'..'i16')
"all": scales videos to 8 (or bit depth specified by 'i8'..'i16') - conversion uses limited_range logic (mul/div by two's power)
"allf": scales videos to 8 (or bit depth specified by 'i8'..'i16') - conversion uses full scale logic (stretch)
"none": no magic

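The difference between the two conversion methods can be illustrated for integer formats with the Python sketch below. It assumes that the limited-range method multiplies or divides by a power of two, and that the full-range method stretches so that the source maximum maps to the target maximum. The function names are hypothetical and float input handling is not modelled.

```python
def scale_limited(value, src_bits, dst_bits=8):
    """Limited-range style: multiply/divide by a power of two (like a bit shift)."""
    return value * 2.0 ** (dst_bits - src_bits)

def scale_full(value, src_bits, dst_bits=8):
    """Full-range style: stretch so the source maximum maps to the target maximum."""
    return value * ((1 << dst_bits) - 1) / ((1 << src_bits) - 1)

print(scale_limited(940, 10))  # 10-bit 940 -> 235.0
print(scale_full(1023, 10))    # 10-bit 1023 -> 255.0
```

With the limited-range method, 10-bit limits such as 64 and 940 land on the familiar 8-bit values 16 and 235, which is why expressions written for 8 bits can be reused.
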
Expressions

Expr accepts 1 to 26 source clips, up to four expression strings (one per color plane), an optional output format string, and some debug options. Output video format is inherited from the first clip, when there is no format override. All clips have to match in their width, height and chroma subsampling.

Expressions are evaluated on each plane: Y, U, V (and A) or R, G, B (and A). When an expression string is not specified, the previous expression is used for that plane, except for plane A (alpha), which is copied by default. When an expression is an empty string (""), the relevant plane is copied (if the output clip has the same bit depth). When an expression is a single clip reference letter ("x") and the source and target bit depths are the same, the relevant plane is copied. When an expression is constant (after constant folding), the relevant plane is filled using an optimized memory fill method.

Example: Expr(clip, "255", "128", "128") fills all three planes.
Example: Expr(clip, "x", "range_half", "range_half") copies luma, fills U and V with 128/512/... (bit depth dependent)

Other optimizations: GetFrame is not called for input clips that are neither referenced nor plane-copied.
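
The per-plane defaulting rules can be sketched in Python as follows. This is a simplified model, not the filter's logic: it assumes matching bit depths, requires at least one expression, and only recognises a single numeric literal as "constant" (real constant folding covers more, for example range_half). The names resolve_planes and is_number are invented for the illustration.

```python
def is_number(token):
    """True if the token parses as a numeric literal."""
    try:
        float(token)
        return True
    except ValueError:
        return False

def resolve_planes(exprs, planes=("Y", "U", "V", "A")):
    """Return {plane: (action, expression)} following the defaulting rules."""
    result = {}
    previous = None
    for i, plane in enumerate(planes):
        if i < len(exprs):
            expr = exprs[i]
        elif plane == "A":
            expr = ""        # alpha is copied when no expression is given
        else:
            expr = previous  # reuse the previous plane's expression
        previous = expr
        tokens = expr.split()
        if not tokens:
            action = "copy"  # empty string: plane copy
        elif len(tokens) == 1 and len(tokens[0]) == 1 and tokens[0].isalpha() and tokens[0].islower():
            action = "copy"  # single clip letter such as "x"
        elif len(tokens) == 1 and is_number(tokens[0]):
            action = "fill"  # constant expression: memory fill
        else:
            action = "evaluate"
        result[plane] = (action, expr)
    return result

print(resolve_planes(["x", "128"]))
```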

Expressions are written in RPN.

Expressions use 32 bit float precision internally.
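
A practical consequence of 32-bit float precision is that integers are only exact up to about 24 bits. The Python sketch below shows this by round-tripping values through a 32-bit float with the standard struct module; the helper name to_float32 is made up for the example.

```python
import struct

def to_float32(value):
    """Round a Python float (64-bit) to the nearest 32-bit float and back."""
    return struct.unpack("f", struct.pack("f", value))[0]

# Every 16-bit pixel value survives the conversion unchanged.
print(to_float32(65535.0) == 65535.0)

# 2**24 is still exact, but 2**24 + 1 cannot be represented.
print(to_float32(16777216.0) == 16777216.0)
print(to_float32(16777217.0) == 16777217.0)
```

All pixel values of 8 to 16 bit formats are therefore exact, while large intermediate results (for example products of two 16-bit values) may lose their lowest bits.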

For 8..16 bit formats output is rounded and clamped from the internal 32 bit float representation to valid 8, 10, ... 16 bits range. 32 bit float output is not clamped at all.
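
The final store step for integer formats can be pictured with the Python sketch below. It assumes rounding to nearest by adding 0.5 and truncating; the exact rounding used by the filter is not stated here, so treat that detail and the name store_pixel as assumptions.

```python
def store_pixel(value, bits):
    """Clamp a float result to the valid range of an integer format and round it."""
    max_value = (1 << bits) - 1
    clamped = min(max(value, 0.0), float(max_value))
    # Round to nearest; the handling of exact .5 ties is an assumption.
    return int(clamped + 0.5)

print(store_pixel(300.2, 8))    # 255
print(store_pixel(-3.0, 8))     # 0
print(store_pixel(127.6, 8))    # 128
print(store_pixel(1023.4, 10))  # 1023
```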

Expr language/RPN elements

Clips: letters x, y, z, a..w. x is the first clip parameter, y is the second one, etc.
Math: * / + -
% (modulo), like fmod. Example: result = x - trunc(x/d)*d. Note: the internal 32-bit float can hold only a 24 bit integer number (approximately)
Math constant: pi
Functions: min, max, sqrt, abs, neg, exp, log, pow, ^ (pow and ^ are synonyms)
Function: clip, a three-operand function for clipping. Example: x 16 240 clip means min(max(x,16),240)
Functions: sin cos tan asin acos atan (no SSE2/AVX2 optimization when they appear in Expr)
Logical: > < = >= <= and or xor not == & | != (synonyms: == and =, & and and, | and or)
Ternary operator: ? Example: x 128 < x y ?
Duplicate stack elements: dup, dupn (dup1, dup2, ...)
Swap stack elements: swap, swapn (swap1, swap2, ...)
Scale by bit shift: scaleb (operand is treated as being a number in 8 bit range unless i8..i16 or f32 is specified)
Scale by full scale stretch: scalef (operand is treated as being a number in 8 bit range unless i8..i16 or f32 is specified)
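
For readers new to RPN, the following Python sketch evaluates a subset of the elements listed above for a single pixel. It is a reference model only, not how Expr works internally (Expr compiles the expression to machine code). It assumes that comparisons push 1.0 for true and 0.0 for false and that the ternary operator treats values greater than 0 as true. dupN, swapN, the trigonometric functions, the logical operators and scaleb/scalef are left out, and the name eval_rpn is hypothetical.

```python
import math

BINARY_OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / b,
    "%": math.fmod,
    "min": min,
    "max": max,
    "pow": math.pow,
    "^": math.pow,
    # Comparisons push 1.0 for true and 0.0 for false (assumed encoding).
    ">": lambda a, b: float(a > b),
    "<": lambda a, b: float(a < b),
    ">=": lambda a, b: float(a >= b),
    "<=": lambda a, b: float(a <= b),
    "=": lambda a, b: float(a == b),
    "==": lambda a, b: float(a == b),
    "!=": lambda a, b: float(a != b),
}
UNARY_OPS = {
    "sqrt": math.sqrt, "abs": abs, "neg": lambda a: -a,
    "exp": math.exp, "log": math.log,
}

def eval_rpn(expr, **clips):
    """Evaluate an RPN expression for one pixel; clips maps letters to values."""
    stack = []
    for tok in expr.split():
        if tok in BINARY_OPS:
            b = stack.pop()
            a = stack.pop()
            stack.append(BINARY_OPS[tok](a, b))
        elif tok in UNARY_OPS:
            stack.append(UNARY_OPS[tok](stack.pop()))
        elif tok == "clip":
            hi = stack.pop()
            lo = stack.pop()
            value = stack.pop()
            stack.append(min(max(value, lo), hi))
        elif tok == "?":
            if_false = stack.pop()
            if_true = stack.pop()
            cond = stack.pop()
            stack.append(if_true if cond > 0 else if_false)
        elif tok == "dup":
            stack.append(stack[-1])
        elif tok == "swap":
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif tok == "pi":
            stack.append(math.pi)
        elif tok in clips:
            stack.append(float(clips[tok]))
        else:
            stack.append(float(tok))
    if len(stack) != 1:
        raise ValueError("expression must leave exactly one value on the stack")
    return stack[0]

print(eval_rpn("x 16 240 clip", x=250))           # 240.0
print(eval_rpn("x 128 < x y ?", x=100, y=200))    # 100.0
print(eval_rpn("x y + z + 3 /", x=10, y=20, z=30))  # 20.0
```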

Bit-depth aware constants

ymin, ymax (ymin_a .. ymin_z for individual clips) - the usual luma limits (16..235 or scaled equivalents)
cmin, cmax (cmin_a .. cmin_z) - chroma limits (16..240 or scaled equivalents)
range_half (range_half_a .. range_half_z) - half of the range, (128 or scaled equivalents)
range_size, range_half, range_max (range_size_a .. range_size_z, etc.)
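
The "scaled equivalents" for integer formats can be sketched in Python as below, assuming that the 8-bit values are scaled by a bit shift (so range_half is 128 for 8 bits and 512 for 10 bits). 32-bit float clips, range_size and range_max are not modelled, and the function name is invented for the example.

```python
def bit_depth_constants(bits):
    """Limited-range constants for an integer format of 8..16 bits."""
    shift = bits - 8
    return {
        "ymin": 16 << shift,
        "ymax": 235 << shift,
        "cmin": 16 << shift,
        "cmax": 240 << shift,
        "range_half": 128 << shift,
    }

print(bit_depth_constants(8))
print(bit_depth_constants(10))
```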

Keywords for modifying base bit depth

i8, i10, i12, i14, i16, f32 (used with scaleb and scalef)

Spatial input variables in expr syntax

sx, sy (absolute x and y coordinates, 0 to width-1 and 0 to height-1)

sxr, syr (relative x and y coordinates, from 0 to 1.0)
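
As an illustration, the four spatial variables for a given pixel could be derived as in the Python sketch below. It assumes that the relative coordinates are the absolute ones divided by width-1 and height-1, so that the last column and row give exactly 1.0; the function name spatial_vars is hypothetical.

```python
def spatial_vars(px, py, width, height):
    """Return (sx, sy, sxr, syr) for the pixel at column px, row py."""
    # Assumed normalisation: the last column/row maps to exactly 1.0.
    sxr = px / (width - 1) if width > 1 else 0.0
    syr = py / (height - 1) if height > 1 else 0.0
    return float(px), float(py), sxr, syr

print(spatial_vars(0, 0, 960, 640))
print(spatial_vars(959, 639, 960, 640))
```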

Internal variables

Uppercase A to Z for storing and loading intermediate results within the expression

Store: A@ .. Z@

Store and pop from stack: A^ .. Z^

Use: A..Z

Example: "x y - A^ x y 0.5 + + B^ A B / C@ x +"
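
To see how the store operators behave, the Python sketch below traces the example for one pixel. It models only the four basic math operators plus the variable tokens, keeps the value on the stack for @ and pops it for ^, as described above. The name eval_with_vars is made up for the illustration.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}

def eval_with_vars(expr, clips):
    """Evaluate an RPN expression with A..Z variables; returns (result, variables)."""
    stack, variables = [], {}
    for tok in expr.split():
        if tok in OPS:
            b = stack.pop()
            a = stack.pop()
            stack.append(OPS[tok](a, b))
        elif len(tok) == 2 and tok[0].isupper() and tok[1] == "@":
            variables[tok[0]] = stack[-1]    # store, value stays on the stack
        elif len(tok) == 2 and tok[0].isupper() and tok[1] == "^":
            variables[tok[0]] = stack.pop()  # store and pop
        elif len(tok) == 1 and tok.isupper():
            stack.append(variables[tok])     # load a stored value
        elif tok in clips:
            stack.append(float(clips[tok]))
        else:
            stack.append(float(tok))
    return stack[-1], variables

result, variables = eval_with_vars("x y - A^ x y 0.5 + + B^ A B / C@ x +", {"x": 10, "y": 4})
print(variables["A"], variables["B"])  # 6.0 14.5
print(result)
```

With x=10 and y=4, A holds x-y, B holds x+y+0.5, C holds A/B, and the final result is C+x.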

frameno: the current frame number. 0 <= frameno < clip_frame_count

time: the relative time position, calculated as time = frameno/clip_frame_count. 0 <= time < 1

width, height: clip width and clip height

Pixel addressing

Indexed, addressable source clip pixels by relative x,y positions.

Syntax: x[a,b] where

'x': source clip letter a..z

'a': horizontal shift. -width < a < width

'b': vertical shift. -height < b < height

'a' and 'b' should be constant. e.g.: "x[-1,-1] x[-1,0] x[-1,1] y[0,-10] + + + 4 /"

When a pixel would come from off-screen, it is cloned from the edge.

The optimized version of indexed pixel access requires SSSE3, and no AVX2 version is available. Without SSSE3, the whole expression falls back to C.
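
The edge handling can be pictured with the small Python sketch below, where a plane is a list of rows and off-screen coordinates are clamped to the nearest edge pixel. It is an illustrative model only; the function name pixel_at is hypothetical.

```python
def pixel_at(plane, px, py, dx, dy):
    """Value of x[dx,dy] seen from pixel (px, py), with edge pixels repeated."""
    height = len(plane)
    width = len(plane[0])
    cx = min(max(px + dx, 0), width - 1)   # horizontal shift, clamped
    cy = min(max(py + dy, 0), height - 1)  # vertical shift, clamped
    return plane[cy][cx]

plane = [
    [10, 20, 30],
    [40, 50, 60],
    [70, 80, 90],
]
print(pixel_at(plane, 1, 1, -1, -1))  # 10, the upper-left neighbour of the centre
print(pixel_at(plane, 0, 0, -1, -1))  # 10, off-screen so the corner is repeated
print(pixel_at(plane, 2, 1, 1, 0))    # 60, off-screen to the right
```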

Auto-scale inputs with "scale_inputs"

placeholder

Compared to MaskTools

Compared to MaskTools2 version 2.2.12, Expr has functionality similar to mt_lut, mt_lutxy, mt_lutxyz, mt_lutxyza and mt_lutspa.

MaskTools2, however, is very slow for 10+ bit clips, where a LUT (lookup table) cannot be used: the expression is evaluated/interpreted at runtime for each pixel.

The JIT compiler in Expr (adapted from VapourSynth) turns the expression into assembly code at runtime, which is much faster and basically bit depth independent.

In Expr:

Up to 26 clips are allowed (x,y,z,a,b,...w). Masktools handles only up to 4 clips with its mt_lut, mt_lutxy, mt_lutxyz, mt_lutxyza
Clips with different bit depths are allowed
Works with 32 bit floats instead of 64 bit double internally
Fewer functions (e.g. no bit shifts)
No float clamping and float-to-8bit-and-back load/store option
Logical 'false' is 0 instead of -1
The ymin, ymax, etc built-in constants can have a _X suffix, where X is the corresponding clip designator letter. E.g. cmax_z, range_half_x
mt_lutspa-like functionality is available through "sx", "sy", "sxr", "syr" internal predefined variables
No y= u= v= parameters with negative values for filling plane with constant value, constant expressions are changed into optimized "fill" mode

Examples

Average three clips:

 c = Expr(clip1, clip2, clip3, "x y + z + 3 /")

Using spatial feature:

 c = Expr(clip1, clip2, clip3, "sxr syr 1 sxr - 1 syr - * * * 4096 scaleb *", "", "")

Mandelbrot zoomer (original code and idea from here: https://forum.doom9.org/showthread.php?p=1738391#post1738391 )

a="X dup * Y dup * - A + T^ X Y 2 * * B + 2 min Y^ T 2 min X^ "
b=a+a
c=b+b
blankclip(width=960,height=640,length=1600,pixel_type="YUV420P8")
Expr("sxr 3 * 2 - -1.2947627 - 1.01 frameno ^ / -1.2947627 + A@ X^ syr 2 * 1 - 0.4399695 "
\ + "- 1.01 frameno ^ / 0.4399695 + B@ Y^ "+c+c+c+c+c+b+a+"X dup * Y dup * + 4 < 0 255 ?",
\ "128", "128")

For other ideas of spatial variables, see MaskTools2:mt_lutspa

Changes

r2724 (20180702)	new three-operand function: clip; new parameter "clamp_float"; new parameter "scale_inputs"
r2574 (20171219)	new: indexable source clip pixels by relative x,y positions, like x[-1,1]; new functions: sin cos tan asin acos atan; new operator: % (modulo); new: variables: uppercase letters A..Z for storing and reusing temporary results of frequently used computations; new: predefined expr variables 'frameno', 'time', 'width', 'height'; fix: jitasm code generation in specific circumstances
r2544 (20171115)	optimization; fix scalef
r2542 (20171114)	first added
