Nnedi3 resize16

Abstract
Author mawen1250
Version v3.3
Download nnedi3_resize16_v3.3.avsi
Category Resizers
License
Discussion NMM-HD Thread - [Chinese]

Description

nnedi3_resize16 is an advanced script for image resizing and colorspace conversion.

Requirements

Required Plugins

The latest versions of the following plugins are recommended unless stated otherwise.

Optional script:


Syntax and Parameters

nnedi3_resize16_v3.3.avsi

nnedi3_resize16 (clip input, int "target_width", int "target_height", float "src_left", float "src_top", float "src_width", float "src_height",
\ string "kernel_d", string "kernel_u", float "f_d", float "f_u", int "taps",
\ float "a1", float "a2", float "a3", bool "invks_d", bool "invks_u", int "invkstaps", bool "noring",
\ int "nsize", int "nns", int "qual", int "etype", int "pscrn", int "threads",
\ float "ratiothr", bool "mixed", float "thr", float "elast", float "sharp",
\ string "output", bool "tv_range", string "cplace", string "matrix", string "curve", float "gcor",
\ int "Y", int "U", int "V", bool "lsb_in", bool "lsb", int "dither")


Input clip and resizing parameters

 input
 clip  input =
Input clip to be processed.


 target_width
 int  target_width =
Target width; default value is the width of the input clip.


 target_height
 int  target_height =
Target height; default value is the height of the input clip.


 src_left
 float  src_left = 0.0
 src_top
 float  src_top = 0.0
Coordinates of the top-left corner of the picture sub-area used as the source for resizing. They can be fractional. If negative, the picture is extended by replicating the edge pixels (the leftmost column for src_left, the topmost row for src_top).
 src_width
 float  src_width = width(input)
 src_height
 float  src_height = height(input)
Size in pixels of the sub-area to resize. They can be fractional. If 0, the area has the same size as the source clip. If negative, they define coordinates relative to the bottom-right corner, in a Crop-like manner.
Default value is the width and height of the input clip.


  • Just like AviSynth's resizers, you can use an expanded syntax that crops before resizing. The same operations are performed as if you cropped just before resizing, though there can be a slight beneficial speed difference.
    Note that the edge semantics are slightly different: cropping gives a hard, absolute boundary, whereas with resizer cropping the filter lobes can extend into the cropped region (but never beyond the physical edge of the image).
    Use Crop to remove hard borders or any other unwanted noise; using the resizer cropping may propagate that noise into the adjacent output pixels.
    Use the resizer cropping to maintain accurate edge rendering when cropping out part of a complete image (see the sketch below).
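
A minimal sketch of the two approaches, assuming a 1920x1080 source (file name and crop values are only illustrative):

# Hard crop first, then resize; nothing from the removed borders can bleed into the result:
AviSource("source.avi")
Crop(8, 8, -8, -8)
nnedi3_resize16(1280, 720)

# Or let the resizer do the cropping; the filter lobes may still sample
# pixels from inside the cropped region, preserving edge accuracy:
# AviSource("source.avi")
# nnedi3_resize16(1280, 720, src_left=8, src_top=8, src_width=-8, src_height=-8)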


Scaling Ratio Calculation

 ratiothr
 float  ratiothr = 1.125
  • When the scale ratio is larger than ratiothr, the nnedi3+Dither_resize16 upscaling method is used instead of pure Dither_resize16.
  • When the horizontal/vertical scale ratio is > "ratiothr", that direction is treated as upscaling.
  • When the horizontal/vertical scale ratio is <= "ratiothr", that direction is treated as downscaling (see the worked example below).
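
For example, with a 1280x720 input and the default ratiothr=1.125: nnedi3_resize16(1920, 1080) gives a scale ratio of 1920/1280 = 1080/720 = 1.5 in both directions, which exceeds the threshold, so the nnedi3+Dither_resize16 path is taken; nnedi3_resize16(1440, 810) gives a ratio of exactly 1.125, which does not exceed it, so pure Dither_resize16 is used.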


Parameters for merging edge and flat upscaled clip

 mixed
 bool  mixed = true
nnedi3_resize16 uses nnedi3+Dither_resize16 for edge upscaling. This parameter defines whether to combine nnedi3+Dither_resize16 (edge areas) and Dither_resize16 (flat areas) when upscaling, which achieves a higher-precision upscaling result (mainly in flat areas).


 thr
 float  thr = 1.0
The same with "thr" in Dither_limit_dif16, valid value range is (0, 10.0].
Threshold between reference data and filtered data.


 elast
 float  elast = 1.5
The same with "elast" in Dither_limit_dif16, valid value range is [1, 10.0].
To avoid artifacts, the threshold has some kind of elasticity. Value differences falling over this threshold are gradually attenuated, up to thr * elast. > 1.


  • PDiff: pixel value difference between the flat clip and the edge clip (edge clip as reference)
  • ODiff: pixel value difference between the merged clip and the edge clip (edge clip as reference)
  • PDiff, thr and elast are used to calculate ODiff:
  • ODiff = PDiff when [PDiff <= thr]
  • ODiff transitions gradually from thr down to 0 when [thr <= PDiff <= thr * elast]
  • for elast > 2.0, ODiff reaches its maximum when [PDiff == thr * elast / 2]
  • ODiff = 0 when [PDiff >= thr * elast]
  • A larger "thr" results in more pixels being taken from the flat-area upscaled clip (Dither_resize16)
  • A larger "thr" results in fewer pixels being taken from the edge-area upscaled clip (nnedi3+Dither_resize16)
  • A larger "elast" results in more pixels being blended between the edge- and flat-area upscaled clips, for smoother merging (see the worked example below)


Parameters for nnedi3

 nsize
 int  nsize = 0
Sets the size of the local neighborhood around each pixel that is used by the predictor neural network.
Possible settings (x_diameter x y_diameter):
  • 0 - 8x6
  • 1 - 16x6
  • 2 - 32x6
  • 3 - 48x6
  • 4 - 8x4
  • 5 - 16x4
  • 6 - 32x4
For image enlargement it is recommended to use 0 or 4. Larger y_diameter settings will result in sharper output.
For deinterlacing larger x_diameter settings will allow connecting lines of smaller slope. However, what setting to use really depends on the amount of aliasing (lost information) in the source.
If the source was heavily low-pass filtered before interlacing then aliasing will be low and a large x_diameter setting won't be needed, and vice versa.


 nns
 int  nns = 3
Sets the number of neurons in the predictor neural network. Possible settings are 0, 1, 2, 3, and 4. 0 is fastest. 4 is slowest, but should give the best quality.
This is a quality vs speed option; however, differences are usually small. The difference in speed will become larger as 'qual' is increased.
  • 0 - 16
  • 1 - 32
  • 2 - 64
  • 3 - 128
  • 4 - 256


 qual
 int  qual = 1
Controls the number of different neural network predictions that are blended together to compute the final output value.
Each neural network was trained on a different set of training data. Blending the results of these different networks improves generalization to unseen data.
Possible values are 1 or 2. Essentially this is a quality vs speed option. Larger values will result in more processing time, but should give better results.
However, the difference is usually pretty small. I would recommend using qual>1 for things like single image enlargement.


 etype
 int  etype = 0
Controls which set of weights to use in the predictor nn. Possible settings:
  • 0 - weights trained to minimize absolute error
  • 1 - weights trained to minimize squared error


 pscrn
 int  pscrn = 2
Controls whether or not the prescreener neural network is used to decide which pixels should be processed by the predictor neural network and which can be handled by simple cubic interpolation.
The prescreener is trained to know whether cubic interpolation will be sufficient for a pixel or whether it should be predicted by the predictor nn. The computational complexity of the prescreener nn is much less than that of the predictor nn.
Since most pixels can be handled by cubic interpolation, using the prescreener generally results in much faster processing. The prescreener is pretty accurate, so the difference between using it and not using it is almost always unnoticeable.
Version 0.9.3 adds a new, faster prescreener with three selectable 'levels', which trade off the number of pixels detected as only requiring cubic interpolation versus incurred error.
Therefore, pscrn is now an integer with possible values of 0, 1, 2, 3, and 4.
  • 0 - no prescreening (same as false in prior versions)
  • 1 - original prescreener (same as true in prior versions)
  • 2 - new prescreener level 0
  • 3 - new prescreener level 1
  • 4 - new prescreener level 2
Higher levels for the new prescreener result in cubic interpolation being used on fewer pixels (so are slower, but incur less error). However, the difference is pretty much unnoticeable.
Level 2 is closest to the original prescreener in terms of incurred error, but is much faster.


 threads
 int  threads = 0
Controls how many threads will be used for processing. If set to 0, threads will be set equal to the number of detected processors.
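
A hedged illustration of how these nnedi3 settings are typically combined for a slow, high-quality enlargement (the values below are only an example, not a recommendation from the script author):

AviSource("source.avi")
# 2x upscale with more neurons, blended predictions, and the original prescreener:
nnedi3_resize16(2560, 1440, nns=4, qual=2, pscrn=1, threads=4)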


Parameters for Dither_resize16

 kernel_d
 string  kernel_d = "Spline36Resize"
 kernel_u
 string  kernel_u = "Spline64Resize"
"kernelh","kernelv" of Dither_resize16; kernel_d is used in downscaling and kernel_u is used in upscaling.
Kernel used by the resizer. Possible values are:
"point" Nearest neighbor interpolation. Same as PointResize().
"rect" or "box" Box filter.
"linear" or "bilinear" Bilinear interpolation. Same as BilinearResize().
"cubic" or "bicubic" Bicubic interpolation. Same as BicubicResize(). The b and c variables are mapped on a1 and a2 and are both set to 1/3 by default.
"lanczos" Sinc function windowed by the central lobe of a sinc. Use taps to specify its impulse length. Same as LanczosResize().
"blackman" Blackman-Harris windowed sinc. Use taps to control its length. Same as BlackmanResize().
"blackmanminlobe" Another kind of Blackman windowed sinc, with a bit less ringing. Use taps for you know what.
"spline16" Cubic spline based kernel, 4 sample points. Same as Spline16Resize().
"spline36" Spline, 6 sample points. Same as Spline36Resize().
"spline64" Spline, 8 sample points. Same as Spline64Resize().
"spline" Generic splines, number of sample points is twice the taps parameter, so you can use taps = 6 to get a Spline144Resize() equivalent.
"gauss" or "gaussian" Gaussian kernel. The p parameter is mapped on a1 and controls the curve width. The higher p, the sharper. It is set to 30 by default. This resizer is the same as GaussResize(), but taps offers a control on the filter impulse length. For low p values (soft and blurry), it’s better to increase the number of taps to avoid truncating the Gaussian curve too early and creating artifacts.
"sinc" Truncated sinc function. Use taps to control its length. Same as SincResize().
"impulse" Offers the possibility to create your own kernel (useful for convolutions). Add your coefficients in the string after “impulse”, separated with spaces (ex: "impulse 1 2 1"). The number of coefficients must be odd. The curve is linearly interpolated between the provided points. You can oversample the impulse by setting kovrspl to a value > 1.


 f_d
 float  f_d = 1.0
 f_u
 float  f_u = 1.0
"fh","fv" of Dither_resize16; f_d is for downscaling, f_u is for upscaling.
Horizontal and vertical frequency factors, also known as inverse kernel support. They are multipliers on the theoretical kernel cutoff frequency in both directions.
Values below 1.0 spatially expand the kernel and blur the picture. Values over 1.0 shrink the kernel and let higher frequencies pass. The result will look sharper but more aliased.
The multiplier is applied after the kernel scaling in case of downsizing. Negative values force the processing, even if the size in that direction doesn't change. The filter will use the absolute parameter value.


 taps
 int  taps = 4
"taps" of Dither_resize16.
Some kernels have a variable number of sample points, given by this parameter. Actually this counts half the number of lobes (or equivalent); in case of downscaling, the actual number of sample points may be greater than the specified value. Range: 1–128


 a1
 float  a1 =
 a2
 float  a2 =
 a3
 float  a3 =
Specific parameters, depending on the selected kernel.


 invks_d
 bool  invks_d = false
 invks_u
 bool  invks_u = false
"invksh","invksv" of Dither_resize16; invks_d is used in downscaling, invks_u is used in upscaling.
Activates the kernel inversion mode for the specified direction. Inverting the kernel makes it possible to "undo" a previous upsizing by compensating for the loss in high frequencies, giving a sharper and more accurate output than classic kernels, closer to the original. This is particularly useful for clips upscaled with a bilinear kernel. All the kernel-related parameters specify the kernel to undo. The target resolution must be as close as possible to the initial resolution. The kernel inversion is mainly intended to downsize an upscaled picture. Using it for upsizing will not restore details but will give a slightly sharper look, at the cost of a bit of aliasing and ringing. This mode is somewhat equivalent to the debilinear plug-in but works on a different principle.
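
A minimal sketch of the use case described above, assuming a clip that was previously upscaled from 1280x720 with a bilinear kernel (resolutions are illustrative):

# Undo the earlier bilinear upscale by inverting that kernel on the downscale:
nnedi3_resize16(1280, 720, kernel_d="bilinear", invks_d=true)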


 invkstaps
 int  invkstaps = 5
In kernel inversion mode (invks_d/invks_u=true), this parameter sets the number of taps for the inverted kernel. Use it as a tradeoff between softness and ringing. Range: 1–128


 noring
 bool  noring = false
If true, the non-ringing algorithm of Dither_resize16 is used for flat-area scaling.
It doesn't make much sense for nnedi3_resize16 (which uses nnedi3 for upscaling), and it may produce blurring and aliasing when downscaling. Don't set it to true unless you know what you are doing.


Post-Process

 sharp
 float  sharp = 0
Strength of the Contra-Sharpen mod, for sharper edges. 0 means no sharpening; a common value is about 100.
Sharpening only takes effect when the horizontal or vertical scale ratio is greater than ratiothr, i.e. when nnedi3 is used for upscaling (see the example below).
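
For instance (an illustrative value, assuming a 1280x720 source so that the scale ratio exceeds ratiothr and nnedi3 is used):

nnedi3_resize16(2560, 1440, sharp=100)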


Input / Output

 Y
 int  Y = 3
 U
 int  U = 3
 V
 int  V = 3
Choose what planes to process; works just like MaskTools2.
  • 2 : copy from input clip
  • 3 : process


 lsb_in
 bool  lsb_in = false
Whether the input clip is 16-bit stacked.


 lsb
 bool  lsb = false
Whether the output clip is 16-bit stacked, and whether the processing precision is 16-bit.
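
A hedged sketch of a full 16-bit (stack16) pipeline, assuming the Dither tools functions Dither_convert_8_to_16() and DitherPost() are available:

AviSource("source.avi")
Dither_convert_8_to_16()                            # 8-bit -> 16-bit stacked
nnedi3_resize16(1920, 1080, lsb_in=true, lsb=true)  # process and output in stack16
DitherPost()                                        # back to 8-bit with dithering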


 tv_range
 bool  tv_range = true
Whether the input clip is TV range (16-235) or PC range (0-255).


 dither
 int  dither =
Dither mode for 16-bit to 8-bit conversion. If tv_range=true, it defaults to 6; if false, it defaults to 50.
Dithering method:
−1 no dither, round to the closest value
0 8-bit ordered dither + noise.
6 Serpentine Floyd-Steinberg error diffusion + noise. Well-balanced algorithm.
7 Stucki error diffusion + noise. Looks “sharp” and preserves light edges and details well.
8 Atkinson error diffusion + noise. Generates distinct patterns but keeps the flat areas clean.
Modes 1 to 5 have no real interest over mode 0 and can be considered deprecated.


 output
 string  output =
Output format. Possible values are:
"Y8" Regular Y8 colorspace. Parameter "lsb" works on this output mode.
"YV12" Regular YV12 colorspace. Parameter "lsb" works on this output mode.
"YV16" Regular YV16 colorspace. Parameter "lsb" works on this output mode.
"YV24" Regular YV24 colorspace. Parameter "lsb" works on this output mode.
"RGB24" Regular RGB24 colorspace.
"RGB32" Regular RGB32 colorspace.
"RGB48YV12" 48-bit RGB conveyed on YV12. Use it for raw video export only. Not suitable for display or further processing (it will look like garbage).
"RGB48Y" 48-bit RGB. The components R, G and B are conveyed on three YV12 or Y8 (if supported) stack16 clips interleaved on a frame basis.
If output is not defined it will default to the colorspace of the input clip.


 cplace
 string  cplace = "MPEG2"
Placement of the chroma subsamples. Can be one of these strings:
"MPEG1" 4:2:0 subsampling used in MPEG-1. Chroma samples are located on the center of each group of 4 pixels.
"MPEG2" Subsampling used in MPEG-2 4:2:x. Chroma samples are located on the left pixel column of the group.
 matrix
 string  matrix =
The matrix used to convert the YUV pixels to computer RGB. Possible values are:
"601" ITU-R BT.601 / ITU-R BT.470-2 / SMPTE 170M. For Standard Definition content.
"709" ITU-R BT.709. For High Definition content.
"240" SMPTE 240M
"FCC" FCC
"YCgCo" YCgCo
When the parameter is not defined, "601" and "709" are automatically selected depending on the clip definition. If either the width is greater than 1024 or the height is greater than 576 then matrix defaults to "709", if equal to or less than, it defaults to "601".
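
For example (a hedged illustration of the rule above): a 960x540 clip is at or below the 1024x576 threshold, so matrix would default to "601"; if the clip is actually BT.709 material, specify the matrix explicitly:

nnedi3_resize16(output="RGB32", matrix="709")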


 curve
 string  curve = "linear"
Type of gamma mapping (transfer characteristic) for gamma-aware resizing (only takes effect for the parts processed with Dither_resize16).
"709" ITU-R BT.709 transfer curve for digital video
"601" ITU-R BT.601 transfer curve, same as "709"
"170" SMPTE 170M, same as "709"
"240" SMPTE 240M (1987)
"srgb" sRGB curve
"2020" ITU-R BT.2020 transfer curve, for 12-bit content. For sources of lower bitdepth, use the "709" curve.
"linear" linear curve without gamma-aware processing


 gcor
 float  gcor = 1.0
Gamma correction, applied on the linear part.
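
A minimal sketch of a gamma-aware downscale using the two parameters above (values are illustrative):

# Downscale in linear light, assuming BT.709-encoded material,
# with a mild gamma correction applied in the linear domain:
nnedi3_resize16(1280, 720, curve="709", gcor=0.9)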


Examples

nnedi3_resize16 with default values:

AviSource("Blah.avi")
nnedi3_resize16()

Convert a 4:2:0 JPEG to RGB. Remember that most JPEGs use full range levels and the BT.601 color matrix. MPEG1 chroma placement is very common among 4:2:0 JPEGs while 4:2:2 JPEGs use the MPEG2 chroma placement.

JpegSource("420.jpg")
nnedi3_resize16(output="RGB32", tv_range=false, cplace="MPEG1", matrix="601")


External Links




Back to External Filters
