Revision as of 21:27, 27 July 2021 by Reel.Deal
|Category||Duplicate frame detectors|
DupStep is a duplicate frame detector and decimator for AviSynth+.
- AviSynth+ (x64 only)
- Supported color formats: Y8, YV12, YV16, YV24 ... (8 to 16 bit planar YUV formats)
- *** vcredist_x64.exe is required for DupStep
- An OpenCL platform & device (e.g. CPU/GPU) supporting OpenCL 1.2 or later
DupStep( \
    clip, float thresh = -1.0, int ifmcm = 0, int metric = 2, int blksize = -1, int planes = 2, \
    int pco = -1, float Cweight = -1.0, bool ptp = true, string stats = "dupstep.dsd", \
    string times = "", float toff = 0.0, int show = 0, int oclpi = -2, int ocldi = 0)

DS_dumpocl()                             # Useful for adjusting 'oclpi' / 'ocldi' parameters, see description below
DS_cachefp(clip, stats = "dupstep.dsd")  # See 'ptp' and 'stats' below


Parameters
------------------

thresh (float):
    default value = -1.0 (auto)
    valid range   = any

    This is the threshold DupStep uses to decide if a frame is a duplicate or not. If thresh is set to zero, only frames that are exact copies of one another will be detected as duplicates. If thresh is set to a positive value, DupStep will use the value you specify as the threshold.

    If thresh is negative, DupStep will instead use an automatic default value, which is then scaled (multiplied) by the absolute value of 'thresh'. For example, setting 'thresh' to a value of -2.0 or -0.5 will use twice or half of the automatic default value respectively.

    Moving the value closer to zero, whether it is positive or negative, makes it less likely for frames to be detected as duplicates. Moving it further from zero makes DupStep more tolerant of noise, but also makes it more likely a "real" frame will be accidentally detected as a duplicate. "Clean" sources (i.e. those with little noise due to lossy encoding) may benefit from setting this value closer to zero. Noisy or low quality sources may want to set it further from zero. For lossless sources, set 'thresh' to zero.

    Lastly, note that the automatic values are based on a number of other parameters, such as 'metric', 'planes', 'pco', and 'Cweight', as well as the bit-depth of the clip. Changing those parameters while using an automatic (i.e. negative) threshold will automatically adjust the threshold accordingly.
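The way 'thresh' is resolved can be sketched in a few lines of Python. This is only an illustration of the rules described above, not DupStep's actual code: `auto_default` is a hypothetical stand-in for the internal automatic value (which is not documented here), and the "delta does not exceed the threshold" comparison in `is_duplicate` is an assumption.

```python
def effective_threshold(thresh: float, auto_default: float) -> float:
    """Resolve the 'thresh' parameter to the value used for comparison.

    auto_default is a hypothetical placeholder for DupStep's internal
    automatic value, which depends on 'metric', 'planes', 'pco',
    'Cweight' and the clip's bit depth.
    """
    if thresh < 0:
        # Negative: scale the automatic default by |thresh|.
        return abs(thresh) * auto_default
    # Zero or positive: used as given (zero matches exact copies only).
    return thresh


def is_duplicate(delta: float, threshold: float) -> bool:
    # Assumption: a frame is a duplicate when its delta does not
    # exceed the resolved threshold.
    return delta <= threshold
```

With a made-up automatic default of 10.0, `effective_threshold(-2.0, 10.0)` gives 20.0 and `effective_threshold(-0.5, 10.0)` gives 5.0, matching the "twice or half" wording above.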

ifmcm (int):
    default value = 0
    valid range   = 0 to 2 (inclusive)

    This variable's name stands for "Inter Frame Metric Comparison Mode" and it allows you to set how metrics are used to compare frames. The various modes are described in the chart below:

     Value | Type     | Description
    -------+----------+------------------------------------------
       0   | Direct   | Direct frame <-> frame metric comparison
       1   | Indirect | Summed single-frame deltas
       2   | Indirect | Individual single-frame deltas

    Setting this to 0 is the most accurate mode, and has only a slight performance penalty when changing parameters compared to later modes. As such, it's the default and recommended mode. In this mode, cache depth is dynamic and increases indefinitely until a non-duplicate frame is found.

    The next mode, 1, uses metrics only of adjacent frames. These metrics are summed until they exceed the threshold. It's less accurate than the direct mode, but once a cache is created no further metrics need to be generated regardless of parameter changes (with a few exceptions). It's more conservative than the last mode, but should be more useful when you have low temporal noise or scenes with very slow changes (e.g. a static image fading to/from black).

    Finally, mode 2 is identical to mode 1, except that metrics are not summed and an adjacent frame delta must exceed the threshold in order to be detected as a non-duplicate frame. This can lead to long segments with small changes being detected as duplicates and disappearing entirely, so it should be used with caution.

metric (int):
    default value = 2 (SSBD)
    valid range   = 0 to 4 (inclusive)

    This sets the metric DupStep uses to calculate a frame delta (the difference between two frames).
    The following table shows which method is used for a given metric value:

     Value | Metric | Type  | Description
    -------+--------+-------+-----------------------------------
       0   | SSD    | Frame | Sum of Squared Differences
       1   | SAD    | Frame | Sum of Absolute Differences
       2   | SSBD   | Block | Sum of Squared Block Differences
       3   | SABD   | Block | Sum of Absolute Block Differences
       4   | PAD    | Pixel | Peak Absolute Difference

    The values 0 and 1 are "whole frame" metrics, as their values depend on changes across the entire frame. Metrics 2 and 3 are "block level" metrics that are based on the block that changed the most between two frames. Finally, metric 4 is a "pixel" metric which looks only at the single pixel that changed the most between the two frames.

    Metrics 2 and 4 are, based on my testing, the best at discerning duplicate frames, as they are sensitive to small changes that only affect part of the entire frame. Metric 2 (SSBD) is more sensitive to low frequency temporal changes (e.g. a static image fading in/out), but as such it's also more sensitive to low frequency temporal noise. Metric 4 (PAD) should prove more useful if this sensitivity to low frequency temporal changes is not needed or desired.

blksize (int):
    default value = -1 (auto)
    valid range   = -1 to 3 (inclusive)

    Sets the block size used for block-based metrics. Smaller blocks are more sensitive to small changes, but also more sensitive to noise.

    If set to -1, an automatic block size will be chosen based on the resolution (specifically, the luma pixel count) of the clip. It uses the following formula (which is then clamped to the valid range):

        2 * 2 ^ floor(log2(pixels) / 2 - 7.54185309632968)

    If set to a non-negative number, instead the block size is set manually to a value between 2 and 16 according to the following formula:

        2 * 2 ^ blksize

    If the block size is changed after metrics have been generated, the metric cache will be declared invalid and will automatically be regenerated.
    See the 'stats' parameter for more information on the cache structure.

    When calculating metrics for video with subsampled chroma planes, the block dimensions (in pixels) are adjusted in order to cover an area equal to the luma blocks (per block). Also note that if the video height or width is not divisible by the block size, any blocks that would normally extend beyond the end of the frame will instead overlap other blocks in order to produce valid metrics for the edge of the frame.

planes (int):
    default value = 2 (chroma & luma)
    valid range   = 0 to 2 (inclusive)

    Sets which planes to process. Set to 0 for luma only, 1 for chroma only, and 2 for both luma and chroma. Note that the cache will always generate metrics for all planes regardless of what this is set to, which allows you to change this setting without having to regenerate the cache.

    Note that for mixed content (i.e. footage with some parts in color, and others in grayscale) this should be set to 2 in order to process both luma and chroma planes, but 'pco' should be adjusted. See the 'pco' parameter for details.

pco (int):
    default value = -1 (auto)
    valid range   = -1 to 2 (inclusive)

    The 'pco' parameter stands for "plane combining operation", and is ignored unless 'planes' is set to 2 (chroma & luma). It controls what mathematical operation is used to combine the metrics from the luma & chroma planes, following the table below:

     Value | Operation      | Example    | Chosen when set to automatic (-1)...
    -------+----------------+------------+------------------------------------------
       0   | Sum            | Y + UV     | For all 'metric' settings except PAD (4)
       1   | Sum of squares | Y^2 + UV^2 | Only when 'metric' is set to PAD (4)
       2   | Maximum        | max(Y, UV) | Never; recommended for mixed content

    This should be set to 2 whenever you have color content mixed with grayscale, so that chroma planes are not ignored, while simultaneously preventing grayscale content from being penalized.
    This should make setting an appropriate threshold easier, though adjusting the 'Cweight' parameter may be necessary for optimal results. See 'Cweight' documentation for details.

Cweight (float):
    default value = -1.0 (auto)
    valid range   = any non-zero value

    This sets a scaling (weighting) value to use for chroma metrics. It is intended to make chroma planes more meaningful when determining if a frame is a duplicate. Its value is ignored unless 'planes' is set to 2 (chroma & luma). If you wish to disable this functionality, set 'Cweight' to 1.0.

    If negative, an automatic default will be set depending on the 'metric' being used. It is set to 2.35 for absolute difference metrics (SAD/SABD/PAD) and 2.35^2 for squared metrics (SSD/SSBD) after which it is scaled (multiplied) by the absolute value of 'Cweight'.

    Increasing the chroma weighting (setting it further from 0) may prove useful for clips with relatively high noise in the luma plane but low noise in the chroma planes. Decreasing it (i.e. setting it closer to 0) may be helpful when the inverse is true. Additionally, you may want to change the 'pco' setting when adjusting this value; see that parameter for information.

    Finally, note that chroma weighting is applied before any mathematical operations performed by the 'pco' setting take place.

ptp (bool):
    default value = true (enabled)

    This variable's name is an acronym that stands for "picture type preference". When set to true and used along with an appropriate source filter, it allows later frames in a string of duplicates to be used as the reference frame if they are determined to be higher in priority, with the thought that higher priority is assigned to higher quality frames.

    To determine priority, the filter looks for the FFPICT_TYPE exported variable, which may be set depending on what source filter you are using. Currently, the only filter I am aware of that sets this variable is FFVideoSource().
    It then assigns a priority value to the frame according to the following table (the lower the number, the higher the priority):

     FFPICT_TYPE | Priority | Description
    -------------+----------+-----------------------------------------
     I / i       |    0     | Intra frame
     P / p       |    1     | Predicted frame
     B / b, S, ? |    2     | Bidirectional, Sprite, or unknown frame

    In theory, intra frames should be the highest quality frames, followed by 'P-frames', and finally the lowest quality frames should be 'B-frames' and everything else. If frame type cannot be determined, it is automatically assumed to be of the lowest quality and therefore assigned the same priority as a B-frame.

    If this variable is set to false or when using a source filter that does not set the FFPICT_TYPE variable, DupStep will always return the first frame of a string of duplicates. This should only be done if you are NOT using FFVideoSource().

    Note that if (spatio)temporal filtering happens between FFVideoSource() and DupStep(), there is the potential for incorrect FFPICT_TYPE results to be fed to DupStep(). To allow for metrics to be calculated after (spatio)temporal filtering, while still using valid data when 'ptp' is enabled, the DS_cachefp() filter should be used. To properly use this functionality, do the following steps in order:

    1. Use FFVideoSource(), followed by any Trim(), UnalignedSplice() / +, AlignedSplice() / ++, etc. commands. The frame count and order must not change between the end of this script and what is ultimately fed into DupStep().
    2. Append DS_cachefp() to the end of your script, ensuring that the 'stats' parameter does not refer to any existing cache file.
    3. Open the script. If the frame priority cache creation completed successfully, you should see an error message stating that the "Frame priority cache [was] created."
    4. Edit your script, removing the DS_cachefp() filter and inserting any filtering you desire prior to DupStep(), followed by the DupStep() filter itself.
    5. Ensure 'stats' is set appropriately in DupStep() so that it refers to the cache file created by DS_cachefp(). DupStep() will use the frame priority cache already generated while using the filtered video for metric calculation.

stats (string):
    default value = "dupstep.dsd"
    valid range   = any valid file name or null

    This sets the file name where DupStep will save its frame priority cache and its frame delta (metrics) cache, or alternatively where it will load the cache from if a previously written, valid cache file exists. If set to null, DupStep will not read or write the cache from/to disk, and instead regenerate it every time. For performance reasons as well as flexibility, I STRONGLY recommend you do not do so.

    This cache file is split into two parts, the first being the frame priority cache. It holds frame priority values for use when 'ptp=true' (see that parameter for details). This cache is deemed valid as long as its cache version matches that of the DupStep filter being used and its frame count matches that of the video fed into DupStep(). If determined to be invalid or missing, the entire cache file will need to be (re)created, i.e. it will be automatically (re)generated and (over)written.

    The second portion of the cache holds the frame deltas (AKA metrics) cache. It is considered usable as long as its pixel type matches the current video, its block size matches the current 'blksize' parameter (see 'blksize' for details), and the frame priority cache is valid. If those properties remain unchanged, a different clip can be used with the same cache, allowing for advanced filtering, e.g. using a filtered clip to create the cache, but then applying the decimation to an unfiltered clip. When doing this, ensure 'ifmcm' is not set to 0 in order to force all queries to be answered from the cache.

    If the metrics cache is invalid, but the frame priority cache is still valid, the priority will remain unaltered while only the metrics are (over)written.
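The cache validity rules above can be summarized as a small decision function. The following Python sketch is illustrative only: the `CacheHeader` fields and the returned action strings are hypothetical names chosen for this example, and the real on-disk layout of the cache file is not described here.

```python
from dataclasses import dataclass


@dataclass
class CacheHeader:
    # Hypothetical field names; the actual cache layout may differ.
    version: int
    frame_count: int
    pixel_type: str
    block_size: int


def cache_action(cache, filter_version, clip_frames, clip_pixel_type, block_size):
    """Return which part of the cache would need to be regenerated."""
    if cache is None:
        return "rebuild all"  # missing cache file
    priority_ok = (cache.version == filter_version
                   and cache.frame_count == clip_frames)
    if not priority_ok:
        # Invalid priority cache: the entire file is (re)created.
        return "rebuild all"
    metrics_ok = (cache.pixel_type == clip_pixel_type
                  and cache.block_size == block_size)
    if not metrics_ok:
        # Priorities are kept; only the metrics are (over)written.
        return "rebuild metrics"
    return "reuse"
```

For example, changing only the block size against an otherwise matching cache yields "rebuild metrics", while a different frame count yields "rebuild all".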

    Also, when using the DupStep filter in multiple scripts within a single directory, be sure to set a different 'stats' file for each script, otherwise they will overwrite one another when opened or use incorrect values.

times (string):
    default value = "" (null)
    valid range   = any valid file name or null

    This sets the file name to write VFR timestamps to. If set to a non-null string, Matroska timestamps will be written in the "v2" format. Any existing file will be overwritten. This file should be muxed in with mkvmerge using the "--timestamps" option for proper VFR playback.

toff (float):
    default value = 0.0 (no offset)
    valid range   = any

    This parameter sets an offset, in milliseconds, to apply as a global adjustment to all timestamps that are written to the 'times' file (if set; see above). It can be negative in order to shift the times earlier, or positive to delay them all.

show (int):
    default value = 0
    valid range   = 0 to 3 (inclusive)

    This parameter controls several settings with a single value by using a bit mask, similar to the way chmod can be used to set *nix file permissions. The least significant bit, 1, controls the display of textual information on top of the frame, useful for adjusting the 'thresh' parameter. The next highest bit, 2, controls whether or not DupStep decimates frames. When set, frames that would normally be decimated will instead be shown. This is probably most useful when combined with the lowest bit (sum them, i.e. set 'show=3') to display metric information for frames that would otherwise disappear.

oclpi (int):
    default value = -2 (prefer CPU)
    valid range   = -4 to system dependent maximum (inclusive)

    This sets the "OpenCL Platform Index" that the filter should use. DupStep is limited to using a single OpenCL platform and a single device on that platform (see 'ocldi' parameter) for memory throughput reasons, and this allows you to select which platform that should be.

    When set to a non-negative value, this specifies the index of the chosen platform. To see a list of platforms along with a list of associated devices, use the DS_dumpocl() function within an AviSynth+ script, and they will be listed using AviSynth+'s internal error reporting mechanism. The 'ocldi' parameter should also be set if you want to use a device beyond the first one listed for the chosen platform.

    If this is set to a negative value, instead the platform index will be chosen automatically according to the following table (note that CPUs/GPUs may not be found if they are lacking an OpenCL-capable driver):

     Value | Setting    | Description
    -------+------------+------------------------------------------------------
      -4   | GPU only   | Consider only GPUs, error if none found
      -3   | Prefer GPU | Look first for GPUs, fall back to CPUs if none found
      -2   | Prefer CPU | Look first for CPUs, fall back to GPUs if none found
      -1   | CPU only   | Consider only CPUs, error if none found

    Note that the device indices ('ocldi' parameter) listed by DS_dumpocl() may not be valid when using these automatic modes, so this parameter should always be set when raising the 'ocldi' parameter. Additionally, in order to use OpenCL devices that are neither CPUs nor GPUs, you must manually set 'oclpi' to the platform index for the device as shown by DS_dumpocl().

ocldi (int):
    default value = 0 (first device on chosen platform)
    valid range   = 0 to system dependent maximum (inclusive)

    This sets the "OpenCL Device Index" that DupStep should use for a given platform. When set to 0, the first device for the selected platform will be used. To get lists of device indices, use the DS_dumpocl() function within an AviSynth+ script. The filter can only use a single device; however, it can use any number of cores on that device. In addition, you should always set the appropriate platform index ('oclpi' variable) when adjusting this variable. Platform indices are listed above the associated devices by the DS_dumpocl() function.
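The automatic 'oclpi' modes (-4 to -1) amount to a search order over device types. The Python sketch below only mirrors the table in the 'oclpi' description; the function name and the boolean inputs are hypothetical, and raising an error when a "prefer" mode finds neither device type is an assumption.

```python
# Search order for each automatic 'oclpi' value, as listed in the table.
SEARCH_ORDER = {
    -4: ["GPU"],          # GPU only
    -3: ["GPU", "CPU"],   # Prefer GPU
    -2: ["CPU", "GPU"],   # Prefer CPU (default)
    -1: ["CPU"],          # CPU only
}


def pick_device_type(oclpi: int, has_cpu: bool, has_gpu: bool) -> str:
    """Return the device type an automatic 'oclpi' value would settle on."""
    if oclpi >= 0:
        raise ValueError("non-negative oclpi is a manual platform index")
    if oclpi not in SEARCH_ORDER:
        raise ValueError("oclpi out of range")
    available = {"CPU": has_cpu, "GPU": has_gpu}
    for kind in SEARCH_ORDER[oclpi]:
        if available[kind]:
            return kind
    # Assumption: nothing usable found is reported as an error.
    raise RuntimeError("no suitable OpenCL device found")
```

On a system where only a GPU has an OpenCL-capable driver, the default of -2 would fall back to the GPU, while -1 would fail.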
More conservative threshold (frames must be more similar to be detected as a duplicate):
    DupStep(thresh=-0.5)

More liberal threshold (frames can be less similar and still be detected as a duplicate):
    DupStep(thresh=-2.0)

Recommended for full color content:
    DupStep(planes=2)  # By default planes=2, so doesn't need to be explicitly set

Recommended for mixed content (some grayscale, some color):
    DupStep(planes=2, pco=2)  # Combine luma/chroma metrics using a max() function

Recommended for grayscale only content:
    Grayscale()        # Doesn't need to be here, but should be done somewhere in your script
    DupStep(planes=0)  # Ignores chroma planes

May prove useful for full color content with moderate/high noise only in the luma plane:
    DupStep(planes=1)      # Completely ignores luma; use with caution!
    DupStep(Cweight=-2.0)  # Instead doubles the chroma weighting; consider using 'pco=2' too

Useful for tuning the 'thresh' parameter (eventually; not useful in the current version):
    DupStep(show=3)

Equivalent to ExactDedup(firstpass=false, _keeplastframe=true) (assuming first pass completed):
    DupStep(thresh=0.0, planes=2, times="times.txt")
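To make the 'planes', 'pco', and 'Cweight' settings used in these examples more concrete, here is a minimal Python sketch of how a luma metric and a chroma metric might be combined according to the parameter descriptions. It is not DupStep's implementation: the function names are hypothetical, and treating the chroma planes as a single already-merged value `chroma` (the "UV" of the 'pco' table) is an assumption.

```python
SQUARED_METRICS = {0, 2}  # SSD, SSBD
PAD = 4


def resolve_cweight(cweight: float, metric: int) -> float:
    # Negative: automatic base (2.35, or 2.35^2 for squared metrics),
    # scaled by |Cweight|. Positive values are used as given.
    if cweight < 0:
        base = 2.35 ** 2 if metric in SQUARED_METRICS else 2.35
        return abs(cweight) * base
    return cweight


def resolve_pco(pco: int, metric: int) -> int:
    # Automatic (-1): sum of squares for PAD, plain sum otherwise.
    if pco == -1:
        return 1 if metric == PAD else 0
    return pco


def combine(luma, chroma, metric=2, planes=2, pco=-1, cweight=-1.0):
    if planes == 0:
        return luma      # luma only
    if planes == 1:
        return chroma    # chroma only; 'Cweight' and 'pco' are ignored
    # Chroma weighting is applied before the combining operation.
    uv = chroma * resolve_cweight(cweight, metric)
    op = resolve_pco(pco, metric)
    if op == 0:
        return luma + uv
    if op == 1:
        return luma ** 2 + uv ** 2
    return max(luma, uv)
```

With `pco=2`, a grayscale frame pair whose chroma metric is zero is judged on luma alone, which is why that setting is suggested for mixed content.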
0) This filter should only be used with source filters that feature frame-accurate seeking. Source filters that lack this feature may produce erroneous, non-deterministic results!

1) DupStep is not designed to be used within Animate(), ScriptClip(), or other runtime filters. It will almost certainly not do what you expect in them. Use within those functions at your own risk!

2) When removing exact duplicates only ('thresh=0.0'), all settings except for 'planes' will not affect the frames decimated. You may, however, find speed gains by adjusting 'blksize'; see note 3 below.

3) The automatic (and default) setting for the 'blksize' parameter prioritizes quality over speed. If you are not using block-based metrics (SSBD or SABD) you may want to adjust this parameter to whatever proves fastest for your system.

4) A Perl script, "dsstats.pl", is included which dumps the binary information within the cache to stdout in a text CSV format. Note that these are intermediate values and should not be used to set the 'thresh' parameter without additional calculation in almost all cases. Run "dsstats.pl -h" to see help regarding the usage of that script.

5) In DupStep v0.02 and earlier, other metrics were available or bore the same name but behaved differently. To get equivalent metrics in the current version, consult the chart below:

     Old Metric | New Settings
    ------------+------------------------------------------------------------------------------
     RSAD       | metric=1, pco=0; if old 'thresh' was positive, divide it by luma pixel count
     RSSD       | metric=0, pco=0; if old 'thresh' was positive, divide it by luma pixel count
     SAD        | metric=1, pco=0
     SSD        | metric=0, pco=0
     PAD        | metric=4, pco=2; if old 'thresh' was negative, multiply it by 1.25
     SSPD       | metric=4, pco=1, planes=2

   Note that the default settings for 'Cweight' have also changed from v0.02. While the old values could be calculated, I recommend against doing so and simply readjusting based on the new default.
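When experimenting with 'blksize' as suggested in note 3, it can help to know what the automatic setting would pick for a given resolution. The Python sketch below evaluates the two formulas given under the 'blksize' parameter. The function name is hypothetical, and clamping the automatic exponent to 0..3 (block sizes 2 to 16) is an assumption about what "clamped to the valid range" means.

```python
import math


def block_size(blksize: int, width: int, height: int) -> int:
    """Block size in luma pixels for a given 'blksize' setting."""
    if not -1 <= blksize <= 3:
        raise ValueError("blksize must be between -1 and 3")
    if blksize == -1:
        pixels = width * height  # luma pixel count
        exponent = math.floor(math.log2(pixels) / 2 - 7.54185309632968)
        # Assumption: the result is clamped so the block size stays in 2..16.
        exponent = max(0, min(3, exponent))
    else:
        exponent = blksize
    return 2 * 2 ** exponent
```

Under these assumptions the automatic setting gives a block size of 8 for a 1920x1080 clip and 16 for 3840x2160, so the manual equivalents would be 'blksize=2' and 'blksize=3' respectively.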
0) The OpenCL kernels will not compile for the Intel platform, either CPU or GPU, as that platform does not support atomic operations on 64-bit integers. A workaround will be added in the future, but AMD and nVidia platforms work now as long as they support OpenCL 1.2.

Finally, this filter is currently in an 'alpha' state. It's likely riddled with bugs, and may cause your computer to spontaneously burst into flames. Please report issues to the Doom9 thread (link below) or the GitHub issue tracker (via GitHub page listed below) so that they can be fixed in future releases.
v0.03
----------------
Speed: Cache depth is now dynamic and generates/caches only what is needed for the current parameters/video
Speed: Cache data from file is always used when available and only recalculated when necessary (exception: changing block size recalculates all metrics)
Speed: OpenCL kernel should now be able to take advantage of SIMD instructions for platforms that support them
Speed: Avoids unnecessary memory copying when executing OpenCL kernels on a CPU
Added: New function "DS_cachefp()" to cache only frame priorities, allowing 'ptp' to be used even when (spatio)temporal filtering is used prior to metric generation
Param: Removed the 'cdepth' parameter as cache depth is dynamic on a per-frame basis now
Param: 'ifmcm' modes 1 and 2 removed and replaced by former modes 3 & 4
Param: Default mode for 'ifmcm' set to 0 (direct)
Param: Removed "raw" metrics (RSAD/RSSD)
Param: New 'metric' settings for block-based metrics
Param: Reordered the 'metric' settings to put squared metrics at lower values than their absolute counterparts
Param: Default 'metric' now set to SSBD (2)
Param: Automatic (negative) 'Cweight' uses new values based on empirical data, and no longer considers chroma subsampling factor as a factor
Param: New parameter 'pco' to set the plane combination operation
Param: Added 'blksize' parameter to set the block size for block metrics
Param: Added new option to prefer CPU to 'oclpi'; subtract 1 from old options if < -1 to get the equivalent new option index for them
Param: Default for 'oclpi' is now set to prefer CPU (-2)
Param: Reordered the parameters
Fixed: Potential overflow when calculating chroma metrics has been fixed
Fixed: A potential race condition that could lead to incorrect results was resolved
Fixed: Corrected a typo in dsstats.pl help (-h)
Fixed: Multiple known issues from v0.02 resolved
Other: Metrics are now all stored as floating point values, pre-normalized to resolution where appropriate (previously they were post-normalized, which could cause problems when resolution changed between cache generation and usage)
Other: Pixel metrics (SSPD/PAD) are no longer clamped after being scaled by chroma weighting, permitting more aggressive chroma weighting
Other: Version bump for cache file; there is no compatibility with caches from older versions
Other: OpenCL device memory usage reduced as it only ever needs to hold the two compared frames
Other: DS_dumpocl() now displays OpenCL version supported (both for platforms and devices)
Other: Significant refactoring

v0.02
----------------
Speed: OpenCL is now used for metric calculation, allowing for multi-threading on a CPU or using a GPU instead
Speed: All frames read during cache generation are now cached on the OpenCL device so they are only requested once from AviSynth
Added: New function "DS_dumpocl()" to list OpenCL platforms and devices with index
Param: Added parameters 'oclpi' and 'ocldi'
Param: Default 'cdepth' changed from 2 to 5
Param: Maximum value of 'cdepth' changed from 10 to 250
Param: dsstats.pl now features additional parameters; run "dsstats.pl -h" for information
Fixed: SSD metrics could be incorrect (lower than they should be) for 16-bit video
Fixed: Cache depth promotion was not working in some cases (affected ifmcm modes 1 & 2)
Fixed: Frame type cache was leaking a small amount of memory
Fixed: Cache was delivering incorrect frame deltas when within a radius of 'cdepth' of the final video frame, sometimes reading outside of buffer memory (undefined values)
Fixed: Cache had incorrect frame types when cdepth > 1
Other: Documentation updated regarding when 'ptp' should be disabled, and detailed new concerns regarding device memory usage during cache generation
Other: dsstats.pl now validates cache version
Other: Minor refactoring

v0.01
----------------
Initial release
- GitHub - Source code repository.