DupStep

From Avisynth wiki
Revision as of 00:51, 26 June 2020 by Reel.Deal (Talk | contribs)

Abstract
  Author:     Orum
  Version:    v0.03
  Download:   DupStep_0.03.zip
  Category:   Duplicate frame detectors
  License:    GPLv3
  Discussion: Doom9 Forum


Description

DupStep is a duplicate frame detector and decimator for AviSynth+.


Requirements


  • vcredist_x64.exe (Microsoft Visual C++ Redistributable, x64)
  • An OpenCL platform & device (e.g. CPU/GPU) supporting OpenCL 1.2 or later


Syntax and Parameters

DupStep( \
  clip, float thresh = -1.0, int ifmcm = 0, int metric = 2, int blksize = -1, int planes = 2,  \
  int pco = -1, float Cweight = -1.0, bool ptp = true, string stats = "dupstep.dsd",           \
  string times = "", float toff = 0.0, int show = 0, int oclpi = -2, int ocldi = 0)

DS_dumpocl()   # Useful for adjusting 'oclpi' / 'ocldi' parameters, see description below
DS_cachefp(clip, string stats = "dupstep.dsd")   # See 'ptp' and 'stats' below


Parameters
------------------
thresh (float):
  default value = -1.0 (auto)
    valid range = any

  This is the threshold DupStep uses to decide if a frame is a duplicate or not.  If thresh is
set to zero, only frames that are exact copies of one another will be detected as duplicates.
If thresh is set to a positive value, DupStep will use the value you specify as the threshold.

  If thresh is negative, DupStep will instead use an automatic default value, which is then
scaled (multiplied) by the absolute value of 'thresh'.  For example, setting 'thresh' to a value
of -2.0 or -0.5 will use twice or half of the automatic default value respectively.

  Changing a value, positive or negative, to be closer to zero will make it less likely for
frames to be detected as duplicates.  Doing the opposite (further from zero) makes DupStep more
tolerant of noise, but also makes it more likely a "real" frame will be accidentally detected as
a duplicate.

  "Clean" sources (i.e. those with little noise due to lossy encoding) may benefit from setting
this value closer to zero.  Noisy or low quality sources may want to set it further from zero.
For lossless sources, set 'thresh' to zero.

  Lastly, note that the automatic values are based on a number of other parameters, such as
'metric', 'planes', 'pco', and 'Cweight', as well as the bit-depth of the clip.  Changing those
parameters while using an automatic (i.e. negative) threshold will automatically adjust the
threshold accordingly.
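
  The sign convention for 'thresh' can be summarized with a minimal Python sketch.  It is an
illustration only, not DupStep's code: the function name and the 'auto_default' argument are
hypothetical, and the real automatic default is computed internally from the parameters listed
above.

```python
def effective_threshold(thresh: float, auto_default: float) -> float:
    """Resolve 'thresh' to the value that frame deltas are compared against.

    auto_default is a hypothetical stand-in for DupStep's internal automatic value.
    """
    if thresh < 0:
        # Negative: scale the automatic default by the absolute value of 'thresh'.
        return auto_default * abs(thresh)
    # Zero (exact duplicates only) or positive (used as given).
    return thresh


auto = 10.0  # made-up number, for illustration only
print(effective_threshold(-2.0, auto))  # 20.0 (twice the automatic value)
print(effective_threshold(-0.5, auto))  # 5.0  (half the automatic value)
print(effective_threshold(3.0, auto))   # 3.0  (manual threshold, used as-is)
```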


ifmcm (int):
  default value = 0
    valid range = 0 to 2 (inclusive)

  This variable's name stands for "Inter Frame Metric Comparison Mode" and it allows you to set
how metrics are used to compare frames.  The various modes are described in the chart below:

   Value |   Type   | Description
  -------+----------+------------------------------------------
     0   |  Direct  | Direct frame <-> frame metric comparison
     1   | Indirect | Summed single-frame deltas
     2   | Indirect | Individual single-frame deltas

  Setting this to 0 is the most accurate mode, and has only a slight performance penalty when
changing parameters compared to later modes.  As such, it's the default and recommended mode.
In this mode, cache depth is dynamic and increases indefinitely until a non-duplicate frame is
found.

  The next mode, 1, uses metrics only of adjacent frames.  These metrics are summed until they
exceed the threshold.  It's less accurate than the direct mode, but once a cache is created no
further metrics need to be generated regardless of parameter changes (with a few exceptions).
It's more conservative than the last mode, but should be more useful when you have low temporal
noise or scenes with very slow changes (e.g. a static image fading to/from black).

  Finally, mode 2 is identical to mode 1, except that metrics are not summed and an adjacent
frame delta must exceed the threshold in order to be detected as a non-duplicate frame.  This
can lead to long segments with small changes being detected as duplicates and disappearing
entirely, so it should be used with caution.
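
  The difference between the two indirect modes can be shown with a small Python sketch.  This
is not DupStep's implementation; the function name is hypothetical, and resetting the running
sum whenever a non-duplicate frame is found is an assumption made for the illustration.

```python
def flag_duplicates(deltas, thresh, mode):
    """Return one flag per frame after the first; True means 'duplicate'.

    deltas[i] is the metric between frame i and frame i + 1.
    Mode 1 sums adjacent deltas; mode 2 tests each delta on its own.
    """
    flags = []
    running = 0.0
    for d in deltas:
        if mode == 1:
            running += d
            is_dup = running <= thresh
            if not is_dup:
                running = 0.0  # assumption: the sum restarts at each new frame
        elif mode == 2:
            is_dup = d <= thresh
        else:
            raise ValueError("this sketch covers the indirect modes 1 and 2 only")
        flags.append(is_dup)
    return flags


slow_fade = [0.4] * 6  # six small, equal steps, e.g. a slow fade
print(flag_duplicates(slow_fade, 1.0, mode=1))  # [True, True, False, True, True, False]
print(flag_duplicates(slow_fade, 1.0, mode=2))  # [True, True, True, True, True, True]
```

  With summed deltas the fade still produces a new frame every third step, while with individual
deltas the whole fade is flagged as duplicates, which is the failure case described above.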


metric (int):
  default value = 2 (SSBD)
    valid range = 0 to 4 (inclusive)

  This sets the metric DupStep uses to calculate a frame delta (the difference between two
frames).  The following table shows which method is used for a given metric value:

   Value | Metric | Type  | Description
  -------+--------+-------+-----------------------------------
     0   |  SSD   | Frame | Sum of Squared Differences
     1   |  SAD   | Frame | Sum of Absolute Differences
     2   |  SSBD  | Block | Sum of Squared Block Differences
     3   |  SABD  | Block | Sum of Absolute Block Differences
     4   |  PAD   | Pixel | Peak Absolute Difference

  The values 0 and 1 are "whole frame" metrics, as their values depend on changes across the
entire frame.  Metrics 2 and 3 are "block level" metrics, which are based on the block that
changed the most between two frames.  Finally, metric 4 is a "pixel" metric, which looks only at
the single pixel that changed the most between the two frames.

  Metrics 2 and 4 are, based on my testing, the best at discerning duplicate frames, as they are
sensitive to small changes that only affect part of the entire frame.  Metric 2 (SSBD) is more
sensitive to low frequency temporal changes (e.g. a static image fading in/out), but as such
it's also more sensitive to low frequency temporal noise.  Metric 4 (PAD) should prove more
useful if this sensitivity to low frequency temporal changes is not needed or desired.
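
  To make the metric types concrete, the following Python sketch computes unnormalized versions
of all five metrics for a single plane.  It is a simplified illustration, not DupStep's OpenCL
kernel: the function name is hypothetical, no normalization to resolution is applied, and the
plane dimensions are assumed to be divisible by the block size.

```python
def frame_metrics(a, b, block=2):
    """Compute unnormalized SSD, SAD, SSBD, SABD and PAD for two equal-sized planes."""
    h, w = len(a), len(a[0])
    diff = [[abs(a[y][x] - b[y][x]) for x in range(w)] for y in range(h)]
    sad = sum(map(sum, diff))                       # whole-frame absolute differences
    ssd = sum(d * d for row in diff for d in row)   # whole-frame squared differences
    pad = max(max(row) for row in diff)             # single most-changed pixel
    sabd = ssbd = 0
    for by in range(0, h, block):
        for bx in range(0, w, block):
            cells = [d for row in diff[by:by + block] for d in row[bx:bx + block]]
            sabd = max(sabd, sum(cells))                  # most-changed block (absolute)
            ssbd = max(ssbd, sum(d * d for d in cells))   # most-changed block (squared)
    return {"SSD": ssd, "SAD": sad, "SSBD": ssbd, "SABD": sabd, "PAD": pad}


flat = [[10] * 4 for _ in range(4)]
local = [row[:] for row in flat]
local[0][0] = 14                          # one pixel changes by 4
uniform = [[11] * 4 for _ in range(4)]    # every pixel changes by 1

print(frame_metrics(flat, local))    # {'SSD': 16, 'SAD': 4, 'SSBD': 16, 'SABD': 4, 'PAD': 4}
print(frame_metrics(flat, uniform))  # {'SSD': 16, 'SAD': 16, 'SSBD': 4, 'SABD': 4, 'PAD': 1}
```

  Both changes give the same whole-frame SSD, but SSBD and PAD score the localized change four
times higher than the uniform one, which is why they respond to small changes affecting only
part of the frame.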


blksize (int):
  default value = -1 (auto)
    valid range = -1 to 3 (inclusive)

  Sets the block size used for block-based metrics.  Smaller blocks are more sensitive to small
changes, but also more sensitive to noise.  If set to -1, an automatic block size will be chosen
based on the resolution (specifically, the luma pixel count) of the clip.  It uses the following
formula (which is then clamped to the valid range):

                       2 * 2 ^ floor(log2(pixels) / 2 - 7.54185309632968)

  If set to a non-negative number, instead the block size is set manually to a value between 2
and 16 according to the following formula:

                                        2 * 2 ^ blksize

  If the block size is changed after metrics have been generated, the metric cache will be
declared invalid and will automatically be regenerated.  See the 'stats' parameter for more
information on the cache structure.

  When calculating metrics for video with subsampled chroma planes, the block dimensions (in
pixels) are adjusted in order to cover an area equal to the luma blocks (per block).  Also note
that if the video height or width is not divisible by the block size, any blocks that would
normally extend beyond the end of the frame will instead overlap other blocks in order to
produce valid metrics for the edge of the frame.
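
  The two block size formulas above can be evaluated with a short Python sketch.  The function
name is hypothetical, and clamping the automatic exponent to the range 0 to 3 is one reading of
"clamped to the valid range"; it yields block sizes between 2 and 16 either way.

```python
import math

AUTO_OFFSET = 7.54185309632968  # constant taken from the automatic formula above


def block_size(blksize: int, luma_pixels: int) -> int:
    """Return the block size in pixels for a given 'blksize' setting."""
    if not -1 <= blksize <= 3:
        raise ValueError("blksize must be between -1 and 3")
    if blksize == -1:
        # Automatic: derive the exponent from the luma pixel count, then clamp it.
        exponent = math.floor(math.log2(luma_pixels) / 2 - AUTO_OFFSET)
        exponent = max(0, min(3, exponent))
    else:
        exponent = blksize
    return 2 * 2 ** exponent


print(block_size(-1, 720 * 480))    # 4
print(block_size(-1, 1920 * 1080))  # 8
print(block_size(-1, 3840 * 2160))  # 16
print(block_size(0, 1920 * 1080))   # 2 (manual, smallest)
```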


planes (int):
  default value = 2 (chroma & luma)
    valid range = 0 to 2 (inclusive)

  Sets which planes to process.  Set to 0 for luma only, 1 for chroma only, and 2 for both luma
and chroma.  Note that the cache will always generate metrics for all planes regardless of what
this is set to, which allows you to change this setting without having to regenerate the cache.

  Note that for mixed content (i.e. footage with some parts in color, and others in grayscale)
this should be set to 2 in order to process both luma and chroma planes, but 'pco' should be
adjusted.  See the 'pco' parameter for details.


pco (int):
  default value = -1 (auto)
    valid range = -1 to 2 (inclusive)

  The 'pco' parameter stands for "plane combining operation", and is ignored unless 'planes' is
set to 2 (chroma & luma).  It controls what mathematical operation is used to combine the
metrics from the luma & chroma planes, following the table below:

   Value | Operation      | Example    | Chosen when set to automatic (-1)...
  -------+----------------+------------+------------------------------------------
     0   | Sum            | Y   + UV   | For all 'metric' settings except PAD (4)
     1   | Sum of squares | Y^2 + UV^2 | Only when 'metric' is set to PAD (4)
     2   | Maximum        | max(Y, UV) | Never; recommended for mixed content

  This should be set to 2 whenever you have color content mixed with grayscale, so that chroma
planes are not ignored, while simultaneously preventing grayscale content from being penalized.
This should make setting an appropriate threshold easier, though adjusting the 'Cweight'
parameter may be necessary for optimal results.  See 'Cweight' documentation for details.
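
  The three combining operations, and the automatic choice between them, can be sketched in
Python as follows.  The function name and the sample metric values are hypothetical; 'y' and
'uv' stand for the luma metric and the (already weighted) chroma metric.

```python
PAD = 4  # 'metric' value for Peak Absolute Difference


def combine_planes(y, uv, pco=-1, metric=2):
    """Combine luma and chroma metrics according to the 'pco' table above."""
    if pco == -1:
        # Automatic: sum of squares for PAD, plain sum for everything else.
        pco = 1 if metric == PAD else 0
    if pco == 0:
        return y + uv
    if pco == 1:
        return y ** 2 + uv ** 2
    if pco == 2:
        return max(y, uv)
    raise ValueError("pco must be between -1 and 2")


# Same luma change, once in a grayscale scene (no chroma change) and once in color.
print(combine_planes(3.0, 0.0, pco=0), combine_planes(3.0, 3.0, pco=0))  # 3.0 6.0
print(combine_planes(3.0, 0.0, pco=2), combine_planes(3.0, 3.0, pco=2))  # 3.0 3.0
```

  With the sum, the grayscale scene scores half of what the color scene does for a comparable
change, so a single threshold fits one or the other.  With the maximum, both land on the same
scale.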


Cweight (float):
  default value = -1.0 (auto)
    valid range = any non-zero value

  This sets a scaling (weighting) value to use for chroma metrics.  It is intended to make
chroma planes more meaningful when determining if a frame is a duplicate.  Its value is ignored
unless 'planes' is set to 2 (chroma & luma).  If you wish to disable this functionality, set
'Cweight' to 1.0.

  If negative, an automatic default will be set depending on the 'metric' being used.  It is set
to 2.35 for absolute difference metrics (SAD/SABD/PAD) and 2.35^2 for squared metrics (SSD/SSBD)
after which it is scaled (multiplied) by the absolute value of 'Cweight'.

  Increasing the chroma weighting (setting it further from 0) may prove useful for clips with
relatively high noise in the luma plane but low noise in the chroma planes.  Decreasing it (i.e.
setting it closer to 0) may be helpful when the inverse is true.  Additionally, you may want to
change the 'pco' setting when adjusting this value; see that parameter for information.

  Finally, note that chroma weighting is applied before any mathematical operations performed by
the 'pco' setting take place.
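
  As a worked example of the rules above, this Python sketch resolves 'Cweight' to the factor
applied to the chroma metric.  The function name is hypothetical; the 2.35 base value and the
split between absolute and squared metrics come from the description above.

```python
SQUARED_METRICS = {0, 2}  # SSD and SSBD; SAD, SABD and PAD are absolute metrics
BASE_WEIGHT = 2.35


def effective_cweight(cweight=-1.0, metric=2):
    """Return the factor that chroma metrics are multiplied by."""
    if cweight == 0:
        raise ValueError("Cweight must be non-zero")
    if cweight > 0:
        return cweight  # manual weight; 1.0 disables weighting
    # Automatic: pick the base for the metric type, then scale by |Cweight|.
    auto = BASE_WEIGHT ** 2 if metric in SQUARED_METRICS else BASE_WEIGHT
    return auto * abs(cweight)


uv = 1.5                                               # made-up chroma metric
weighted_uv = effective_cweight(-2.0, metric=1) * uv   # weighting happens first ...
print(round(weighted_uv, 4))                           # ... then 'pco' combines it with luma
```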


ptp (bool):
  default value = true (enabled)

  This variable's name is an acronym that stands for "picture type preference".  When set to
true and used along with an appropriate source filter, it allows later frames in a string of
duplicates to be used as the reference frame if they are determined to be higher in priority,
with the thought that higher priority is assigned to higher quality frames.

  To determine priority, the filter looks for the FFPICT_TYPE exported variable, which may be
set depending on what source filter you are using.  Currently, the only filter I am aware of
that sets this variable is FFVideoSource().  It then assigns a priority value to the frame
according to the following table (the lower the number, the higher the priority):

   FFPICT_TYPE | Priority | Description
  -------------+----------+-----------------------------------------
      I / i    |    0     | Intra frame
      P / p    |    1     | Predicted frame
   B / b, S, ? |    2     | Bidirectional, Sprite, or unknown frame

  In theory, intra frames should be the highest quality frames, followed by 'P-frames', and
finally the lowest quality frames should be 'B-frames' and everything else.  If frame type
cannot be determined, it is automatically assumed to be of the lowest quality and therefore
assigned the same priority as a B-frame.
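
  The priority table translates into a simple selection rule, sketched below in Python.  The
function names are hypothetical, and keeping the earliest frame when several share the best
priority is an assumption consistent with the description above.

```python
PRIORITY = {"I": 0, "P": 1}  # every other picture type gets the lowest priority, 2


def frame_priority(pict_type):
    """Map an FFPICT_TYPE character to its priority (lower number = higher priority)."""
    return PRIORITY.get(pict_type.upper(), 2)


def pick_reference(pict_types, ptp=True):
    """Return the index, within a run of duplicates, of the frame to keep."""
    if not ptp:
        return 0  # always the first frame of the run
    # min() returns the earliest index among frames that tie for best priority.
    return min(range(len(pict_types)), key=lambda i: frame_priority(pict_types[i]))


print(pick_reference(["B", "P", "I", "P"]))             # 2 (the intra frame)
print(pick_reference(["P", "b", "p"]))                  # 0 (nothing later is better)
print(pick_reference(["B", "P", "I", "P"], ptp=False))  # 0
```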

  If this variable is set to false or when using a source filter that does not set the
FFPICT_TYPE variable, DupStep will always return the first frame of a string of duplicates.
This should only be done if you are NOT using FFVideoSource().
 
  Note that if (spatio)temporal filtering happens between FFVideoSource() and DupStep(), there
is the potential for incorrect FFPICT_TYPE results to be fed to DupStep().  To allow for metrics
to be calculated after (spatio)temporal filtering, while still using valid data when 'ptp' is
enabled, the DS_cachefp() filter should be used.  To properly use this functionality, do the
following steps in order:

  1. Use FFVideoSource(), followed by any Trim(), UnalignedSplice() / +, AlignedSplice() / ++,
     etc. commands.  The frame count and order must not change between the end of this script
     and what is ultimately fed into DupStep().
  2. Append DS_cachefp() to the end of your script, ensuring that the 'stats' parameter does not
     refer to any existing cache file.
  3. Open the script.  If the frame priority cache creation completed successfully, you should
     see an error message stating that the "Frame priority cache [was] created."
  4. Edit your script, removing the DS_cachefp() filter and inserting any filtering you desire
     prior to DupStep(), followed by the DupStep() filter itself.
  5. Ensure 'stats' is set appropriately in DupStep() so that it refers to the cache file
     created by the DS_cachefp().  DupStep() will use the frame priority cache already generated
     while using the filtered video for metric calculation.


stats (string):
  default value = "dupstep.dsd"
    valid range = any valid file name or null

  This sets the file name where DupStep will save its frame priority cache and its frame delta
(metrics) cache, or alternatively where it will load the cache from if a previously written,
valid cache file exists.  If set to null, DupStep will not read or write the cache from/to disk,
and instead regenerate it every time.  For performance reasons as well as flexibility, I
STRONGLY recommend you do not do so.

  This cache file is split into two parts, the first being the frame priority cache.  It holds
frame priority values for use when 'ptp=true' (see that parameter for details).  This cache is
deemed valid as long as its cache version matches that of the DupStep filter being used
and its frame count matches that of the video fed into DupStep().  If determined to be invalid
or missing, the entire cache file will need to be (re)created, i.e. it will be automatically
(re)generated and (over)written.

  The second portion of the cache holds the frame deltas (AKA metrics) cache.  It is considered
usable as long as its pixel type matches the current video, its block size matches the current
'blksize' parameter (see 'blksize' for details), and the frame priority cache is valid.  If
those properties remain unchanged, a different clip can be used with the same cache, allowing
for advanced filtering, e.g. using a filtered clip to create the cache, but then applying the
decimation to an unfiltered clip.  When doing this, ensure 'ifmcm' is not set to 0 in order to
force all queries to be answered from the cache.

  If the metrics cache is invalid, but the frame priority cache is still valid, the priority
will remain unaltered while only the metrics are (over)written.  Also, when using the DupStep
filter in multiple scripts within a single directory, be sure to set a different 'stats' file
for each script, otherwise they will overwrite one another when opened or use incorrect values.


times (string):
  default value = "" (null)
    valid range = any valid file name or null

  This sets the file name to write VFR timestamps to.  If set to a non-null string, Matroska
timestamps will be written in the "v2" format.  Any existing file will be overwritten.  This
file should be muxed in with mkvmerge using the "--timestamps" option for proper VFR playback.


toff (float):
  default value = 0.0 (no offset)
    valid range = any

  This parameter sets an offset, in milliseconds, to apply as a global adjustment to all
timestamps that are written to the 'times' file (if set; see above).  It can be negative in
order to shift the times earlier, or positive to delay them all.
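
  For orientation, the Python sketch below builds the kind of "v2" timestamp file that mkvmerge
accepts: a header line followed by one timestamp in milliseconds per output frame.  It is not
DupStep's writer.  The function name is hypothetical, and it assumes a constant frame rate
source in which each kept frame retains its original presentation time.

```python
def v2_timestamps(kept_frames, fps, toff=0.0):
    """Build the text of a v2 timestamp file for the given source frame numbers."""
    lines = ["# timestamp format v2"]
    for n in kept_frames:
        # Original presentation time in milliseconds, shifted by the global offset.
        lines.append(f"{n * 1000.0 / fps + toff:.3f}")
    return "\n".join(lines) + "\n"


# Frames 2 and 3 were duplicates of frame 1 and have been decimated.
print(v2_timestamps([0, 1, 4, 5], fps=25.0, toff=20.0), end="")
# # timestamp format v2
# 20.000
# 60.000
# 180.000
# 220.000
```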


show (int):
  default value = 0
    valid range = 0 to 3 (inclusive)

  This parameter controls several settings with a single value by using a bit mask, similar to
the way chmod can be used to set *nix file permissions.  The least significant bit, 1, controls
the display of textual information on top of the frame, useful for adjusting the 'thresh'
parameter.

  The next highest bit, 2, controls whether or not DupStep decimates frames.  When set, frames
that would normally be decimated will instead be shown.  This is probably most useful when
combined with the lowest bit (sum them, i.e. set 'show=3') to display metric information for
frames that would otherwise disappear.
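
  The bit mask can be decoded as in the following Python sketch.  The constant and key names
are hypothetical labels for the two bits described above.

```python
SHOW_INFO = 1   # least significant bit: overlay textual information
SHOW_DUPES = 2  # next bit: show frames that would normally be decimated


def decode_show(show):
    """Split a 'show' value into its two independent switches."""
    if not 0 <= show <= 3:
        raise ValueError("show must be between 0 and 3")
    return {
        "overlay_text": bool(show & SHOW_INFO),
        "keep_duplicates": bool(show & SHOW_DUPES),
    }


print(decode_show(0))  # {'overlay_text': False, 'keep_duplicates': False}
print(decode_show(3))  # {'overlay_text': True, 'keep_duplicates': True}
```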


oclpi (int):
  default value = -2 (prefer CPU)
    valid range = -4 to system dependent maximum (inclusive)

  This sets the "OpenCL Platform Index" that the filter should use.  DupStep is limited to using
a single OpenCL platform and a single device on that platform (see 'ocldi' parameter) for memory
throughput reasons, and this allows you to select which platform that should be.

  When set to a non-negative value, this specifies the index of the chosen platform.  To see a
list of platforms along with a list of associated devices, use the DS_dumpocl() function within
an AviSynth+ script, and they will be listed using AviSynth+'s internal error reporting
mechanism.  The 'ocldi' parameter should also be set if you want to use a device beyond the
first one listed for the chosen platform.

  If this is set to a negative value, instead the platform index will be chosen automatically
according to the following table (note that CPUs/GPUs may not be found if they are lacking an
OpenCL-capable driver):

   Value | Setting    | Description
  -------+------------+------------------------------------------------------
    -4   | GPU only   | Consider only GPUs, error if none found
    -3   | Prefer GPU | Look first for GPUs, fall back to CPUs if none found
    -2   | Prefer CPU | Look first for CPUs, fall back to GPUs if none found
    -1   | CPU only   | Consider only CPUs, error if none found

  Note that the device indices ('ocldi' parameter) listed by DS_dumpocl() may not be valid when
using these automatic modes, so this parameter should always be set when raising the 'ocldi'
parameter.  Additionally, in order to use OpenCL devices that are neither CPUs nor GPUs, you
must manually set 'oclpi' to the platform index for the device as shown by DS_dumpocl().
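
  The automatic modes amount to a preference order over device types, which the Python sketch
below illustrates.  It does not query OpenCL; the function name is hypothetical, and the set of
available device types is passed in by hand.

```python
SEARCH_ORDER = {
    -4: ["GPU"],          # GPU only
    -3: ["GPU", "CPU"],   # prefer GPU
    -2: ["CPU", "GPU"],   # prefer CPU (default)
    -1: ["CPU"],          # CPU only
}


def pick_device_type(oclpi, available):
    """Return the device type an automatic 'oclpi' value would settle on."""
    if oclpi >= 0:
        raise ValueError("non-negative values select a platform index directly")
    if oclpi not in SEARCH_ORDER:
        raise ValueError("automatic modes are -4 to -1")
    for kind in SEARCH_ORDER[oclpi]:
        if kind in available:
            return kind
    raise RuntimeError("no suitable OpenCL device found")


print(pick_device_type(-2, {"CPU", "GPU"}))  # CPU
print(pick_device_type(-2, {"GPU"}))         # GPU (fallback)
print(pick_device_type(-3, {"CPU", "GPU"}))  # GPU
```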



ocldi (int):
  default value = 0 (first device on chosen platform)
    valid range = 0 to system dependent maximum (inclusive)

  This sets the "OpenCL Device Index" that DupStep should use for a given platform.  When set to
0, the first device for the selected platform will be used.  To get lists of device indices, use
the DS_dumpocl() function within an AviSynth+ script.  The filter can only use a single device;
however, it can use any number of cores on that device.

  In addition, you should always set the appropriate platform index ('oclpi' variable) when
adjusting this variable.  Platform indices are listed above the associated devices by the
DS_dumpocl() function.



Examples

More conservative threshold (frames must be more similar to be detected as a duplicate):
  DupStep(thresh=-0.5)

More liberal threshold (frames can be less similar and still be detected as a duplicate):
  DupStep(thresh=-2.0)

Recommended for full color content:
  DupStep(planes=2)   # By default planes=2, so doesn't need to be explicitly set

Recommended for mixed content (some grayscale, some color):
  DupStep(planes=2, pco=2)   # Combine luma/chroma metrics using a max() function

Recommended for grayscale only content:
  Grayscale()         # Doesn't need to be here, but should be done somewhere in your script
  DupStep(planes=0)   # Ignores chroma planes

May prove useful for full color content with moderate/high noise only in the luma plane:
  DupStep(planes=1)       # Completely ignores luma; use with caution!
  DupStep(Cweight=-2.0)   # Instead doubles the chroma weighting; consider using 'pco=2' too

Useful for tuning the 'thresh' parameter (eventually; not useful in the current version):
  DupStep(show=3)

Equivalent to ExactDedup(firstpass=false, _keeplastframe=true) (assuming first pass completed):
  DupStep(thresh=0.0, planes=2, times="times.txt")



Notes

0) This filter should only be used with source filters that feature frame-accurate seeking.
   Source filters that lack this feature may produce erroneous, non-deterministic results!

1) DupStep is not designed to be used within Animate(), ScriptClip(), or other runtime filters.
   It will almost certainly not do what you expect in them.  Use within those functions at your own
   risk!

2) When removing exact duplicates only ('thresh=0.0'), all settings except for 'planes' will not
   affect the frames decimated.  You may, however, find speed gains by adjusting 'blksize'; see
   note 3 below.

3) The automatic (and default) setting for the 'blksize' parameter prioritizes quality over
   speed.  If you are not using block-based metrics (SSBD or SABD) you may want to adjust this
   parameter to whatever proves fastest for your system.

4) A Perl script, "dsstats.pl", is included which dumps the binary information within the cache
   to stdout in a text CSV format.  Note that these are intermediate values and should not be
   used to set the 'thresh' parameter without additional calculation in almost all cases.  Run
   "dsstats.pl -h" to see help regarding the usage of that script.

5) In DupStep v0.02 and earlier, other metrics were available or bore the same name but behaved
   differently.  To get equivalent metrics in the current version, consult the chart below:

   Old Metric | New Settings
  ------------+------------------------------------------------------------------------------
      RSAD    | metric=1, pco=0; if old 'thresh' was positive, divide it by luma pixel count
      RSSD    | metric=0, pco=0; if old 'thresh' was positive, divide it by luma pixel count
      SAD     | metric=1, pco=0
      SSD     | metric=0, pco=0
      PAD     | metric=4, pco=2; if old 'thresh' was negative, multiply it by 1.25
      SSPD    | metric=4, pco=1, planes=2

   Note that the default settings for 'Cweight' have also changed from v0.02.  While the old
   values could be calculated, I recommend against doing so and simply readjusting based on the
   new default.



Known Issues

0) The OpenCL kernels will not compile for the Intel platform, either CPU or GPU, as that
   platform does not support atomic operations on 64-bit integers.  A workaround will be added
   in the future, but AMD and nVidia platforms work now as long as they support OpenCL 1.2.

Finally, this filter is currently in an 'alpha' state.  It's likely riddled with bugs, and may
cause your computer to spontaneously burst into flames.  Please report issues to the Doom9
thread (link below) or the GitHub issue tracker (via GitHub page listed below) so that they can
be fixed in future releases.



Changelog

v0.03
----------------
Speed: Cache depth is now dynamic and generates/caches only what is needed for the current
  parameters/video
Speed: Cache data from file is always used when available and only recalculated when necessary
  (exception: changing block size recalculates all metrics)
Speed: OpenCL kernel should now be able to take advantage of SIMD instructions for platforms
  that support them
Speed: Avoids unnecessary memory copying when executing OpenCL kernels on a CPU
Added: New function "DS_cachefp()" to cache only frame priorities, allowing 'ptp' to be used
  even when (spatio)temporal filtering is used prior to metric generation
Param: Removed the 'cdepth' parameter as cache depth is dynamic on a per-frame basis now
Param: 'ifmcm' modes 1 and 2 removed and replaced by former modes 3 & 4
Param: Default mode for 'ifmcm' set to 0 (direct)
Param: Removed "raw" metrics (RSAD/RSSD)
Param: New 'metric' settings for block-based metrics
Param: Reordered the 'metric' settings to put squared metrics at lower values than their
  absolute counterparts
Param: Default 'metric' now set to SSBD (2)
Param: Automatic (negative) 'Cweight' uses new values based on empirical data, and no longer
  takes the chroma subsampling factor into account
Param: New parameter 'pco' to set the plane combination operation
Param: Added 'blksize' parameter to set the block size for block metrics
Param: Added new option to prefer CPU to 'oclpi'; subtract 1 from old options if < -1 to get the
  equivalent new option index for them
Param: Default for 'oclpi' is now set to prefer CPU (-2)
Param: Reordered the parameters
Fixed: Potential overflow when calculating chroma metrics has been fixed
Fixed: A potential race condition that could lead to incorrect results was resolved
Fixed: Corrected a typo in dsstats.pl help (-h)
Fixed: Multiple known issues from v0.02 resolved
Other: Metrics are now all stored as floating point values, pre-normalized to resolution where
  appropriate (previously they were post-normalized, which could cause problems when resolution
  changed between cache generation and usage)
Other: Pixel metrics (SSPD/PAD) are no longer clamped after being scaled by chroma weighting,
  permitting more aggressive chroma weighting
Other: Version bump for cache file; there is no compatibility with caches from older versions
Other: OpenCL device memory usage reduced as it only ever needs to hold the two compared frames
Other: DS_dumpocl() now displays OpenCL version supported (both for platforms and devices)
Other: Significant refactoring


v0.02
----------------
Speed: OpenCL is now used for metric calculation, allowing for multi-threading on a CPU or using
  a GPU instead
Speed: All frames read during cache generation are now cached on the OpenCL device so they are
  only requested once from AviSynth
Added: New function "DS_dumpocl()" to list OpenCL platforms and devices with index
Param: Added parameters 'oclpi' and 'ocldi'
Param: Default 'cdepth' changed from 2 to 5
Param: Maximum value of 'cdepth' changed from 10 to 250
Param: dsstats.pl now features additional parameters; run "dsstats.pl -h" for information
Fixed: SSD metrics could be incorrect (lower than they should be) for 16-bit video
Fixed: Cache depth promotion was not working in some cases (affected ifmcm modes 1 & 2)
Fixed: Frame type cache was leaking a small amount of memory
Fixed: Cache was delivering incorrect frame deltas when within a radius of 'cdepth' of the final
  video frame, sometimes reading outside of buffer memory (undefined values)
Fixed: Cache had incorrect frame types when cdepth > 1
Other: Documentation updated regarding when 'ptp' should be disabled, and detailed new concerns
  regarding device memory usage during cache generation
Other: dsstats.pl now validates cache version
Other: Minor refactoring


v0.01
----------------
Initial release



External Links

  • GitHub - Source code repository.



