Interlace detection

From Avisynth wiki

Revision as of 22:49, 9 May 2013

Automated interlace detection is an algorithm for determining the combing patterns in a given source. It can recognise progressive, interlaced, and telecined sources, and hybrids thereof. For telecined and interlaced sources, it can also determine the field order. For progressive sources, it can recognise whether the framerate has been upconverted by simple repetition (e.g. 24p -> 60p). berrinam's blog on source detection can be found at [1].

The algorithm was written by berrinam, based on descriptions of AutoGK's algorithm, and is accessible as part of MeGUI's AviSynth Script Creator and also as a stand-alone application [2]. However, the stand-alone application is not currently up-to-date with MeGUI's algorithm; the integration into MeGUI also makes it much easier to use.

Where to get it

For most people, automated interlace detection is most useful within MeGUI. In MeGUI, open the AviSynth Script Creator from the tools menu and click 'Analyse' after selecting an input file. This makes one or two passes through about 1% of the video and then reports its conclusions, as well as recommending some filters and inserting them into your script if you want them.

For developers, a description of the algorithm is on this page, and the source code for the actual implementation can be found either in the stand-alone application linked to above, or in MeGUI's source code, which can be accessed via anonymous CVS [3]. The relevant code is in ScriptServer.cs and SourceDetection.cs.

Benefits and Limitations

Benefits

  • The code is under the GPL, so anyone can use the algorithm.
  • It can run very fast, as it normally analyses 1% of the video.
  • The thresholds and analysis percent can be controlled by the user to allow tweaking.
  • Unlike AutoGK, it ignores the input framerate, which means that the entire analysis is truly based on the video itself.

Limitations

(Note: MeGUI now has a checkbox for anime sources which should be checked before the analysis)

Due to the nature of the algorithm, it has problems with anime-style sources. Anime often has many repeated frames, and since this algorithm ignores sections with duplicate frames, the analysis of anime has severely limited information to work with. There is also a possibility that anime should be treated very differently, because anime, unlike live action, is inherently progressive.

The detection currently won't recognise a failure (i.e. a wrongly suggested filter). This is a departure from the human-like approach that the rest of the program takes, and it also means that the user will not be warned if anything goes wrong. Based on the description of foxyshadis's approach to anime processing, I have another idea for (additional) testing. See the TFM-based Testing section for details.

There are some less common cases that this detection is unaware of. These include:

  • Field-blending
  • Phase shift, which causes the source to appear interlaced while actually being progressive. This shouldn't be much of a problem, because the deinterlacers will realize this. However, a solution is being considered; see the section on TFM-based Testing for details.

Hopefully, however, the extent of these limitations is small, because

  • There are not very many problem sources like this
  • The filters are often adaptive enough to avoid these problems. For example, if the source was progressive with a phase shift, interlace detection would still pick up the correct field order, and then the deinterlacer, when matching the relevant fields together, should realize that the source is progressive.
  • In the case of field-blending, the source is stuffed no matter what, so it really needs an expert to fix it up.

Algorithm

The algorithm for this detection works in multiple steps:

  1. The file is analysed in AviSynth to find frames which are combed, ignoring sections in which there is no movement
  2. These results are processed in 5-frame groups to detect combing patterns which can either be progressive, interlaced, or telecined
  3. If there is a significant non-progressive section, the algorithm analyses the script again in AviSynth to determine the field order.
  4. These results are once again processed and thresholded to avoid random variations.
  5. Based on all of these results, a description of the source is generated, and various filter-chains for dealing with this source are recommended.

Initial AviSynth analysis

The initial analysis is done in AviSynth to determine the type of source. It runs with the following script template:

<input>
global unused_ = blankclip(pixel_type="yv12", length=10).TFM()
file="<temporary-file.log>"
global sep="-"
function IsMoving()
{
  global b = (diff < 1.0) ? false : true
}
c = SelectRangeEvery(every=<every>,length=<length>)
global clip = c
c = WriteFile(c, file, "a", "sep", "b")
c = FrameEvaluate(c, "global a = IsCombedTIVTC(clip, cthresh=9)")
c = FrameEvaluate(c, "IsMoving")
c = FrameEvaluate(c,"global diff = 0.50*YDifferenceFromPrevious(clip) + 0.25*UDifferenceFromPrevious(clip) + 0.25*VDifferenceFromPrevious(clip)")
crop(c,0,0,16,16)

In this template, <input> is substituted with the input script, <temporary-file.log> with a temporary file, and <every> and <length> with the sampling parameters.

The script works by choosing a small sample of frames from across the movie (the SelectRangeEvery call). These frames are checked for combing using the IsCombedTIVTC function and for motion using the Y/U/V DifferenceFromPrevious functions with a threshold. All these results are written to the temporary file specified at the beginning of the script.

The line,

global unused_ = blankclip(pixel_type="yv12", length=10).TFM()

seems a bit wasteful, and the variable is indeed labelled unused. Manao recommended this line so that the script returns an AviSynth error if it doesn't have access to the TIVTC package (which both TFM() and IsCombedTIVTC() are part of). Without it, everything would appear to go according to plan, but the temporary logfile would be full of "I don't know what "a" means" errors.

These results are written to file, so that the program can analyse them in the next step.
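
Each logfile line is the WriteFile output of `a`, `sep` and `b`, i.e. a combed flag and a moving flag joined by "-", such as `true-false`. A minimal Python sketch of the parsing step (the function name is illustrative, not MeGUI's actual code):

```python
def parse_analysis_log(lines):
    """Parse logfile lines of the form '<combed>-<moving>'.

    Returns a list of (combed, moving) boolean pairs; lines that do
    not match (e.g. stray AviSynth error text) are skipped.
    """
    results = []
    for line in lines:
        parts = line.strip().lower().split("-")
        if len(parts) == 2 and all(p in ("true", "false") for p in parts):
            results.append((parts[0] == "true", parts[1] == "true"))
    return results

# Example: three sampled frames, the second one combed and moving.
print(parse_analysis_log(["false-true", "true-true", "false-false"]))
# → [(False, True), (True, True), (False, False)]
```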

Interpreting initial analysis results

The results from the above step are interpreted by automated interlace detection in groups of 5 frames at a time. This lets it make a pretty good estimate of what type of section these frames belong to by doing the following:

  1. Check if all 5 frames are moving. If they aren't, ignore them, because if there isn't any motion, we're unlikely to have combing anyway. If we didn't do this, we might get too many false progressive sections.
  2. Increment the tally which keeps track of how many sections have each of 0, 1, 2, 3, 4 or 5 moving frames in them. This is done to recognise repetition-upconverted sources.
  3. Count the number of combed frames in this group of 5. If it is 0, then declare that section progressive; if it is 2, then declare it telecined; if it is anything else, declare it interlaced.
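
The per-group decision above can be sketched in Python as follows (a simplified illustration of the counting rules, not MeGUI's actual code):

```python
def classify_group(frames):
    """Classify one 5-frame group from (combed, moving) pairs.

    Returns None when the group is ignored (not all frames moving),
    otherwise 'progressive', 'telecined' or 'interlaced'.
    """
    assert len(frames) == 5
    if not all(moving for _, moving in frames):
        return None  # without motion, combing information is unreliable
    combed = sum(1 for c, _ in frames if c)
    if combed == 0:
        return "progressive"
    if combed == 2:
        return "telecined"  # 3:2 pulldown leaves 2 combed frames per 5
    return "interlaced"
```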

After having done that counting on every group of 5 frames analysed, the algorithm adds up the number of groups of each type and thresholds the totals. The 'hybrid threshold' in the current versions of the program refers to the constant used in the following test; the source is considered hybrid when the most common type does not outweigh the runner-up by that factor:

IsHybrid = (MostCommon <= HybridThreshold * SecondMostCommon)

If the source is hybrid, it goes through cases to determine which of the following types it is:

  • Hybrid film/interlaced, majority film
  • Hybrid film/interlaced, majority interlaced
  • Hybrid film/progressive.
  • Hybrid interlaced/progressive.

With all of these, as there is some combing in the film, interlace detection then runs the field order script below to determine the field order.

If the source is not hybrid, it is determined to be any of:

  • Progressive
  • Interlaced
  • Telecined
  • Repetition-upconverted (needing decimation)

If it is determined to be interlaced or telecined, it is passed on to the field order script. If it is progressive, the user is told that it is already progressive and nothing needs to be done to it. If it is repetition-upconverted, a correctly-configured TDecimate filter is recommended.
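
Putting the thresholding together, a hedged sketch of the overall decision (the default threshold value and tie-breaking here are illustrative assumptions, not MeGUI's exact values):

```python
from collections import Counter

def decide_source_type(group_types, hybrid_threshold=10.0):
    """Decide the overall type from per-group classifications.

    group_types: list of 'progressive' / 'telecined' / 'interlaced'
    strings (ignored groups already removed). The source is treated
    as hybrid when the most common type does not outweigh the
    runner-up by hybrid_threshold.
    """
    counts = Counter(group_types).most_common()
    if len(counts) == 1:
        return counts[0][0], False  # only one type seen: not hybrid
    (top, n1), (_, n2) = counts[0], counts[1]
    is_hybrid = n1 <= hybrid_threshold * n2
    return top, is_hybrid

# 90 telecined groups vs 30 interlaced ones: hybrid, majority film.
print(decide_source_type(["telecined"] * 90 + ["interlaced"] * 30))
# → ('telecined', True)
```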

AviSynth field-order script

The following script template is used to determine field order:

<input>
file="<temporary-file.log>"
global sep="-"
d = SelectRangeEvery(every=<every>,length=<length>,0)
global abff = d.assumebff().separatefields()
global atff = d.assumetff().separatefields()
c = d.loop(2)
c = WriteFile(c, file, "diffa", "sep", "diffb")
c = FrameEvaluate(c,"global diffa = 0.50*YDifferenceFromPrevious(abff) + 0.25*UDifferenceFromPrevious(abff) + 0.25*VDifferenceFromPrevious(abff)")
c = FrameEvaluate(c,"global diffb = 0.50*YDifferenceFromPrevious(atff) + 0.25*UDifferenceFromPrevious(atff) + 0.25*VDifferenceFromPrevious(atff)")
crop(c,0,0,16,16)

Once again, <input> is replaced by the input script, <temporary-file.log> by a temporary file, <every> and <length> by the respective SelectRangeEvery parameters.

This script works by separating the fields with top field first (TFF) and bottom field first (BFF). These clips are stored in atff and abff respectively. Then, the motion from frame to frame is measured in each clip, and written to the temporary file, which is then analysed in the next step.

Interpreting field order results

These results are much easier to interpret. A common set of values to find would look like this:

12.546413-4.224947
8.449630-8.449630
11.777706-3.561164
8.115766-8.115766
11.782013-3.283536
7.932224-7.932224
11.577293-3.436398
7.903222-7.903222
11.473661-3.679578
7.820997-7.820997

Every second pair of values is exactly equal, and in the other pairs one value (in this case, the left column) is much larger than the other. Compare this to how we, as humans, find field order: we separate the fields as TFF and as BFF, and then see which result is jerkier. Jerkiness means a lot of field-to-field movement, so interlace detection takes the jerkier clip (here, the left column) as wrong and the other as right. Knowing that the left column holds the values for BFF, the correct field order must be TFF.

In sections of 10 fields (the equivalent of 5 frames; after SeparateFields the frame count is doubled), the program notes the field order for that section. At the end, it counts up the number of each type; if more than 10% of the sections show the 'incorrect' field order, it reports that the field order changes throughout. If not, it declares the field order to be whatever it determined.
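
The comparison of the two diff columns can be sketched as follows (a simplified per-section decision; the function name is illustrative, and the left column is the BFF diff as written first by the script above):

```python
def field_order_for_section(rows):
    """Decide the field order for one section of diff pairs.

    rows: list of (bff_diff, tff_diff) values from the logfile.
    The jerkier column (larger motion between successive fields)
    corresponds to the WRONG field order assumption.
    """
    bff_total = sum(b for b, _ in rows)
    tff_total = sum(t for _, t in rows)
    # Pairs with equal values contribute the same to both sums,
    # so only the unequal pairs decide the outcome.
    return "TFF" if bff_total > tff_total else "BFF"

rows = [(12.546413, 4.224947), (8.449630, 8.449630),
        (11.777706, 3.561164), (8.115766, 8.115766)]
print(field_order_for_section(rows))  # → TFF
```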

Filter-chain recommendation

There is no analysis in this stage. This just takes the results from all of the previous steps and recommends a filter based on the presets written by berrinam. In MeGUI, these are returned as a list of filters, in order from most highly recommended to least highly recommended. These are:

For progressive sources:

  • Do nothing.

For repetition-upconverted sources:

For interlaced sources:

For film sources:

For hybrid film/interlaced or film/progressive sources:

  • TIVTC with hybrid settings
  • IVTC with hybrid settings

For hybrid progressive/interlaced:

  • TDeint with full=false (only deinterlaces combed frames)
  • FieldDeint with full=false (only deinterlaces combed frames)

Note: if portions are used with hybrid progressive/interlaced, then it suggests the same filters as purely interlaced sources.

Portions

Portions is an idea for dealing with a particular type of hybrid interlaced/progressive sources. In particular, there are some sources which are progressive but have some small sections of interlacing, like the end credits. The idea in these cases is that the filter should only be applied on the section where the interlacing is present. Interlace detection should be able to work this out.

The problem with this is that it is too sensitive to falsely detected sections in the first stage of analysis. As a result, although it worked well on the sources mentioned, it was also recommended for sources that were purely interlaced. Perhaps it could be improved, but a solution that works almost as well is using TDeint or FieldDeinterlace with full=false, as these deinterlacers do their own combing check, on a per-frame basis instead of a per-section basis.

Portions are still accessible through MeGUI, but are off by default. For those wanting to experiment with them, have a look at 'Configure Source Detector' in MeGUI's settings.

Future improvements

Perhaps a better integration with AviSynth could be explored. This would mean removing the need for temporary files by directly requesting AviSynth's internal variable values. This would also allow interlace detection to only need to request information on frames that are actually relevant (for example, there is no need to check frames that aren't moving for combing).

True integration with AviSynth would be to develop an AviSynth plugin (or add a new internal function to AviSynth) that performs the analysis of the video, and sets new AviSynth variables which fully describe the interlaced/progressive properties of the video. (The remainder of the user's AviSynth script would presumably invoke appropriate filters, depending on the values of the variables.) Then no external analysis tools would be needed, and handling various types of video could truly be automated.

More tweaking of all the thresholds could always be done.

TFM-based testing

TFM is a field-matcher, which is often used as the first step in an IVTC. Running it with postprocessing turned off means that the result will be interlaced if the source is interlaced, and otherwise progressive (perhaps with some duplicate frames needing to be decimated). Thus, TFM could be used to determine which of the interlaced-looking sources are in fact interlaced, and which exhibit some other artifact, like phase-shifting of fields. It also provides a completely different approach to source detection, which is described in the 'uses in anime' section.

Uses in anime

TFM-based testing could be particularly useful when dealing with anime. Anime often has many duplicate frames, and this could really confuse source detection, because it severely limits the number of 'useful' sections in its analysis. This means that the analysis is more susceptible to small mistakes.

A different approach to source detection, which doesn't require all the sections to have all five frames in motion, is described as follows:

  1. Run the same initial avisynth analysis as described above.
  2. When processing these results, simply count up the number of combed frames and the number of moving frames across the entire sample. Express the combing as a ratio of numCombedFrames/numMovingFrames, which indicates how much combing there is. If this is lower than a certain threshold, the source is progressive. If not, it has *some form* of combing. There are then two options:
    • Analyse progressive sources in sections to count the number of duplicates in each section. If there is a common pattern, decimate by that number, otherwise leave it alone.
    • Test the combed sources with TFM as described in the following steps
  3. Apply TFM with no postprocessing to the source and run the same analysis as described in step 1 and step 2 again.
    • If the result is still combed, then the source needs deinterlacing. Check the field order and then suggest some deinterlacers.
    • If the result is now progressive, then check for a common pattern of duplicates to see if decimation is required. Otherwise, just suggest a field-matcher and be done with it.

This approach could also be applied to live-action sources, but I suspect it may take longer.
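
The combing-ratio step of this approach can be sketched as follows (the threshold value and function names are illustrative assumptions):

```python
def combing_ratio(frames):
    """frames: (combed, moving) pairs from the initial analysis."""
    moving = sum(1 for _, m in frames if m)
    combed = sum(1 for c, _ in frames if c)
    return combed / moving if moving else 0.0

def looks_progressive(frames, threshold=0.02):
    """Treat the source as progressive when combing is rare relative
    to motion; otherwise it has some form of combing and should be
    re-tested after TFM field matching."""
    return combing_ratio(frames) < threshold
```

Running the same test again on the TFM'd clip then distinguishes genuinely interlaced sources (still combed) from field-matchable ones (now progressive).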

Testing the suggested filters

Unlike the savvy human, who tests the filter they just decided on in order to see whether they made the correct decision, interlace detection assumes that its analysis was correct, and doesn't make any move to check the results. While the recommended filters were chosen so as to minimize the damage from a wrong match, interlace detection would still benefit from testing the suggested filters to see if what they do is correct. Here is what testing could be done for the various source types:

Progressive:

  • There's no filter, and the source is already how we want it. No testing required.

Repetition-upconverted:

  • This could have the decimation filter applied and then be checked to see if there are still duplicates, but I don't really see how a decimation filter could fail.

Telecined sources:

  • If a source is detected as telecined, then it should be able to be field-matched. So, by checking the results from a field-matcher (TFM) without postprocessing, it would be clear that field-matching is possible if the results aren't combed, and impossible if they are combed. In the case of field-matching not being possible, the source must be genuinely interlaced, so that is what it should then be declared.

Interlaced sources:

  • If a source appears interlaced, it could either be genuinely interlaced, or it could have shifted fields. Like with the telecined sources, applying a field-matcher without any postprocessing would mean that phase-shifted sources would have a progressive result, but interlaced sources would stay interlaced.

Miscellaneous improvements

  • Rework the hybrid thresholds as described in [4]
  • Add a CLI interface
  • Code cleanup (there are always more rewarding things to do than this)

Credits

  • berrinam, algorithm and integration into MeGUI
  • len0x, initial idea for algorithm
  • tritical, IsCombedTIVTC filter
  • neuron2, IsCombed filter
  • stax, interface improvements on the stand-alone application

And of course, there would have been no way to understand all of this without the level of detail supplied by Doom9's forum and the Doom9 guides.
