AviSynth+
This page is dedicated to AviSynth+, mainly to keep track of all of its features, changes, bugs, and any other useful information.
AviSynth+'s Plugin Autoloader
- 1st October 2013 | Source: here and subsequent post.
Okay, so how do multiple plugin directories interact with plugin autoloading?
As a recap, here is how it used to work in the official Avisynth:
- Look for the string HKEY_CURRENT_USER/Software/Avisynth/PluginDir2_5 in the registry. If it exists, load plugins from the path specified there and stop.
- If the above string didn't exist, look in HKEY_LOCAL_MACHINE/Software/AviSynth/PluginDir2_5. Try to load plugins from the path specified there.
- Done.
The first thing to note is that classic AviSynth only ever searches for plugins in one single directory. It only knows two directories (both specified in the registry), and it only tries the second path if there is no entry for the first one.
AviSynth+'s autoloader has a list of autoload directories. It iterates over all those directories and tries to load all plugins from each. But (and a big but!) it will not load a plugin from a directory if another plugin with the same basename is already loaded. The basename of a plugin is simply its file name without the extension.
The expected use case is that you can now overlay a new plugin directory on top of another one. AviSynth+ then would load all plugins from the first folder, then load only those plugins from the second that weren't loaded from the first, then those from the third that weren't loaded from the first or second and so on. For example, let's say your usual plugin folder has a lot of plugins you normally use. But at one time you have a small number of updated plugins that you only want to use from a few scripts, but you do not yet want to replace your existing plugins globally. Then you'd just add a new plugin overlay folder, with only the new plugins in it, and that's it. All scripts that specify the new folder will autoload all plugins from your usual one, except for the new plugins, which would get loaded from the new folder. All your other scripts will still use your old plugins.
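The overlay rule described above can be sketched in a few lines of C++. This is only an illustration under stated assumptions, not AviSynth+'s actual loader: `PluginDir`, `basename_of` and `resolve_autoload` are hypothetical names, the file names are made up, and directories are modelled as in-memory file lists instead of being scanned on disk.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical model of one autoload directory: a path plus the plugin
// file names found in it (a real loader would scan the disk instead).
struct PluginDir {
    std::string path;
    std::vector<std::string> files;
};

// File name without its extension, e.g. "denoise.dll" -> "denoise".
std::string basename_of(const std::string& file) {
    const std::string::size_type dot = file.rfind('.');
    return dot == std::string::npos ? file : file.substr(0, dot);
}

// Walks the directories in list order and keeps, for each basename, only
// the first file encountered. Returns basename -> full path of the winner.
std::map<std::string, std::string> resolve_autoload(const std::vector<PluginDir>& dirs) {
    std::map<std::string, std::string> loaded;
    for (const PluginDir& dir : dirs) {
        for (const std::string& file : dir.files) {
            // emplace() leaves the map untouched if the basename is already
            // present, which mirrors the "skip if already loaded" rule.
            loaded.emplace(basename_of(file), dir.path + "/" + file);
        }
    }
    return loaded;
}
```

With an overlay folder listed first that contains only `denoise.dll`, and the usual folder containing `denoise.dll` and `resize.dll`, the result would map `denoise` to the overlay copy and `resize` to the usual one.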
By default, Avisynth+'s autoload folder list has four paths in it, in this order:
- PluginDir+ in Software/Avisynth in HKEY_CURRENT_USER
- PluginDir+ in Software/Avisynth in HKEY_LOCAL_MACHINE
- PluginDir2_5 in Software/Avisynth in HKEY_CURRENT_USER
- PluginDir2_5 in Software/Avisynth in HKEY_LOCAL_MACHINE
This means, if there are ever plugins which will only work with Avs+ but not with classic Avs, you can put them into one of the "PluginDir+" folders. AviSynth+ will then use the classic plugins from the normal Avisynth, but if there are versions of some plugins written for AviSynth+, it will use them instead, and the classic avisynth.dll will still not be bothered with them. This is all without you having to lift a finger (except for adding the "PluginDir+" values to the registry once, until we have an installer). So to summarize all this, you have the ability to define a plugin autoload folder in the registry which will only be used by Avs+, but not by Avs, in addition to your classic plugins.
New Functions
However, another new piece of functionality offered by AviSynth+ is that you can now also specify autoload paths in scripts. There are two functions for this:
AddAutoloadDir(string path, bool toFront)
: This will add a new autoload folder. The string parameter is obligatory, it is the folder path where to load from. The second boolean parameter is optional, and if true (default), it will add the path to the front/beginning of the autoloader's list, which means it will be searched earlier than the rest. If it is false, the path will get added to the end of the list, so it will get searched last (unless you again add another one to the end).
ClearAutoloadDirs()
: This will clear all the paths from the autoloader's list. Note that it is NOT a reset to the default state. ClearAutoloadDirs() will clear all folders, so if you don't add new ones after that, you have disabled the autoload functionality. This is, BTW, also a way to disable autoloading for a particular script in AviSynth+.
Here's an important note: You can only call these functions if no plugin has been autoloaded yet. Autoloading happens when the first unknown function is looked up. This means you can only call AddAutoloadDir or ClearAutoloadDirs if you have only made calls to built-in functions up to that point in the script. I suggest you start your scripts with these calls to avoid any problems.
There is only one thing left to discuss: Are there any special directories you can reference from your script? You bet there are:
- SCRIPTDIR is the folder of the most current script. It is the path of the imported script if your script calls import()
- MAINSCRIPTDIR is the folder of your main script, the one where execution started
- PROGRAMDIR is the folder of the executable running the current script
- USER_PLUS_PLUGINS is the string stored in PluginDir+ in Software/Avisynth in HKEY_CURRENT_USER
- MACHINE_PLUS_PLUGINS is the string stored in PluginDir+ in Software/Avisynth in HKEY_LOCAL_MACHINE
- USER_CLASSIC_PLUGINS is the string stored in PluginDir2_5 in Software/Avisynth in HKEY_CURRENT_USER
- MACHINE_CLASSIC_PLUGINS is the string stored in PluginDir2_5 in Software/Avisynth in HKEY_LOCAL_MACHINE
... all these special constants are case-sensitive for now.
Examples
- If you want plugins to be autoloaded from the script's "autoload" directory too, you'd write:
AddAutoloadDir("MAINSCRIPTDIR/autoload")
- If you want plugins to be autoloaded from the script's "autoload" directory, only from there and nowhere else, you'd write:
ClearAutoloadDirs()
AddAutoloadDir("MAINSCRIPTDIR/autoload")
- If you wanted to manually recreate the default state of the autoloading folder list, you'd write:
ClearAutoloadDirs()
AddAutoloadDir("USER_PLUS_PLUGINS", false)
AddAutoloadDir("MACHINE_PLUS_PLUGINS", false)
AddAutoloadDir("USER_CLASSIC_PLUGINS", false)
AddAutoloadDir("MACHINE_CLASSIC_PLUGINS", false)
Notes
- Both Avs and Avs+ already query interface versions. They try to load the 2.6 interface from a plugin first, and if that is not supported, they try to load the 2.5 interface. Avs+ also tries to load the C interface if both of the previous ones fail. In the future, the C interface should probably be prioritized over 2.5.
- In what contexts do MAINSCRIPTDIR and the other 'special' names get replaced with the corresponding folders? In all strings, or only when used in the argument to AddAutoloadDir?
-- Only in AddAutoloadDir(), and even there, only if they are at the very beginning of the string. These get replaced with absolute folder paths, so if they are not at the beginning of the string, replacing them would only result in an invalid path (e.g. you'd end up with "c:" in the middle of your path). - Source
- Avs+ autoloads plugins if any of the following happens:[1]
- AutoloadPlugins() is called
- LoadPlugin() is called
- A yet unknown (non-internal) function is called
- avs_function_exists does not find the external source filter in this case because none of the above happened. So MasterNobody's patch is the right thing to do.
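The prefix-only replacement of special names mentioned in the notes above can be sketched as follows. This is a hedged illustration, not the AviSynth+ source: `expand_special_dir` is a hypothetical helper, and the table of names and folders is supplied by the caller rather than read from the registry or the running script.

```cpp
#include <cassert>
#include <map>
#include <string>

// Replaces a special directory name with its absolute folder, but only when
// the name sits at the very beginning of the path. A name appearing later in
// the string is left untouched, since substituting an absolute path there
// would only produce an invalid path.
std::string expand_special_dir(const std::string& path,
                               const std::map<std::string, std::string>& specials) {
    for (const auto& entry : specials) {
        const std::string& name = entry.first;
        // compare() is case-sensitive, in line with the note that the
        // special constants are case-sensitive for now.
        if (path.compare(0, name.size(), name) == 0) {
            return entry.second + path.substr(name.size());
        }
    }
    return path;  // no special prefix: the path is used as written
}
```

Under these assumptions, `"MAINSCRIPTDIR/autoload"` expands to the main script's folder plus `/autoload`, while `"plugins/MAINSCRIPTDIR"` and `"mainscriptdir/autoload"` are returned unchanged.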
Bugs
- http://forum.doom9.org/showthread.php?p=1672010#post1672010
- http://forum.doom9.org/showpost.php?p=1672010&postcount=678
- http://forum.doom9.org/showthread.php?p=1672333#post1672333
- http://forum.doom9.org/showthread.php?p=1672342#post1672342
GScript
GScript has been used as the starting point for the implementation in Avs+, and changes compared to GScript are not visible to the user (i.e. only important for Avs+ core developers). There are two notable differences between GScript and Avs+:[2]
- In Avs+ there is no need to use GScript(""" and """) to encompass your GScript-specific code parts. The language extensions became native to Avs+ and can be used transparently like classic AviSynth syntax.
- The "return" statement has been slightly changed to not only exit the inner-most code block, but to terminate the whole function (or script), as anybody with even the slightest scripting experience would expect. This is one of the very few incompatible changes compared to classic AviSynth.
MT Notes
- Source: Doom9 Forum
So, how to use MT in AviSynth+? Most of it has been posted earlier actually, but let me summarize it.
By default, your script will run in single-threaded mode, just like with SEt's build. Also, just like in SEt's build, you'll have to make sure that filters use the correct MT mode, or else they might wreak havoc. There are three MT modes (1,2,3), and they are the same modes as in (yeah you guessed correctly) SEt's build. Which means you can use the same modes that you have used with AviSynth-MT.
There are some things though that are different and/or new in AviSynth+. The first difference is *how* you set the MT mode. In AviSynth-MT, you had to use SetMTMode(X), which caused all filters following that line to use mode X (until the next call to SetMTMode()). This meant if you needed to use multiple MT modes, you had to insert all those calls in the middle of your script, littered over many places.
Setting MT modes
AviSynth+ does it differently. In AviSynth+, you specify the MT mode for only specific filters, and those filters will then automatically use their own mode, even if there were other MT modes in between. This means you can specify all the MT modes at the beginning without polluting your script. You can even make a SetMTMode.avsi if you wish and let it autoload for all of your scripts, or import() it from their top. This is much cleaner, and it allows you to maintain all your MT modes centrally in a single place. To make the distinction from AviSynth-MT clear, SetMTMode() is called SetFilterMTMode() in AviSynth+.
Enabling MT
The other difference is how you actually enable multithreading. Calling SetFilterMTMode() is not enough: it sets the MT mode, but the MT mode only has an effect if MT is enabled at all. Note this means you can safely include/import/autoload your SetFilterMTMode() calls in even single-threaded scripts, and they will not be messed up. Uhm, onto the point: You enable MT by placing a single call to Prefetch(X) at the *end* of your script, where X is the number of threads to use.
Example
# This line causes all filters that don't have an MT mode set explicitly to use mode 2 by default.
# Mode 2 is a relatively safe choice as long as you don't know most of your calls to be either mode 1 or 3.
# Compared with mode 1, mode 2 trades memory for MT-safety, but only a select few filters will work with mode 1.
SetFilterMTMode("DEFAULT_MT_MODE", 2)

# FFVideoSource(), like most source filters, needs MT mode 3
SetFilterMTMode("FFVideoSource", 3)

# Now comes your script as usual
FFVideoSource(...)
Trim(...)
MCTemporalDenoise(...)
...

# Enable MT!
Prefetch(4)
Closing notes (don't skip!)
- Remember that MT is only stable as long as you have specified a correct MT mode for all filters.
- Instead of the numbers 1-2-3, you can also use symbolic names for MT modes: MT_NICE_FILTER (1), MT_MULTI_INSTANCE (2), MT_SERIALIZED (3)
- Mode 3 is evil. It is necessary for some filters, and it is usually no problem for source filters, but it can literally completely negate all advantages of MT, if such a filter is placed near the end of your script. Let us know if you meet a non-source mode 3 filter, we might be able to do something about it, but in general, avoid such calls if you want performance. (And of course, insert what you have found into here.)
- The new caches will save you a lot of memory in single-threaded scripts, but due to the way they work, they will also use more memory than before with MT enabled. The memory usage will scale much closer with the number of threads you have. Just something to keep in mind.
- MT-enabled AviSynth+ triggers a latent bug in AvsPmod. Until a new version of AvsPmod is officially released, use this build. A thousand thanks to vdcrim for the fix.
- Using too many threads can easily hurt performance a lot, because there are other bottlenecks too in your PC than just the CPU. For example, if you have a quad-core machine with 8 logical cores, less than 8 threads will often work much better than 8 or more.
Informational links
These links contain bits and pieces on how MT works in Avs+, its correct usage, and other things MT.
- http://forum.doom9.org/showthread.php?p=1658385#post1658385
- http://forum.doom9.org/showthread.php?p=1662222#post1662222
- http://forum.doom9.org/showthread.php?p=1667529#post1667529
- http://forum.doom9.org/showthread.php?p=1667977#post1667977
- http://forum.doom9.org/showthread.php?p=1668266#post1668266
- http://forum.doom9.org/showthread.php?p=1669680#post1669680
- http://forum.doom9.org/showthread.php?p=1670372#post1670372
- http://forum.doom9.org/showthread.php?p=1673093#post1673093
- http://forum.doom9.org/showthread.php?p=1673144#post1673144
- http://forum.doom9.org/showthread.php?p=1682034#post1682034
- http://forum.doom9.org/showpost.php?p=1668101&postcount=92
Help filling MT modes
Due to many problems with Riseup's Etherpad, the MT modes pad has moved.
- You can find the latest revision here: AviSynth+ MT modes. If you would like to contribute, please do so there.
http://pad.riseup.net/p/avs_plus_mt_modes
AviSynth Plugin Writing Tips
#1: Exceptions
- Source: Doom9 Forum
Exceptions thrown from a module should only be caught in the same module. Otherwise you can experience weird and hard-to-debug errors in the plugin. Not adhering to this advice will result in code that can sporadically fail, or work on your computer consistently but fail on other machines.
Unfortunately, avisynth.h contains the AviSynthError class, giving plugin authors the false impression that it is safe to throw and catch these exception objects. It is not. The problem is not in the definition of this class, but in the implicit encouragement to throw C++ exceptions across DLL boundaries. Here are some tips to avoid getting caught in the deepest pits of hell:
- When throwing exceptions on your own, it is best not to use AviSynthError. Not using it will stop you thinking that AviSynthError has some special meaning, or that it can be used to throw to (or to catch from) avisynth.dll.
- Exceptions thrown by you should always be caught inside your plugin. You should not let exceptions propagate outside of your DLL (unless thrown using ThrowError), to AviSynth.
- Errors thrown by AviSynth should not be caught by you. Specifically, don't wrap calls to AviSynth in try-catch blocks, because you cannot rely on it working correctly in every situation. If you need to detect errors, validate user parameters in your plugin, or use other API facilities provided by AviSynth, like IScriptEnvironment->FunctionExists().
- If you want to throw an exception to the user and/or to AviSynth, then only use IScriptEnvironment->ThrowError(). You should not call C++'s "throw" yourself for this purpose (see point 2), and you should not catch the error thrown by ThrowError() yourself (see point 3).
- If you want to catch an exception, want to do something based on that and finally raise an exception to AviSynth, don't rethrow. Catch your own exception (unless thrown by ThrowError), then call ThrowError separately.
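The last tip (catch your own exception, then call ThrowError separately) might look roughly like this. To keep the sketch compilable without avisynth.h, `MockEnv` is a stand-in and NOT the real IScriptEnvironment: the real ThrowError() takes a printf-style format string and does not return, whereas the mock only records the message. `PluginError`, `parse_strength` and `create_filter` are hypothetical names.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Stand-in for the host side (assumption: NOT the real IScriptEnvironment).
// It records the message instead of aborting the call, so the sketch can run.
struct MockEnv {
    std::string last_error;
    void ThrowError(const char* msg) { last_error = msg; }
};

// Plugin-internal exception type, deliberately not AviSynthError.
struct PluginError : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Hypothetical internal helper that may throw inside the plugin.
int parse_strength(int strength) {
    if (strength < 0 || strength > 100)
        throw PluginError("strength must be in 0..100");
    return strength;
}

// Entry point called by the host. The plugin's own exception is caught here,
// inside the DLL, and the error is then reported through ThrowError()
// instead of being rethrown across the module boundary.
int create_filter(int strength, MockEnv* env) {
    std::string message;
    bool failed = false;
    int result = 0;
    try {
        result = parse_strength(strength);
    } catch (const PluginError& e) {
        // Only copy the message here; ThrowError() is called after the handler.
        message = e.what();
        failed = true;
    }
    if (failed) {
        env->ThrowError(message.c_str());
        return -1;  // only reached with this mock
    }
    return result;
}
```

Note that the call into the host is not wrapped in a try-catch of its own: only the plugin's internal code sits inside the try block.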
Ignoring the above tips can still result in a fully working binary, but that is only guaranteed under very specific circumstances, more specifically when you've compiled your plugin with the *exact* same compiler version as the avisynth.dll was compiled with, AND when linking to the CRT runtime dynamically. Given that plugin authors can use whatever compiler they want, and that an avisynth binary can be supplied by any community member, it is unwise to rely on such detail.
These tips apply to all AviSynth versions (e.g. to 2.5 and to 2.6, to "classic" AviSynth and to AviSynth+, etc).
#2: Parallel execution
- Source: Doom9 forum
AviSynth-MT and VapourSynth both support multithreading, and it is being implemented in AviSynth+ too. All of them require the same things from your plugin. Here is a list of what you as a plugin author can do to support execution on multiple threads. If you are the author of any AviSynth plugin, please update your filter according to these rules if needed. Doing so will make sure your plugin can execute seamlessly when multithreaded. Furthermore, following these rules will not only guarantee correct execution in multithreaded environments, it will also provide optimal multithreaded performance.
Short list for those on the run:
- Unless you have the slowest filter in the world, don't start threads in your plugin.
- Never use global or static variables. In addition, your filter class should only have read-only members which are initialized during construction.
- Don't reuse the IScriptEnvironment pointer between method executions.
And again, the same points with a bit more explanation:
- In general, do not slice up your frame and start multiple threads on your own. Threading has its own performance overhead, and it is only worth doing it manually if your filter takes a lot of time to execute. And even if your filter is extremely slow (like fft3dfilter), you should try to optimize its single-threaded performance (by using SIMD instructions or choosing a more efficient algorithm) rather than manually threading it. Optimize for single-threaded performance, and as long as you follow the other rules below, you will get automatic and correct multithreading from AviSynth.
- Do not cache frames yourself. You might think it is efficient because you won't have to request/compute them in the next frame, but you are wrong, and there are several reasons why. First, if you write your own cache, you will have to introduce a global state to your filter, which means you will have to take care of synchronization between multiple threads too, which is not easy to do efficiently with caches. Second, keeping copies of past frames also means there will always be multiple references to them, thus AviSynth cannot pass them as write pointers to other filters, and will have to do an extra copy of it more often. And last but not least, AviSynth has a very extensive caching mechanism, and if you request the same frame multiple times (even when you need it for different frame requests), chances are you will get it for free anyway, so your own caching is just pure overhead.
- As a general extension to the previous rule, try not to keep any state between frames. In the optimal case your filter class should only have read-only members which are initialized during construction. Surely this is not always possible with every algorithm, but most times it is, and this is what you should strive for.
- As stated before, for best multithreading always try to implement algorithms which require no state between frames. Whenever this is violated, be sure to group reads and writes to the state (do not spread them), and guard them in critical sections (as few and as short as possible). For example, it is a good practice to copy all your writable class variables (in a critical section) at the start of each frame into local stack variables, compute the whole frame outside of the critical section (updating the local variables that captured the global state as needed), then write them back together at the end of your frame in another critical section. Do not request an automatic lock around your whole filter from AviSynth, because it will serialize your filter's execution.
- If you have class variables that must be writable in every frame, you will also have to keep in mind that AviSynth does not guarantee that frames will be processed in their natural order. Just a reminder.
- Only store variables in classes and in method stacks. Per-frame heap allocations should be avoided, because they can act as implicit synchronization points between threads. And most importantly, never store anything in static variables or in the (global) namespace scope. Read the previous sentence a few more times.
- Do not store the IScriptEnvironment pointer anywhere yourself (except locally on the stack), and never reuse those pointers outside of the methods where they were supplied to you. Not even between different executions of the same method! There is a reason why you get that pointer separately for each method, which is that it may be different every time, especially in multithreaded scenarios. If you reuse it, the consequences will be different between every implementation, but you can get anything from race conditions to program crashes.
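For filters that cannot avoid state between frames, the "group reads and writes, keep critical sections short" advice above could look like the following sketch. It assumes a made-up filter (`PeakTracker`) whose frame is modelled as a plain byte vector, and it uses `std::mutex` rather than any AviSynth facility. As a design choice of this sketch, the write-back merges with the current value instead of overwriting it, since another thread may have updated the state in the meantime and frames are not guaranteed to arrive in order.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <vector>

// Hypothetical filter state: the largest frame sum seen so far and a frame
// counter. Both are shared between threads, so access is guarded.
class PeakTracker {
public:
    // Processes one "frame". Returns true if its sum exceeds the peak that
    // was recorded when processing of this frame started.
    bool process(const std::vector<std::uint8_t>& frame) {
        std::uint64_t peak_at_start = 0;
        {
            // Critical section 1: all reads of shared state, grouped.
            std::lock_guard<std::mutex> lock(mutex_);
            peak_at_start = peak_;
        }

        // The expensive part works on locals only, with no lock held.
        std::uint64_t sum = 0;
        for (std::uint8_t value : frame) sum += value;

        {
            // Critical section 2: all writes, grouped and merged.
            std::lock_guard<std::mutex> lock(mutex_);
            if (sum > peak_) peak_ = sum;
            ++frames_;
        }
        return sum > peak_at_start;
    }

    std::uint64_t peak() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return peak_;
    }

    std::uint64_t frames() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return frames_;
    }

private:
    mutable std::mutex mutex_;
    std::uint64_t peak_ = 0;
    std::uint64_t frames_ = 0;
};
```

Everything else in such a filter would ideally remain read-only after construction, so that only these two short sections ever contend between threads.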
Choosing your AviSynth header
- Source: Doom9 Forum
So you are writing your own AviSynth plugin (cool!), and obviously one of the first things you have to do in your code is to include the AviSynth header. But which one? With all the different header variants lying around it is easy to get lost if you haven't been following AviSynth's development for a long time. Should you copy the header from another plugin? Should you copy it from the AviSynth64 project to be 64-bits compatible? Should you take the 2.5 header as it is the latest release that is officially stable? Do you need SEt's AviSynth-MT header if you want multithreading compatibility? Do you need separate headers for 32- and 64-bits like most plugins ship it? Should you just take the latest header from the AviSynth 2.6 project? And what about AviSynth+'s header?
Fortunately, no matter how you answer the above questions, there is one (and just one) solution that is easy to implement and fits all needs: Use AviSynth+'s header. And if you'd like to know why, read on.
So let's tackle the above questions.
Should you copy the header from another plugin?
No. Most plugins are older than AviSynth project releases, and so they ship with outdated (and sometimes buggy) headers. Also, some plugins have separate 32- and 64-bit sources, so you still wouldn't know which one to take. And if you are really unlucky, you might stumble on a plugin that was written for AviSynth 2.5, and using that header would be the worst of all your header-related options.
Should you take the 2.5 header as it is the latest release that is officially stable?
No. 2.5 is no more. Most plugins that have originally been written for 2.5 have been already recompiled for 2.6. Don't try to be smart and support both versions, because they are not compatible. 2.6 has been around for many years now, and the existing plugin ecosystem builds extensively around this version. Technically speaking, it is stable. Nobody uses 2.5 any more.
Should you copy it from the AviSynth64 project to be 64-bits compatible?
No. While AviSynth64's header will work perfectly if you want your plugin to *only* run in 64-bit mode, that is most likely not the case. That project isn't maintained any more, and thanks to that the 32-bit part is out of date.
Do you need separate headers for 32- and 64-bits like most plugins ship it?
No. You will see plugins around that have both avisynth.h and avisynth64.h. Same for many applications hosting avisynth.dll. This is because the original AviSynth project never supported 64-bit processing (not even today), so these other projects took the 32-bit header from the latest AviSynth version that was available when they were created, and they took the 64-bit header from the AviSynth64 project. This resulted in an ecosystem where the 64-bit versions didn't see any improvements over the years. On the upside, avisynth64.h stayed stable. On the downside, the 32-bit and 64-bit headers started drifting apart. Nevertheless, a merge of the avisynth.h and avisynth64.h headers is easily possible, which is exactly what AviSynth+ has done. There is no need for two separate headers, it only results in additional code, complexity, and maintenance burden.
Do you need AviSynth-MT's header if you want multithreading compatibility?
No. While properly supporting multithreaded versions does require special coding considerations from plugin writers (see parallel execution), none of those considerations affect the choice of header. There is no API or ABI difference between multi-threaded and single-threaded AviSynth versions. You can perfectly support MT-capable AviSynth versions even if using the header from an AviSynth variant that has no MT-support.
Should you just take the latest header from the AviSynth 2.6 project?
No. This project (sometimes people refer to it as the "original" or "official" AviSynth, though somewhat incorrectly) always has the latest version, but it will do you no good if you want to support 64-bit processing. You cannot compile your plugin using its header in 64-bit mode, which is why people started using avisynth64.h in the first place. Even if it decided to support 64-bit in the future, it wouldn't be compatible with the existing (and pretty large) 64-bit ecosystem anymore, throwing away all the 64-bit plugin and application development that has been done in the past 6 years or so. And as already said, using two separate headers is completely unnecessary and only leads to additional complications down the road.
What about AviSynth+'s header?
The headers of AviSynth+ are up to date in every aspect and provide the greatest possible compatibility. By using AviSynth+'s headers, applications and plugins can cleanly compile and run in 32-bits and 64-bits. It is 100% compatible with the latest 32-bit development on the original AviSynth 2.6 project, while supporting all 64-bit binaries. And of course, you can use it regardless of whether you support multithreading or not. Furthermore and importantly, it is fully compatible with installations of the AviSynth 2.6, AviSynth-MT, AviSynth64, and of course the AviSynth+ projects, so your plugin/application will be able to run on any user's machine.
Writing better AviSynth plugins
By tp7
Lately I’ve been doing a lot of AviSynth-related development, mostly improving older plugins and making them available on x64. I’ll try to give some tips to fellow AviSynth devs, hopefully helping them improve the quality of their plugins and the maintainability of their codebase. Without further ado, let’s begin.
Stop YUY2
Currently there are five types of YUY2 support in AviSynth world:
- Convert YUY2 frame to planar, process it and convert back. This is one of the most common solutions and it was pretty much the only option in avs 2.5. Example: deblock.
- Use a specific YUY2 path but leave it completely unoptimized and possibly broken because well, “no one uses YUY2”. Example: msharpen.
- Convert both YUY2 and planar to some intermediate format and use the same set of routines to process both. Possible example: ttempsmooths (I’m not done reading its code so I might be wrong here, this way still might be used somewhere).
- Support YUY2 only, optimized. Example: layer (avs core).
- Have separate code paths for both YUY2 and planar, both optimized. Examples: some filters in avs core I don’t remember.
Most YUY2 filters fall into one of the first two categories. And actually, if you think about it – none of these options are good. (1, 3) waste memory and time for conversion between planar and interleaved formats, (2) requires you to maintain two code paths but with the assumption that no one will be using the second one (why bother with it at all then?), (4) doesn’t support planar and (5) takes a lot of effort.
So what do? The answer is simple: let it go. In AviSynth 2.6 (and avs+) there’s a new colorspace, called YV16, which is the same YUY2 format except planar. So you can process it with the same planar routines you’re using for YV12, Y8 and YV24. Zero effort on your side. And users? They’ll be calling ConvertToYV16().a_lot_of_filters().ConvertToYUY2() if they have yuy2 source and want to keep it. You don’t waste a lot of memory and time on converting in each filter in between these convert calls and AviSynth built-in YUY2<->YV16 conversion is very fast (optimized up to SSSE3 in avs+ I think). Not supporting YUY2 is, in most cases, better for them too.
This somewhat applies to RGB too except there is no planar RGB format yet. You can losslessly convert it to YV24 and back but it’s kinda hacky. We should probably add some planar RGB to avs+ in the future.
Stop C++
Okay, this might sound a bit strange, considering I’m one of the people who don’t understand why people write C when there’s C++. No, I’m not suggesting you write plain C, but rather that you restrict the C++ things you’re using. There are tons of guides on this question, just look around for some. The most important issue I have with it: stop overusing member functions. They’re terrible – they allow you to use any variable in the same class, dramatically increasing the scope size.
Imagine you see prepare_buffer(src_frame, buffer) inside GetFrame, where src_frame is PVideoFrame and buffer is a raw uint8_t pointer. If prepare_buffer is a free function outside of the class, you can assume that it just takes the frame and writes it to the pre-allocated buffer in some way. If this is a member function, you can’t assume anything. Does it modify any class variables? Does it depend on these class variables having some value? You have no way of knowing this and need to go and inspect the function code. In large codebases it instantly makes the code a lot harder to understand.
Another, although not so important issue, is taking pointers to these functions. I usually template the same function for different instruction sets and store a pointer to it as a class variable, doing dynamic dispatch in the constructor and calling this function through a pointer in GetFrame, and doing this with member functions is a lot harder. Also, having free functions simplifies porting to other codebases and frameservers with only a C API, e.g. VapourSynth. Of course it’s possible to migrate the whole class, but why?
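A minimal sketch of that dispatch pattern, under a few assumptions: the `Isa` tag, `invert_row` and `InvertFilter` are invented for illustration, both template instantiations run the same scalar loop (a real plugin would put intrinsics in the SSE2 one), and the CPU capability is passed in as a bool instead of being queried from the environment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical instruction-set tags.
enum class Isa { C, SSE2 };

// Free function templated on the instruction set. It sees only its
// arguments, so a reader knows it cannot touch any filter state.
template <Isa isa>
void invert_row(std::uint8_t* dst, const std::uint8_t* src, std::size_t width) {
    for (std::size_t x = 0; x < width; ++x)
        dst[x] = static_cast<std::uint8_t>(255 - src[x]);
}

using RowFn = void (*)(std::uint8_t*, const std::uint8_t*, std::size_t);

class InvertFilter {
public:
    // Dispatch happens once, at construction time.
    explicit InvertFilter(bool has_sse2) : process_(&invert_row<Isa::C>) {
        if (has_sse2) process_ = &invert_row<Isa::SSE2>;
    }

    // Per-frame work just calls through the stored pointer.
    void get_frame(std::uint8_t* dst, const std::uint8_t* src, std::size_t width) const {
        process_(dst, src, width);
    }

private:
    RowFn process_;
};
```

Because `invert_row` is a plain function, storing its address is a one-liner, and the same function could be reused from a C-style API without dragging the class along.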
Stop implementing memcpy
This is probably not so relevant for newer plugins but you can see this a lot in older ones. A single routine called memcpy_amd being copypasted across many plugins. The purpose of this was to copy frames faster compared to memcpy and built-in BitBlt methods. Yeah it probably wasn’t such a bad idea some years ago. Does it make sense now? Not at all.
Current memcpy in MS runtime is optimized for SSE2. This is still somewhat slower than the old memcpy_amd routine in some cases but it’s fast enough. Unless copying frames is all your filter is doing, you aren’t gonna notice the difference. And if you do, there’s a better approach – env->BitBlt. This function uses memcpy_amd internally if certain conditions are met and, if they aren’t, falls back to the default memcpy. Of course this is an implementation detail and you should not depend on it, but it’s reasonable to assume BitBlt won’t get any slower in the future.
Unfortunately, BitBlt is not able to use the most efficient memcpy_amd routine every time. For it to be used, passed parameters should meet a simple condition: dst_pitch, src_pitch and width should be equal. Obviously you can just ensure this condition is met on your side if you always want to use the most efficient copying method. Ultim has this crazy idea to define BitBlt as a function that always uses memcpy_amd tier routine, but this idea is quite bad and might not get implemented in avs+ at all.
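To make the pitch condition concrete, here is a hedged sketch of a plane copy that takes a single-call fast path only when both pitches equal the row width, and otherwise copies row by row. `copy_plane` is a hypothetical function written for this page; it is not AviSynth's BitBlt and it uses plain `std::memcpy` in both branches.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Copies `height` rows of `row_size` bytes each. Returns true when the
// whole plane could be copied with one call (no padding between rows).
bool copy_plane(std::uint8_t* dst, int dst_pitch,
                const std::uint8_t* src, int src_pitch,
                int row_size, int height) {
    assert(row_size >= 0 && height >= 0);
    if (dst_pitch == src_pitch && src_pitch == row_size) {
        // Rows are contiguous in both buffers: one big copy is enough.
        std::memcpy(dst, src, static_cast<std::size_t>(row_size) * height);
        return true;
    }
    // Pitches differ from the width: copy each row and skip the padding.
    for (int y = 0; y < height; ++y) {
        std::memcpy(dst, src, static_cast<std::size_t>(row_size));
        dst += dst_pitch;
        src += src_pitch;
    }
    return false;
}
```

A caller who always wants the first branch would have to allocate or crop its buffers so that all three values match.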
Stop copypasting code for different planes
DegrainMedian seems to be the most severe example of this: here’s a part of its GetFrame method, defined in degrainmedian.cpp. You can see that the very same code with some changes is copypasted for three planes. The same kind of code with minimal changes is copypasted for progressive routines. Makes you wonder what kind of programmers write the plugins you use daily.
What’s a better way to do this? First of all, you can process all planes in a loop. Generic version looks somewhat like this:
const static int planes[] = { PLANAR_Y, PLANAR_U, PLANAR_V };
for (int pid = 0; pid < (vi.IsY8() ? 1 : 3); pid++) {
    int plane = planes[pid];
    int width = dst->GetRowSize(plane);
    //more code
}
This handles all existing planar colorspaces, which can have either one or three planes (won’t work for planar RGB32 which we might add in the future though). Inside the loop body, variable plane will have the value of the current plane, e.g. PLANAR_Y, so you can use it in calls to AviSynth API as usual.
One more thing about that DegrainMedian code – there’s no point in doing dispatching at every frame in a huge if-else block. Since parameters don’t change during processing, you can either do it once in the constructor using the same if-else logic, or use a simple lookup table of processors, selecting one automatically based on the provided parameters (which can be used as array indexes). This improves readability a lot, making your code more declarative. You can check Fog’s C++ guide for some additional info on dispatching (and a lot of other very useful things).
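A lookup table of processors could look roughly like the sketch below. Everything in it is hypothetical and not taken from DegrainMedian: the `mode` and `interlaced` parameters, the `process_*` functions and the `Filter` class are illustrative names only, and the processors work on a single value to keep the example short. The point is that selection happens once in the constructor and the per-frame path is a single indirect call.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical processor signature; real ones would take plane pointers.
using Processor = int (*)(int pixel);

// Illustrative processors, one per parameter combination.
static int process_mode0_progressive(int p) { return p; }
static int process_mode0_interlaced(int p)  { return p + 1; }
static int process_mode1_progressive(int p) { return p * 2; }
static int process_mode1_interlaced(int p)  { return p * 2 + 1; }

// Indexed as [mode][interlaced], so the parameters are the array indexes.
static const Processor kProcessors[2][2] = {
    { process_mode0_progressive, process_mode0_interlaced },
    { process_mode1_progressive, process_mode1_interlaced },
};

class Filter {
public:
    // Dispatch once at construction, since parameters do not change later.
    Filter(int mode, bool interlaced) : processor_(nullptr) {
        if (mode < 0 || mode > 1) {
            throw std::invalid_argument("mode must be 0 or 1");
        }
        processor_ = kProcessors[mode][interlaced ? 1 : 0];
    }

    // The per-frame path contains no if-else chain.
    int get_pixel(int pixel) const {
        assert(processor_ != nullptr);
        return processor_(pixel);
    }

private:
    Processor processor_;
};
```

Adding a new mode then means adding a row to the table instead of another branch in GetFrame.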
Don’t be afraid of alloca
Yes, I know this is a bad programming practice, but well, most of video processing is one huge bad programming practice. The point of alloca is to do practically free memory allocations on the stack, in cases where you’d usually use a static array but you don’t know the size in advance and you don’t want to waste memory by preallocating a static array with the maximum allowed size. Example usage: Average plugin. In this case you could replace it with a preallocated buffer but there are cases when it’s harder to do – for example, SangNom2 uses it to store a line buffer (one full line of video, which is smaller than the stack frame size for any reasonable resolution). Using stack allocation you can avoid costly memory allocation with new/delete on every frame while keeping your GetFrame re-entrant.
But there are some pitfalls with alloca. First, you should never use it after the function it was created in returns as the pointer won’t be valid anymore. You also have to call destructors yourself because there’s no delete[] call. And you have to zero allocated memory if you ever store PVideoFrame in it because when you write something like
memory[i] = child->GetFrame(n, env);
destructor of the frame in memory[i] will be called. Inside this destructor, PVideoFrame tries to call VideoFrame’s Release method if the pointer to it is not null, which will fail because said pointer points to garbage instead of a real VideoFrame. Calling memset on this memory prevents this. Also you should never make any assumptions on alloca alignment, so it’s probably better to avoid using aligned loads in SIMD when working with it (vc110 seems to return at least 16-byte aligned memory though).
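The pitfalls above can be sketched in a self-contained way. `Frame` and `FramePtr` are hypothetical stand-ins for VideoFrame and PVideoFrame, not the real classes, and `STACK_ALLOC` is a local macro that assumes `_alloca` on MSVC and `alloca` elsewhere. Note one deliberate swap: instead of the memset described above, the sketch default-constructs each slot with placement new. The effect is the same (the internal pointer starts out null before the first assignment), but it does not rely on the object representation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>
#if defined(_MSC_VER)
#include <malloc.h>
#define STACK_ALLOC(n) _alloca(n)
#else
#include <alloca.h>
#define STACK_ALLOC(n) alloca(n)
#endif

// Hypothetical reference-counted frame (stand-in for VideoFrame).
struct Frame {
    int refcount = 0;
    int value = 0;
};

// Hypothetical smart pointer (stand-in for PVideoFrame).
class FramePtr {
public:
    FramePtr() : p_(nullptr) {}
    explicit FramePtr(Frame* p) : p_(p) { acquire(); }
    FramePtr(const FramePtr& o) : p_(o.p_) { acquire(); }
    FramePtr& operator=(const FramePtr& o) {
        if (o.p_) ++o.p_->refcount;  // take the new reference first
        release();                   // drop the old one; breaks if p_ is garbage
        p_ = o.p_;
        return *this;
    }
    ~FramePtr() { release(); }
    int value() const { return p_->value; }
private:
    void acquire() { if (p_) ++p_->refcount; }
    void release() { if (p_) { --p_->refcount; p_ = nullptr; } }
    Frame* p_;
};

// Sums `count` frames using a stack buffer of smart pointers.
int sum_frames(Frame* frames, int count) {
    assert(count >= 0);
    // The buffer is only valid until this function returns.
    void* raw = STACK_ALLOC(sizeof(FramePtr) * static_cast<std::size_t>(count));
    // Do not assume more alignment than the element type needs.
    assert(reinterpret_cast<std::uintptr_t>(raw) % alignof(FramePtr) == 0);
    FramePtr* buf = static_cast<FramePtr*>(raw);
    for (int i = 0; i < count; ++i) {
        new (&buf[i]) FramePtr();  // every slot starts as a null pointer
    }
    int sum = 0;
    for (int i = 0; i < count; ++i) {
        buf[i] = FramePtr(&frames[i]);  // releases the (null) old value first
        sum += buf[i].value();
    }
    // No delete[] runs for stack memory, so call the destructors by hand.
    for (int i = 0; i < count; ++i) {
        buf[i].~FramePtr();
    }
    return sum;
}
```

After `sum_frames` returns, every `Frame` should be back at a reference count of zero, which is the property the manual destructor calls are there to guarantee.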
In general you should prefer static arrays over alloca just because it’s a bit simpler and also handles construction/destruction problems automatically. I’m probably overusing it a bit. Still, I consider using it a better practice than doing a heap allocation on every frame.
This is it. There are more suggestions left like “stop using VC6”, “stop static linking” and “drop MMX” but those are obvious anyway.
Useful links
- Intel Intrinsics Guide - originally this guide was hosted at http://asm.avs-plus.net/ using the Web GUI for Intel Intrinsics Guide.
Changelog
5th February 2014
- Source: Doom9 Forum
But I bring you now a shiny new test build, r1689, which - in contrast to the previous one - is actually usable. Probably even much better than that. So I strongly suggest everyone to give it a shot, aside that I can probably improve on the thread scheduler for some more performance, you'll have fun with it (in a good way). Just make sure you set the correct MT mode for your filters. Here is a snippet that you can start with, but please add some new filters to that list on your own too. Don't bother with built-in filters though, they're already handled internally, so you only need to add filters from external plugins.
The MT branch is now also the main branch of AviSynth+, which means all (even non-MT) improvements end up here, and it will be merged into "master" as soon as it has received enough testing. But now you might wonder what are the "other" user-visible changes compared to the stable release. Mainly:
- innocenat has worked more on the resizers, which are now even faster and require less memory, especially (but not only) when working on planar video and you have SSE3.
- This issue, which sometimes caused a filter function from the wrong DLL to be used, is fixed. Thanks for reporting the issue, real.finder.
- If a plugin DLL cannot be loaded, a human-readable error from Windows is also displayed, giving the user a clue what is wrong.
- Filters that reserve memory on construction can now share that same piece of memory between multiple instances, giving large memory savings in many cases. Most affected internal filters have been updated to make use of this capability, most by tp7. Some external filters will follow when the API is officially stable.
- The new caching system is done, and it will result in noticeably lower memory usage than any previous "classical" AviSynth version. Give it a try, you'll be surprised how much memory it saves in complex scripts.
- SSE2 had become a requirement for Avs+ by mistake. This is fixed now, and you only need an SSE-capable machine.
Of course all the above is paired with a lot of rewrites, refactorings, and cleanups. And then there's MT.
2nd January 2014
- Source: Doom9 Forum
This is the fourth bugfix release in the current stable series, bringing you:
- Reduced memory consumption in resizers.
- The fix for this crash.
- A fix for correct detection of AVX capability.
- A small "for"-loop change for compatibility with Gavino's original version.
- A fix for bad alignment in the crop filter.
- A compatibility fix in BitBlt.
- And a fix for a bug that resulted in autoloaded functions sometimes taking precedence over a non-autoloaded version.
Okay, so much for the bugfixes. Boring but at least useful. If you want something cool though, try out and help me test the new caching system in this experimental build. Make sure you rename the file to "avisynth.dll" if you try it out.
The experimental build is mostly the same as the just released r1576, except that it has the new caches, so if you're doing comparisons, please compare the experimental build to the r1576 release in this post. The new caches have been written from scratch with MT in mind, and although MT is not yet active in this build, the new caches will (or should) provide similar performance to the stable release, but with significantly reduced memory consumption. Let me know your experiences. The sooner I can deem the new caches "good enough", the sooner we'll see MT.
8th December 2013
- Source: Doom9 Forum
A second bugfix release is available. Besides proudly wielding the version number "0.1 (r1555)", the most important changes are:
- A fix for the autoload issue reported here.
- A fix for TemporalSoften which potentially resulted in a crash.
- A fix for some filters not loading under specific circumstances. Discovered on WriteFileStart.
- A fix for the "return" script statement not returning from the current function if used inside if/while/for etc.
This release is a nice opportunity for you to try out AviSynth+ if you didn't already. Unless some major issue pops up, the next release will bring larger changes.
There is a slight change in behavior in r1555, made necessary by the fix for the autoload issue. Previously, plugin autoloading started automatically if forced by the AutoloadPlugins() function, or if an unknown (=external) function was found. Beginning with this release there is also a third condition: autoloading will also happen if any LoadPlugin() is issued, and it will happen right before the LoadPlugin() is executed. This was necessary to preserve compatibility with scripts for classic AviSynth.
Note: The AutoloadPlugins() function simply forces autoloading at the point it is called. Not useful for scripts, it was meant to support editors like AvsPmod. [3]
26th November 2013
- Source: Doom9 Forum
Hello folks, here is the bugfix we promised to you. Issues with the installer are hopefully fixed, and the plugin loader got two patches too for things that have been reported. This had to stay off the fix-list though, I didn't even get to look at it due to lack of time (but I have a pretty good guess what is going on). I will hand in a fix for that another day, shortly. Until then, enjoy line0's updated installer look and icons.
Oh I almost forgot, there's a zip-release too.
24th November 2013
- Source: Doom9 Forum
A new release
Yes we have a new release and a cool one that brings full 64-bit functionality. But here's the semi-detailed changelog.
- A small number of bugs that were regressions compared to classic AviSynth have been fixed. This includes a bitblt copy error, and a script evaluation error that made it necessary to explicitly give the "last" clip as input to some rare filters. It was discovered on animate(), but it might not have been the only one. The 64-bit version also had non-working versions of Amplify(DB), Normalize, and MixAudio, which should be fixed now.
- tp7 and innocenat have finished porting all built-in filters to compiler intrinsics. This is a truly great accomplishment not only because it gives us a fully working 64-bit version, but the previous assembly code (which is now gone) was a large obstacle in reaching linux/osx compatibility too. We are still not cross-platform, but their work has brought us a large step closer. Not to mention it also allowed us to get rid of the SoftWire library, which brought down the binary size by 50% (though you might not see this if you compare it to UPX'd versions). I should probably also underline how much work this has been for them, they updated like 19,000 lines of code!
- As yet another consequence of tp7's and innocenat's work, the speed of many internal filters has greatly increased, in some cases 150% or more.
- There is now a shiny new installer. After qyot27's installer update for AviSynth+, line0 brought it another step further and has rewritten the old AviSynth installer from scratch. Compared to the old installer, you not only get a nicer graphical look, but also comprehensive migration options from classic AviSynth too, as well as unified x86/x64 support. Line0 is also working on high-res icons. This is still work in progress, but you can see preliminary results in the installer's icon.
Last but not least, we give special thanks to a random stranger* (see EDIT) who has pioneered in introducing Pig Latin translations to the software world. His work is unfortunately not yet included due to purely technical reasons, but I'm sure that many will appreciate his contribution when we finally do, especially native speakers of Pig Latin.
Homepage and IRC
Also kind of important news is that AviSynth+ now has a homepage, reachable under avs-plus.net. It is hosted by GitHub and is a bit minimalistic right now, but for sure a better landing page than GitHub's repository dump. There is also a new #avs-plus channel on Rizon for all IRC lovers, and in addition to this forum, you are welcome to influence development of AviSynth+ there too.
A few notes on the porting effort, ASM and future plans
By tp7
- A lot of ASM in the core was quite terrible. We actually had to remove HorizontalReduceBy2 YUY2 ISSE implementation because it was slower than the C code. There were some quite good MMX routines though (SSE2 was awful everywhere but resizers). Resizers were good.
- As mentioned in the first pull request, the general rule was "not slower than original on Nehalem+ CPUs". We did not test on any older CPUs. Expect performance to get a bit worse on Pentiums and I'm not sure about some memory-bound filters on Core 2. Please report if you experience a noticeable performance drop in the core filters on Core 2 level CPUs. We will not be spending a lot of time optimizing for pre-Nehalem CPUs though.
- All filters now have C versions so you can run them on super ancient CPUs. It will also help non-x86 platform support.
- All filters now have SSE2 versions. This for example means up to two times faster TemporalSoften. Some also got SSSE3 and SSE4.1 optimizations. You can find which ones in the commit messages of the pull requests.
- There are some behavior changes: TemporalSoften mode 1 is removed, mode parameter is simply ignored. Blur MMX parameter and Tweak SSE parameter are also ignored.
- MMX optimization routines are dropped if there is a faster ISSE version. This affects only a few filters and some extremely old CPUs.
- Code from FTurn is now integrated into the core (with some additional optimizations and new RGB32 routines), making the plugin obsolete.
- We did not port MMX code of any audio filters. We won't do this any time soon, feel free to contribute.
- Resizers are implemented as VerticalResizer().Transpose().VerticalResizer().Transpose() instead of two separate routines for vertical and horizontal resizing. Some rounding differences are possible, although not noticeable. This improves performance in most test cases and simplifies implementation quite a bit.
YUY2 resizer is also implemented as ConvertToYV16().Resize().ConvertToYUY2(). This does not affect performance in any way on the CPUs we were working on. Conversion is lossless and extremely fast.
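The transpose-based resizer structure can be sketched as follows. This is an illustration only and not the core code: `Image`, `transpose`, `resize_vertical` and `resize` are hypothetical names, samples are floats without pitch padding, and linear interpolation stands in for the real resampling kernels. It shows how one vertical routine plus a transpose is enough to resize in both directions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal image type: row-major float samples, no padding.
struct Image {
    int width;
    int height;
    std::vector<float> data;
    float at(int x, int y) const {
        return data[static_cast<std::size_t>(y) * width + x];
    }
};

// Swaps rows and columns.
Image transpose(const Image& src) {
    Image dst{src.height, src.width, std::vector<float>(src.data.size())};
    for (int y = 0; y < src.height; ++y)
        for (int x = 0; x < src.width; ++x)
            dst.data[static_cast<std::size_t>(x) * dst.width + y] = src.at(x, y);
    return dst;
}

// Vertical-only resize using linear interpolation between neighbouring rows.
// Assumption: this simple kernel replaces the configurable resampling filters.
Image resize_vertical(const Image& src, int new_height) {
    assert(src.height > 0 && new_height > 0);
    Image dst{src.width, new_height,
              std::vector<float>(static_cast<std::size_t>(src.width) * new_height)};
    for (int y = 0; y < new_height; ++y) {
        // Map the output row to a source position; first and last rows line up.
        float pos = (new_height > 1)
            ? static_cast<float>(y) * (src.height - 1) / (new_height - 1)
            : 0.0f;
        int y0 = static_cast<int>(pos);
        int y1 = (y0 + 1 < src.height) ? y0 + 1 : y0;
        float t = pos - static_cast<float>(y0);
        for (int x = 0; x < src.width; ++x)
            dst.data[static_cast<std::size_t>(y) * dst.width + x] =
                (1.0f - t) * src.at(x, y0) + t * src.at(x, y1);
    }
    return dst;
}

// Full resize: vertical pass, transpose, vertical pass again, transpose back.
Image resize(const Image& src, int new_width, int new_height) {
    Image v = resize_vertical(src, new_height);          // height is final
    Image h = resize_vertical(transpose(v), new_width);  // width is final, transposed
    return transpose(h);
}
```

Only `resize_vertical` needs optimizing in this layout, which is presumably where the simplification mentioned above comes from.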
15th October 2013
I have just updated the online repository and the binaries in the first post with a new version, compiled fresh today. There are some goodies here, so let's see:
- First, Gavino's scripting extensions have been integrated. Be sure to say thanks to Gavino for his work again. I've only made minor modifications to his patches, like properly handling empty "if" blocks or missing optional "else" parts, and taking the "last" variable from before the new "if", "while", and "for" statements better into account. I've also added a "break" statement that will allow you to jump out of any loop without reaching the terminating condition. I've counted the votes for the "for"-style carefully, and your votes turned out evenly split, so in the end I picked Gavino's original simplified style. You can still write any kind of sophisticated loop using "while".
- I added back proper AvsPmod support, that got temporarily removed earlier 'coz of the plugin system's rewrite. Built-in functions will now work as before, but getting autoloaded functions to show up in AvsPmod will need a slight modification to AvsPmod. This is unfortunately necessary, because autoloading has to be delayed to support adding search folders in scripts. As soon as AvsPmod gets modified you'll have full functionality back again. The needed modifications won't interfere with traditional AviSynth.
- The C interface is now probed before the 2.5 plugin interface, making ffms2 work again even if you're not using qyot27's latest build.
- The "crop" function now defaults to aligned crop. You can still control alignment using its second parameter, but if you omit it the default is now for the new frame to be aligned. This is important for plugin authors so that they can have a stronger alignment guarantee, in the end leading to faster processing in multiple plugins.
- And as always, there are cleanups and refactors, in an ongoing effort to make the sources higher quality. Not as many as in previous releases, but still. Of course more to come in the future.
64-bit support
There's actually one more feature item missing from the above list. The archive in the first post includes both 32-bit and 64-bit builds. All ASM from the core got replaced with C-code or intrinsics, the inline ASM in filters got sandwiched in ifdefs, and I fixed up any remaining issues that prevented the core from working correctly. But before rushing to download to run your uberscript in 64-bits, please note that no porting of the builtin filters has been done yet, making that build hardly usable. You'll find that many essential filters are missing, like resizers and color space conversions, just to name a few. So trust me, as a user you are most likely better off using the 32-bit build for now. But the 64-bit is there for the adventurous, for motivated testers and developers, and for all those who are wishing to help port the missing functionality to 64-bit. Besides, the new 64-bit build is compatible with existing 64-bit plugins found here, here and here, so you might actually be able to use the 64-bit version for something if you don't rely much on internal filters.
So, there is a working 64-bit build, though fairly gutted out. If you know some intrinsics, please consider helping out with the port, even if with only one or two routines. Even without knowledge of intrinsics or ASM, if you can rewrite some algorithms that were only available in ASM before into plain C, that would already be a lot of help. Feel free to start anywhere you like, and rest assured, I'm also continuing my work on AviSynth+.
1st October 2013
Yuhuuuu, new build, and new code on GitHub!
So, what has changed:
- First of all, the crash-on-out-of-memory bug is hopefully fixed. Should be.
- There is a brand new plugin-system in place, and if you work with scripts that use a lot of plugins, you should notice that they load faster.
- You can have multiple plugin directories. Exact semantics in my next post.
- LoadCPlugin (or Load_Stdcall_Plugin) is now a synonym for LoadPlugin. LoadPlugin will load C-plugins and LoadCPlugin will load normal plugins. They are one and the same. No difference.
- Hence, C plugins are also autoloaded.
- LoadVFAPIPlugin() is out of order for now. I'm not planning on removing it, I just need some info how to correct it.
Changes noteworthy for developers:
- Invoke finally stops throwing exceptions as a "normal condition" -> better debuggability
- VFAPI and VirtualDub filter loading are now separated into their own plugins, and are not in core any more.
AviSynth+ x64 plugins
All listed plugins are the latest version unless stated otherwise.
Name | Version | Download | Comments |
---|---|---|---|
AddGrainC | 1.7.1 | AddGrainC-1.7.1.7z | Compiled with Microsoft Visual Studio C++ 2012. AddGrain v1.7.0 compiled with Intel Parallel Studio XE 2015 Composer Edition for C++: AddGrainC_1.7.0_x64.zip |
AutoAdjust | 2.5 | AutoAdjust-v2.50.7z | |
AutoCrop | 1.2 | autocrop_3-14-2010.rar | Compiled by Joshy D, source code not available |
Average | 0.92 | Average-x64.zip | Compiled with Microsoft Visual Studio C++ 2012. |
aWarpSharp2 | 20120328 | aWarpSharp_20120328_x64.zip | Compiled with Intel Parallel Studio XE 2015 Composer Edition for C++ |
BassAudio | 2.4 | BassAudio_x64.7z - source | Compiled by yo4kazu - BASS audio library for Win64 |
Checkmate | 0.9 | checkmate-x64.zip | Compiled with Microsoft Visual Studio C++ 2012. |
CLExpr | 0.91 | CLExpr-x64.zip | Compiled with Microsoft Visual Studio C++ 2013. |
CombMask | g2ec6679 | CombMask-g2ec6679-x64.7z | |
DeBlock | 0.9 | Deblock-x64.zip | Compiled with Microsoft Visual Studio C++ 2012. |
Decomb | 5.2.4 | decomb_5.2.4_x64.zip | Compiled with Intel Parallel Studio XE 2015 Composer Edition for C++ |
DeGrainMedian | 0.8.2 | DeGrainMedian64.zip - source | Compiled by squid_80 |
Delogo | 0.05a | delogo_avs+.zip | Compiled with Intel Parallel Studio XE 2015 Composer Edition for C++ |
dfttest | 1.9.4 | dfttest-1.9.4_x64.zip | Compiled with Intel Parallel Studio XE 2015 Composer Edition for C++ |
DGDecIM | b50 | dgdecim_b50.zip | Requires license from DGDecNV |
DGDecNV | | | Requires license. |
DGMPGDec | 1.5.8 | DGDecode_3-19-2010.rar | Compiled by Joshy D, some IDCT modes are missing. |
Dither | 1.26.5 | dither-1.26.5.zip | Compiled with Microsoft Visual Studio C++ 2012. |
DSS2mod | 2.0.0.13 | avss_x64.zip | |
eedi3 | 0.9.1 | eedi3_64.dll | The latest version is 0.9.2! |
ExactDedup | 0.03 | ExactDedup+Version+0.03.zip | |
f3kdb | 1.5.1 | flash3kyuu_deband_1.5.1_x64.7z | v2.0 prerelease (b98d6bc x86/x64): f3kdb-b98d6bc.rar - compiled with Intel C++ Compiler 2013. f3kdb-rev410.7z - compiled with Microsoft Visual Studio C++ 2013. |
FFmpegSource | 2.20 | FFMS2 | |
FFT3DFilter | 2.1.1 | FFT3DFilter_3-12-2010.rar | Needs the 64-bit libfftw3f-3.dll to be in your System32 directory. Compiled by Joshy D. |
FFT3DGPU | 0.8.2 | FFT3DGPU_3-15-2010.rar | The HLSL (shader program) file is edited from the original to adhere to pixel shader 3.0 syntax rules. Please make sure to place the correct file in the same directory as the 64bit plugin. Compiled by Joshy D. |
FluxSmooth | 2nd December 2010 | FluxSmooth SSE DLLs.7z | Discussion thread |
FRIMSource | 1.25 | FRIMSource64.dll | |
Fusion | 5th March 2013 | fusionx64.zip | |
hqdn3d | 0.11 | hqdn3d_4-08-2010.rar | Compiled by Joshy D, source code is not available. |
IT_YV12 | 0103_width8K | IT_YV12_0103_width8K.zip | |
Its | 0.8.6 | | Compiled by putin999 |
JincResize | r44 | jincresize_r44.zip | Compiled with Intel Parallel Studio XE 2015 Composer Edition for C++ |
LSMASHSource | | L-SMASH-Works | Compiled with Microsoft Visual Studio C++ 2013. |
MaskTools2 | b2 | masktools2-x64.zip | Compiled with Microsoft Visual Studio C++ 2012. |
MedianBlur2 | 0.94 | MedianBlur2-x64.zip | Compiled with Microsoft Visual Studio C++ 2012. |
MipSmooth | 1.1.2 | MipSmooth64.zip - source | Compiled by squid_80 - discussion thread |
MosquitoNR | 0.10 | MosquitoNR_0.10_x64.zip | Compiled with Intel Parallel Studio XE 2015 Composer Edition for C++ |
MSharpen | 0.9 | msharpen-x64.zip | Compiled with Microsoft Visual Studio C++ 2012. |
MVTools | 2.6.0.5 | mvtools_2.6.0.5_x64.zip | Compiled with Intel Parallel Studio XE 2015 Composer Edition for C++ |
NicAudio | 2.0.5 | NicAudio2.0.5_x64.zip | Latest version is 2.0.6 |
nnedi3 | 0.9.4.9 | NNEDI3_v0_9_4_9.7z | Compiled by jpsdr, discussion thread. Original nnedi3 v0.9.4 compiled with Intel Parallel Studio XE 2015 Composer Edition for C++: nnedi3_0.9.4_x64.zip |
RawSource26 | g3c7da5a | RawSource26-g3c7da5a-x64.7z | |
RemoveGrainHD | 0.5 | RemoveGrainHD_0.5_x64_bin.zip | |
ResampleHQ | v6 | ResampleHQ-v6.zip | |
RgTools | 0.92.1 | RgTools-x64.zip | Compiled with Microsoft Visual Studio C++ 2012. |
SangNom2 | 0.35 | SangNom2-x64.zip | Compiled with Microsoft Visual Studio C++ 2012. |
SCXvidMask | 1.0 | SCXvidMask-x64.zip | Compiled with Microsoft Visual Studio C++ 2012. |
SmoothAdjust | 3.0 | SmoothAdjust-v3.00.7z | |
SmoothD | 0.0.9pre2 | SmoothD_x64.zip | Compiled with Intel Parallel Studio XE 2015 Composer Edition for C++ |
SmoothD2 | a3 | SmoothD2-a3_x64.zip | Compiled with Intel Parallel Studio XE 2015 Composer Edition for C++ |
TCannyMod | 0.1.1 | TCannyMod_x64.zip | Compiled with Intel Parallel Studio XE 2015 Composer Edition for C++ |
TColorMask | 1.2 | tcolormask-x64.zip | Compiled with Microsoft Visual Studio C++ 2012. |
TEMmod | 0.2.0 | TEMmod-gef1cacc-x64.7z | |
TIVTC | 1.0.5 | TIVTC_3-13-2010.rar | Compiled by Joshy D |
TMaskCleaner | 0.91 | tmaskcleaner-x64.zip | Compiled with Microsoft Visual Studio C++ 2012. |
TimeLapseDF | 1.0 | TimeLapseDF64.dll | |
TNLMeans | 1.0.3 | TNLMeans_3-20-2010.rar | Compiled by Joshy D, source code not available. |
TTempSmooth | 0.9.4 | TTempSmooth_3-20-2010.rar | Compiled by Joshy D, source code not available. |
xy-VSFilter | 3.0.0.306 | xy-VSFilter_3.0.0.306_x64.zip | |
yadif | 1.7 | yadif_1.7_x64_asm.zip | Compiled with Intel Parallel Studio XE 2015 Composer Edition for C++ |
yadifmod | 1.0 | yadifmod_x64.zip | Compiled with Intel Parallel Studio XE 2015 Composer Edition for C++ |
VariableBlur | 0.5 | VariableBlur05_x64.7z - source | Compiled by yo4kazu - Note: this version is outdated, v0.7 is the latest version. |
Vinverse | 0.9 | vinverse-x64.zip | Compiled with Microsoft Visual Studio C++ 2012. |
VSFilterMod | r90 | VSFilterMod64.dll | Note: this version is outdated, r111 is the latest version. |
WarpSharp | 2008 | warpsharp64.zip | |
Zoom | 20140216 | Zoom.7z |
More 64-bit filters can be found on the following sites, but be aware that some of the plugins listed are outdated.