NeuralNet

Author V. C. Mohan
Date Nov 12, 2005

The user must be familiar with Artificial Neural Networks for using this plugin efficiently

NeuralNet has:
A) classic 3 layer classification type neural network with only one output node.In the two functions included the optimization is done differently:
1) NeuralNetBP by usual back propogation
2) NeuralNetRP Resilient Propogation or RPROP (Reidmiller and Braun)
The output of these two functions is either yes (255 ) or No (0)
B)NeuralNetLN :Linear network with a single layer and weights optimized by RPROP method. The output of this is limited to be between 0 and 255.

The plugin has been tested with:
A)NeuralNetBP and NeuralNetRP: greyscale image as input and a corresponding edge detected image wherein edges (above a threshold) were marked as white(255) and non edges as black(0). Training was done using a representative window of this frame. The solution was tested on full frame .
B) NeuralNetLN : An image corrupted by regular freq interference and its 'VFanFilter'-ed output set. Note for high starting weights, results may be inferior.

RP converges faster than BP and so requires fewer iterations. However since the minimization criteria are different, the results can be different. In certain situations one method gives better results than other.

The start frame of input clip must be the frame to be used for training. The training clip must have the processed result corresponding to the start frame of input clip.

Training need to be done in a small window (or along a line) having fully representative cases and having full image amplitude range, as otherwise strange unexpected results may be seen. Time for training Depends upon window size, number of nodes, iterations and 'bestof' value and may take a few seconds to several minutes.

After training, the optimized weights are used for processing all input clip frames. NeuralNet RP and BP outputs all values above a threshold as white, and rest as black.

Facility to monitor error or other diagnostic parameters during training is provided. First output frame will always be a frame with diagnostic plot (and for classification type a histogram of the processed first frame). The diagnostic plot horizontal scale is number of iteration and vertical is scaled parameter value. For Histogram horizontal scale is %age of image y value. The window outline or line used for training is also shown on this plot

For BP and RP if 'test' is true, the input start frame 'sf' is repeatedly processed with NeuralNet solution and the output displayed using threshold values in 'ef' steps from 0 to 100 % with each frame step. If 'ef' is 50 then 50 steps will be used or 2% increase at each frame step occurs

In certain cases after some iterations the error may reach a local minimum, or may start to increase with iteration. Sometimes retraining with a different set of starting weights may correct this problem. 'Bestof' parameter processes with different weight sets and saves the weights which resulted in least error for later use. However this multiplies training time. 'wset' parameter skips weight sets

This plugin works in YUY2, YV12, RGB32 and RGB24 color spaces. Only Red or Luma Y channel is processed.

The input pattern is considered as a grid of 'xpts' * 'ypts'. The output is the value for the center of this grid on the training clip. The input grid is then moved by one pixel and compared with the corresponding output. The training line or window size is to be selected to have sufficient number of cases and so positioned that it encompasses all representative cases.

Incase the training clip was obtained with a grid of x by y, then specifying values other than x and y for xpts and ypts to the network may not be good idea as extraneous data will only add noise and the network may not converge.

Most of the parameters for the three functions are same. While all(except clips) have default values some parameter (training window, hnodes, weight) values need to be specified for better results. Also parameter names may be invariably used for specifying

Parameter Description:-

clip : clip. Clip to be processed No default

clip : clip. Training clip. No default

sf : integer: starting frame number to process on input clip. Default 0

ef : integer: end frame number to process on input clip. Default last frame

tf : integer: frame number on training clip that has proceessed results for 'sf'. default 0

lx : integer: at start frame process limited to a window with left x coordinate. Default 0;

ty : integer: at start frame process limited to a window with top y coordinate. Default 0;

rx : integer: at start frame process limited to a window with right x coordinate. Default frame width - 1;

by : integer: at start frame process limited to a window with bottom y coordinate. Default frame height -1;

elx : integer: at end frame process limited to a window with left x coordinate. Default lx;

ety : integer: at end frame process limited to a window with top y coordinate. Default ty;

erx : integer: at end frame process limited to a window with right x coordinate. Default rx;

eby : integer: at end frame process limited to a window with bottom y coordinate. Default by;

line :boolean: true for training patterns along a line. False for training patterns are in a rectangular window. Default true (line)

tlx : integer: training patterns line / window with left x coordinate. Default lx;

tty : integer: training patterns line / window with top y coordinate. Default ty;

trx : integer: training patterns line / window with right x coordinate. Default rx;

tby : integer: training patterns line / window with bottom y coordinate. Default by;

white: boolean : true if edges are marked white on training clip or false if edges are marked black. Default true (white). NeuralNetLN does not have this parameter.

xpts: integer: Odd number. number of x values in each pattern. default 9. Max xpts X ypts 121

ypts: integer: Odd number : number of y values in each pattern. default 9. Max xpts X ypts 121

hnodes: integer : number of nodes in hidden layer. default sqrt(xnodes X ynodes) 3 to 10. NeuralNetLN does not have this parameter.

iter : integer : number of iterations for training.default 3200 for BP and 200 for RP and 600 for LN

of : float : output weights range will be -0.5 to 0.5 times this value. Default 1.0. NeuralNetLN does not have this parameter.

hf : float : input to hidden layer weights range will be -0.5 to 0.5 times this value. Default 0.0005

activate : string : name of activation sigmoid "logistic", "alogistic", "tanh", "atanh", "gauss", "elliot" and "evero". Default "logistic". NeuralNetLN does not have this parameter.

bestof: integer : trains with this number of different starting weights and selects weights that produced least error sum. Training time increases proportionally.1 to 10 Default 1

wset : integer : skips to this starting weight set without training on them. 1 to 10. Default 1

thresh :integer: %of image value above which all values are considered as object for output. default 60. NeuralNetLN does not have this parameter.

test : boolean : true for test run. False for regular process. Default false NeuralNetLN does not have this parameter.

minstep : float: For BP only. This specifies stepping for quickest descent algorithm. Too small takes too many iterations to converge, too large may be erratic or not converge at all. Default 0.00005

cwt: integer : for BP only:1 to 25. If recognition of all (in this case edges) is more important even if a few non edges are included than missing some edges, then this factor may have higher value say 5. Default 1

plot : string: for RP only: name of diagnostic parameter to be plotted at regular iteration intervals. (output partial derivative)"odedwt", hidden layer partial error derivative "hdedwt", change in output node weights "dwo", change in hidden layer weights "dwh", output weight "owt" , hiddenlayer weights "hwt" and squared error sum "esum". Default "odedwt"

nplot : integer : for RP only: diagnostic plot for node number.Default 0.

Usage examples:-
LoadPlugin("............\NeuralNet.dll")
LoadPlugin("............\Colorit.dll")# this plugin has an edge marker function.Any such can be used
avisource("....")
g=grescale()
t=emarker(g,......)
# test mode with threshold display
NeuralNetBP(g,t,sf=0,ef=99,tlx=0,trx=639,tty=300,tby=312,xpts=3,ypts=3,iter=9000,test=true,wh=0.0005,wo=1.0, minstep=0.00005,cwt=5,bias=1,err=false,nth=300, bestof=1,hnodes=12)
processing mode
NeuralNetBP(g,t,tlx=0,trx=639,tty=300,tby=312,thresh=67,xpts=3,ypts=3,iter=9000,wh=0.0005,wo=1.0, minstep=0.00005,cwt=5,bias=1,hnodes=12)
NeuralNetRP(g,s,tlx=350,trx=550,tty=300,tby=306,thresh=80,xpts=3,ypts=3,iter=150,test=true,plot="esum",nplot =0,wh=0.01,wo=1.0, wset=2,bestof=1, hnodes=3, activate="gauss")
NeuralNetLN(f,i,0,1,tlx=250,trx=450,tty=0,tby=400,xpts=9,ypts=1,iter=600,wh=0.01, wset=1,bestof=1,line=true)

Below is an example of NeuralNetLN. On left is NeuralNetLN output, center is input, on right is the VFAN filtered frame from which a line is used for training.The script used is

f=imagereader("D:\TransPlugins\images\msintrf0.jpg",0,1,25,false).converttoyuy2()#image with noise
i=imagereader("D:\TransPlugins\images\fanmsint0.jpeg",0,1,25,false).converttoyuy2()#fan filtered image
ln=NeuralNetLN(f,i,tlx=250,trx=450,tby=400,xpts=9,ypts=1,iter=200,wh=0.01, wset=1,bestof=1,line=true)
stackhorizontal(ln,f,i)
reduceby2()

To my index page

down load plugin

To Avisynth