Thursday, February 26, 2009

Dynamic Textures can go Martin my distance!

Thanks to Antoni Chan, I've finally figured out the last piece of the puzzle to compute the Martin distance between two dynamic texture models. Antoni was kind enough to send me a short paper which described how to calculate Oa'*Ob, where Oa and Ob are the infinite observability matrices of the two models (each has a finite number of columns and an infinite number of rows, so the product Oa'*Ob is a small, finite matrix). He computes Oa'*Ob by solving a discrete-time Lyapunov (Sylvester) equation. I'll describe the whole process in greater detail in my final report.
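
Concretely, if each model is a pair (A, C) with observability matrix O = [C; C*A; C*A^2; ...], then X = Oa'*Ob satisfies the Sylvester equation Aa'*X*Ab - X + Ca'*Cb = 0, which Matlab's dlyap(A,B,C) (Control System Toolbox) solves directly. Here's a minimal sketch of the whole distance computation as I currently understand it; the function and variable names are mine:

    % Minimal sketch: Martin distance between dynamic texture models
    % (Aa,Ca) and (Ab,Cb), both with n states and stable A matrices.
    function d = martin_distance(Aa, Ca, Ab, Cb)
        Oab = dlyap(Aa', Ab, Ca'*Cb);     % Oa' * Ob
        Oaa = dlyap(Aa', Aa, Ca'*Ca);     % Oa' * Oa
        Obb = dlyap(Ab', Ab, Cb'*Cb);     % Ob' * Ob
        % cosines of the subspace angles between the two observability
        % subspaces, via a generalized eigenvalue problem on the Grams
        n = size(Aa, 1);
        M = [zeros(n) Oab; Oab' zeros(n)];
        G = blkdiag(Oaa, Obb);
        ev = sort(real(eig(M, G)), 'descend');
        cosines = ev(1:n);                % cos(theta_1) ... cos(theta_n)
        d = sqrt(-log(prod(cosines.^2))); % Martin distance
    end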

Suffice it to say, I finished creating a function to compare two dynamic texture video sequences. The function learns a dynamic texture model from each video, then passes the two models to the Martin distance function, which in turn computes the distance between them based on the subspace angles between the models.
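
At the top level the comparison is just the following (learn_dynamic_texture is a stand-in name for my model-fitting function, n is the state dimension):

    % Hypothetical top-level usage, sketched with my own names
    [Aa, Ca] = learn_dynamic_texture(video_a, n);
    [Ab, Cb] = learn_dynamic_texture(video_b, n);
    d = martin_distance(Aa, Ca, Ab, Cb);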

Next I created a confusion matrix (really a pairwise distance matrix) for the 8 segmented fire videos and 8 segmented non-fire videos. Each entry is the Martin distance times its complex conjugate, i.e., the squared magnitude of the distance. On the x axis, the movies run fire videos 1-8 and then non-fire videos 1-8; the y axis has the same labeling.


Here's that confusion matrix where the diagonal has been set to infinity:


The following are examples of segmented fire texture videos and segmented non-fire videos.




And here are the picture previews of the movies used to create the confusion matrix.


The following tables show the closest non-self match for each video. Remember, videos 1-8 are fire videos and 9-16 are non-fire videos. As you can see, the only misclassification is for video 8. Other than that, every fire video gets matched with another fire video and every non-fire video gets matched with a non-fire video.
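
Extracting the closest non-self match is a two-liner on the 16x16 distance matrix D:

    % Closest non-self match for each video: mask the diagonal, then
    % take the row-wise minimum
    D(logical(eye(size(D)))) = Inf;    % ignore self matches
    [~, nearest] = min(D, [], 2);      % nearest(i) = best match for video i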

Monday, February 23, 2009

More Dynamic Texture Stuff

So I came up with a solution to the virtual inability of Matlab pre v7.7 to work with video data. I ended up creating a program to spit out the frames of a video as .jpg files, then feeding those images to a Matlab .m file to create a Matlab movie, and finally downloading the latest Indeo video codec so that Matlab could resave the movie as an .avi file that it could read in the future. A time consuming and ass-backwards way of dealing with the problem, but it saved me from having to buy the latest version of Matlab :-P
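
The Matlab half of that round trip looks roughly like this (the paths and exact codec name are placeholders from my setup):

    % Rebuild a Matlab movie from the dumped .jpg frames, then resave
    % it as an .avi with the Indeo codec so future reads work
    files = dir('frames/*.jpg');
    for k = 1:numel(files)
        mov(k) = im2frame(imread(fullfile('frames', files(k).name)));
    end
    movie2avi(mov, 'fire.avi', 'compression', 'Indeo5');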

Next I read a bunch more on dynamic texture analysis. According to the work of Doretto, Soatto, and others, dynamic texture recognition can be viewed roughly as a three-stage process:
  1. learning the texture models
  2. calculating the distances between the models
  3. classifying candidate dynamic textures using nearest-neighbor or some other approach
I've currently got a Matlab implementation which performs step 1, and step 3 should be really easy. However, I'm still trying to understand how to compute the Martin distance defined in R. J. Martin, "A metric for ARMA processes," IEEE Transactions on Signal Processing, 48(4):1164–70, April 2000.
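
For reference, step 1 is the suboptimal SVD-based procedure from the Dynamic Textures paper. Schematically, with Y the (pixels x frames) data matrix and n the state dimension:

    % Suboptimal dynamic texture learning (after Doretto et al.)
    [U, S, V] = svd(Y, 0);                    % economy-size SVD
    C = U(:, 1:n);                            % observation matrix
    X = S(1:n, 1:n) * V(:, 1:n)';             % estimated state sequence
    A = X(:, 2:end) * pinv(X(:, 1:end-1));    % state transition matrix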

In the meantime I've also started to create a small database of positive and negative dynamic fire-texture example videos.

Wednesday, February 18, 2009

Delving into Dynamic Textures

This past week I've shifted gears toward understanding and applying dynamic texture recognition to fire detection. I started by reading Dynamic Texture Recognition by Saisan, P., Doretto, G., Wu, Y. N., and Soatto, S. which was a very confusing experience since it was missing all of the critical details. I then read the precursor paper Dynamic Textures by Doretto, G., Chiuso, A., Wu, Y. N., and Soatto, S. which helped fill in most of the missing parts for me.

The next thing I did was to try to reproduce the algorithm laid out in Dynamic Textures, but I was stymied by Matlab's inability to read in my training videos using aviread(). After poking around online it sounds like I might have better luck with Matlab's newer video reader function called mmreader(). Unfortunately mmreader() is part of Matlab 7.7 and I only have Matlab 7.4. I tried downloading someone else's homemade video reader but it seemed to infinitely loop or just take forever on a 5MB video.
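
For the record, the difference should just be this (I'm going off mmreader()'s documentation since I can't run it in 7.4):

    % Matlab 7.4 (fails on my training videos):
    mov = aviread('fire.avi');
    % Matlab 7.7+ (supposedly):
    obj = mmreader('fire.avi');
    frame = read(obj, 1);    % grab the first frame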

Currently I'm out looking for small animals to sacrifice to MathWorks™....

Monday, February 9, 2009

Motion + Color

So I decided to hack around for a bit and combine motion and color detection to get a slow, moving fire-colored object detection algorithm. I'm sure there's an abundance of need for just this type of algorithm. Expect to see this in the top 100 best inventions of 2009.

Anyway, the algorithm simply uses frame differencing to detect motion in the video. It then tracks each pixel that has moved within the last k frames and runs color detection on those pixels. Fire-colored moving pixels are then colored red for the pesky human operator to enjoy.
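
In Matlab-ish form the per-frame work is roughly the following (my actual implementation is OpenCV-based, and the names and threshold here are placeholders):

    % Frame differencing + k-frame motion memory, then color detection
    % on moving pixels only; history is an HxWxk logical array
    d = abs(double(curr) - double(prev));
    moved = max(d, [], 3) > thresh;              % motion mask, this frame
    history = cat(3, history(:,:,2:end), moved); % slide the k-frame window
    idx = find(any(history, 3));                 % moved within last k frames
    % ...run the perceptron fire-color classifier on pixels idx only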

You can see the results of a 1 frame motion memory in a fire video sequence in the following photo album
njtrue/fire motion 1

You can see the results of a 5 frame motion memory in a fire video sequence in the following photo album
njtrue/fire motion 5

The algorithm works pretty well at 'ignoring' stationary fire colored objects. However, some spurious 'moving' fire colored pixels are detected. They could be removed using an eroding operation or some simple culling operation based on affinity values.
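
The erosion, at least, would be a one-liner with the Image Processing Toolbox (the structuring element size is a guess):

    % Erode the detection mask to kill isolated spurious pixels
    clean_mask = imerode(fire_mask, strel('disk', 1));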

The most significant problem, other than moving fire-colored objects which are not fire, is that the color detector perceptron is still very slow for such a common operation. Each frame may get 10k-40k pixels that need to be run through the perceptron and that's expensive.

While I could create a specialized, 'fast' version of the perceptron classifier just for this problem, I don't think the speedup will be big enough to eliminate the problem. The solution has to be to reduce the number of pixels that are fed to the perceptron in the first place. At the very least I could thread different parts of the whole classification algorithm and only run the perceptron thread every m frames so that the actual detection is in real time.

Monday, February 2, 2009

Frame Differencing

Status report
  • Time spent on frame differencing? Lots.
  • Feelings about OpenCV? Hatred (even more than usual).
  • Actual progress? Far less than hoped for.
Anyway, I tried to get OpenCV to work with my fire video data. No luck. Nothing. Nada. Then I stumbled upon this little optical flow tutorial/code base by David Stavens at Stanford. His code for reading video data happened to be very similar to mine, but he also supplied some sample video data, and when I ran that data through my code everything worked. So the problem was my videos, not my code: I scoured the net for a new video converter to replace the STOIK video converter, which apparently pumped out videos that were incompatible with OpenCV, and ended up using the FormatFactory video converter to reconvert all of my fire videos from wmv to avi.

Then I mucked around with OpenCV to get it to do frame differencing on the fire videos. The battle was long and fierce, and for a while things were looking grim. However, I managed to pacify OpenCV and get a simple frame differencing program working. The following are the results.

Tuesday, January 27, 2009

I'm fired up!

Alrighty! Finally exterminated all of the bugs in the perceptron code. The following are the characteristics of the multilayer perceptron that I use for fire color detection:
  • 2 layer
  • 1 hidden node
  • 3 inputs, where each input is a color channel (e.g. R,G,B or H,S,V etc.)
  • trained for 100 epochs and got a mean squared error of 0.0033
Using more than 1 hidden node doesn't seem to help much and only ends up slowing down the classifier. Currently the classifier is trained on RGB color data.
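
For concreteness, the forward pass of a network this small is tiny (the weight names and the sigmoid activation are my notation; the weights come out of training):

    % Forward pass of the 2-layer, 1-hidden-node perceptron:
    % x is a 3x1 color vector, w1 is 1x3, b1/w2/b2 are scalars
    sigmoid = @(z) 1 ./ (1 + exp(-z));
    h = sigmoid(w1 * x + b1);    % the single hidden node
    y = sigmoid(w2 * h + b2);    % fire-color score in [0,1]
    is_fire = y > 0.5;           % threshold to get a hard label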

The following is an example of an image with fire in it and how the perceptron-based fire color classifier would label the image pixels:

At first glance you might say that fire color classification fails miserably because it tends to label lots of non-fire pixels as fire. However, the goal of this first classifier was to get a very high true positive rate (equivalently, a low false negative rate), regardless of the false positive rate. This is because this classifier will be used in concert with other feature detectors such as motion and maybe texture, and those additional classification stages will help to reduce the false positive rate.

Lastly, you can see the new color palettes of the positive (fire) data and negative (non-fire) data:

Saturday, January 24, 2009

No luck with Perceptrons...yet

Well it turns out there were a few bugs/problems with my perceptron code. First off, there was a problem reading in the input data which threw off the z-scaling. Then the z-scaling had some overflow problems, which I fixed. Lastly, the initial weights were being set badly, causing the perceptron to suck. I finally got it to work correctly on XOR.
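
By z-scaling I just mean standardizing each input channel to zero mean and unit variance; in Matlab terms:

    % z-scale each column (color channel) of the (num_samples x 3)
    % training matrix X
    mu = mean(X, 1);
    sigma = std(X, 0, 1);
    Xz = (X - repmat(mu, size(X,1), 1)) ./ repmat(sigma, size(X,1), 1);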

Unfortunately, when I tried to get it to learn the difference between fire and non-fire pixels it could only reach an MSE of 0.18, which is a big failure. I suspect there's far too much similarity and overlap between fire pixels and non-fire pixels, especially in the near-white color range. I'm going to try manipulating the training data to reduce this overlap to see if that helps.

If that doesn't work then I'm going to try seeing whether learning on HSV channels or L*a*b* channels works better than the RGB channels that I've tried so far.
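
Conveniently the conversions already exist: rgb2hsv ships with Matlab, and the L*a*b* conversion is in the Image Processing Toolbox:

    % Candidate color spaces for the training pixels
    hsv_img = rgb2hsv(rgb_img);                            % HSV
    lab_img = applycform(rgb_img, makecform('srgb2lab'));  % L*a*b*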