Let's talk about video tracking

A Few Principles of Video Tracking

The idea of tracking motion on a computer using a video camera has been around a couple of decades, and still is not fully perfect, because the construction of vision is a complex subject. We don't just "see"; we construct colors, edges, objects, depth, and other aspects of vision from the light that reaches our retinas. If you want to program a computer to see in the same way, it has to have subroutines that define the characteristics of vision and allow it to distinguish those characteristics in the array of pixels that comes from a camera. For more on that, see Visual Intelligence: How We Create What We See by Donald Hoffman. There are many other texts on the subject, but his is a nice popular introduction. What follows is a very brief introduction to some of the basic concepts behind computer vision and video manipulation.

There are a number of toolkits available for getting data from a camera and manipulating it. They vary from very high-level simple graphical tools to low-level tools that allow you to manipulate the pixels directly. Which one you need depends on what you want to do. Regardless of your application, the first step is always the same: you get the pixels from the camera in an array of numbers, one frame at a time, and do things with the array. Typically, your array is a list of numbers, including the location, and the relative levels or red, green, and blue light at that location.

There are a few popular applications that people tend to develop when they attach a camera to a computer:

Video manipulation takes the image from the camera, changes it somehow, and re-presents it to the viewer in changed form. In this case, the computer doesn't need to be able to interpret objects in the image, because you're basically just applying filters, not unlike Photoshop filters.

Tracking looks for a blob of pixels that's unique, perhaps the brightest blob, or the reddest blob, and tracks its location over a series of frames. Tracking can be complicated, because the brightest blob from one frame to another might not be produced by the same object.

Object recognition looks for a blob that matches a particular pattern, like a face, identifies that blob as an object, and keeps track of its location over time. Object recognition is the hardest of all three applications, because it involves both tracking and pattern recognition. If the object rotates, or if its colors shift because of a lighting change, or it gets smaller as it moves away from the camera, the computer has to be programmed to compensate. If it's not, it may fail to "see" the object, even though it's still there.

There are a number of programs available for video manipulation. Jitter, a plugin for Max/MSP, is a popular one. David Rokeby's softVNS is another plugin for Max. Mark Coniglio's Isadora is a visual programming environment like Max/MSP that's dedicated to video control, optimized for live events like dance and theatre. Image/ine is similar to Isadora, though aging, as it hasn't been updated in a couple of years. There also countless VJ packages that will let you manipulate live video. In addition, most text-based programming languages have toolkits too. Danny Rozin's TrackThemColors Pro does the job for Macromedia Director MX, as does Josh Nimoy's Myron. Myron also works for Processing. Dan O'Sullivan's vbp does the job for Java. Dan has an excellent site on the subject as well, with many more links. He's also got a simple example for Processing on his site. Almost all of these toolkits can handle video tracking as well.

There are two methods you'll comm,only find in video tracking software: the zone approach and the blob approach. Software such as softVNS or Eric Singer's Cyclops or cv.jit (a plugin for jitter that affords video tracking) take the zone approach. They map the video image into zones, and give you information about the amount of change in each zone from frame to frame. This is useful if your camera is in a fixed location, and you want fixed zones of that trigger activity. Eric has a good example on his site in which he uses Cyclops to play virtual drums. The zone approach makes it difficult to track objects across an image, however. TrackThemColors and Myron are examples of the blob approach, in that they return information about unique blobs within the image, making it easier to track an object moving across an image.

Tom Igoe from tigoe.com

No Response to "Let's talk about video tracking"

Post a Comment