A Survey on Visual Object Tracking Based on a Biologically Inspired Tracker

Abstract
Visual tracking is the process of locating, recognizing, and determining the dynamic configuration of one or many moving (possibly deformable) objects (or parts of objects) in each frame of one or several cameras. To build a general tracking system, recent progress in image representation, appearance models, and motion models is briefly reviewed in this paper. The models reviewed here are basic enough to be applied to tracking either a single target or multiple targets. Special attention is given to the appearance models that are currently trending. The key techniques are discussed, along with the factors that make it hard for a tracker to follow changes in object appearance: camera motion, illumination variation, shape deformation, and partial occlusion. For tracking-by-detection and online boosting methods, the state of the art performs well (e.g. TLD, OnlineBoost, MILTrack). Hence, based on this, we examine them together for single-person tracking.

Keywords: tracking, target, models, appearance

Introduction
Object tracking plays a vital role in the field of computer vision. In a video there are different frames in which the object moves around the scene.
The tracker finds the part of a frame containing an object that is similar to the original; this is known as object tracking. Object tracking plays a role in the following tasks:
· Vehicle navigation – obstacle avoidance and video-based route planning.
· Video indexing – the retrieval of videos in a database.
· Surveillance – monitoring suspicious activities.
· Traffic monitoring – gathering information on traffic status.
It has a wide assortment of uses, including motion analysis, video surveillance, human-computer interaction, and robot perception. Object tracking can be difficult due to real-time processing requirements, disturbances (noise) in images, and information loss. It has been intensively explored in the previous decade.
To improve visual object tracking, one may need to address these difficulties by developing better feature representations of visual targets and more effective tracking models. In the following, image representations and appearance models are discussed briefly.

1. Image representation
In image representation, features can be represented by texture, points, contours, and shape. Object detection for tracking can be adapted from any representation, for example ships at sea, cars on a road, or fish in a tank. In this section, the typical image features and object shape representations commonly employed for tracking are described first, and then joint shape representations are addressed [1].

1.1 Typical Image Features
In [3], Xiang describes how color features (e.g. the color histogram) have low computational cost and are invariant to many geometric transformations.
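As a concrete illustration, a color-histogram feature of this kind, together with the Bhattacharyya coefficient commonly used to compare such histograms, can be sketched as follows (a minimal pure-Python sketch; all function names are illustrative):

```python
from collections import Counter

def color_histogram(pixels, bins_per_channel=8):
    """Quantize RGB pixels into a normalized color histogram.

    Spatial layout is discarded, which is why such features are cheap
    and largely pose-invariant but not very discriminative.
    """
    step = 256 // bins_per_channel
    counts = Counter(
        (r // step, g // step, b // step) for (r, g, b) in pixels
    )
    total = sum(counts.values())
    return {bin_id: n / total for bin_id, n in counts.items()}

def histogram_similarity(h1, h2):
    """Bhattacharyya coefficient between two normalized histograms."""
    keys = set(h1) | set(h2)
    return sum((h1.get(k, 0.0) * h2.get(k, 0.0)) ** 0.5 for k in keys)
```

Because all spatial information is dropped, two patches of the same colors in different layouts score as identical, which is exactly the weakness noted below.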
In [5], Efros selects the best color features from five color spaces to model skin color for face tracking. However, color features are not robust against illumination changes, and they are not discriminative enough due to their lack of spatial information.
–Texture features. Texture features (e.g. LBP) have high discriminative ability, though they are computationally expensive. Nguyen and Smeulders classify texture features using LDA [6].

1.2 Shape representation
–Points. In general, the point representation is suitable for tracking objects that occupy small regions in an image [1]. The object is represented by a point, that is, the centroid (Figure 1(a)) [7], or by a set of points (Figure 1(b)) [8].
–Primitive geometric shapes. Object motion for such representations is usually modeled by a translation, affine, or projective (homography) transformation [1]. Object shape is represented by a rectangle, an ellipse (Figure 1(c), (d)) [9], etc. Though primitive geometric shapes are more suitable for representing simple rigid objects, they are also used for tracking nonrigid objects [1].

Fig. 1.
Object representations: (a) centroid, (b) multiple points, (c) rectangular patch, (d) elliptical patch, (e) part-based multiple patches, (f) object skeleton, (g) complete object contour, (h) control points on the object contour, (i) object silhouette.
–Object silhouette and contour. A contour representation defines the boundary of an object (Figure 1(g), (h)). The region inside the contour is called the silhouette of the object [1].
–Articulated shape models. Articulated objects are composed of body parts that are held together by joints. For example, the human body is an articulated object with torso, legs, hands, head, and feet connected by joints. The relationships between the parts are governed by kinematic motion models, for example joint angles. To represent an articulated object, one can model the constituent parts using cylinders or ellipses, as shown in Figure 1(e) [1].

2. Appearance model
In real-world surveillance scenes, target appearance tends to change during tracking (i.e. variation in target appearance) and the background may include moving objects (i.e. variation in the scene). The less associated the target's appearance model is with those variations, the more specific it is in representing that particular object, and the less likely the tracker is to get confused by other objects or background clutter [3].
2.1 Kernel-based generative appearance models (KGAMs)
Kernel-based generative appearance models (KGAMs) utilize kernel density estimation to construct kernel-based visual representations, and then carry out the mean shift for object localization. They are divided into three branches: color-driven KGAMs, shape-integration KGAMs, and non-symmetric KGAMs [2].
Color-driven KGAMs. The color-driven KGAM [9] builds a color-histogram-based visual representation regularized by a spatially smooth isotropic kernel.
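The kernel-weighted localization step at the heart of such trackers can be sketched as a simplified 2-D mean-shift update. Here the per-pixel weights stand in for histogram back-projection scores, a truncated (Epanechnikov-profile) kernel reduces the update to a weighted average, and all names are illustrative:

```python
def mean_shift_step(pixels, center, bandwidth):
    """One mean-shift iteration with an Epanechnikov profile.

    pixels: list of ((x, y), weight), where weight is a back-projection
    score saying how target-like the pixel's color is. The Epanechnikov
    kernel has a constant derivative, so the shift reduces to a weighted
    mean over pixels inside the bandwidth.
    """
    cx, cy = center
    sx = sy = sw = 0.0
    for (x, y), w in pixels:
        if (x - cx) ** 2 + (y - cy) ** 2 <= bandwidth ** 2:
            sx += w * x
            sy += w * y
            sw += w
    if sw == 0.0:
        return center  # no support inside the window: stay put
    return (sx / sw, sy / sw)

def track(pixels, center, bandwidth, max_iter=20, eps=1e-3):
    """Iterate mean-shift steps until the window stops moving."""
    for _ in range(max_iter):
        new_center = mean_shift_step(pixels, center, bandwidth)
        shift_sq = (new_center[0] - center[0]) ** 2 + (new_center[1] - center[1]) ** 2
        if shift_sq < eps ** 2:
            return new_center
        center = new_center
    return center
```

Starting from a rough guess, the window climbs toward the weighted centroid of target-colored pixels in a few iterations.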
However, the tracker of Comaniciu et al. [9] only considers color information and therefore ignores other useful cues such as edge and shape, resulting in sensitivity to background clutter and occlusion.
Shape-integration KGAMs. The main aim of shape integration is to build a kernel density function in the joint color-shape space. It is based on two spatially normalized and rotationally symmetric kernels describing the color and object-boundary information [2].
Non-symmetric KGAMs. The conventional KGAMs use a symmetric kernel (e.g. a circle or an ellipse), leading to a large estimation bias when estimating a complicated underlying density function. The non-symmetric KGAM instead simultaneously estimates the image coordinates, the scale, and the orientation in a small number of mean-shift iterations [2].
2.2 Boosting-based discriminative appearance models (BDAMs)
Boosting-based discriminative appearance models (BDAMs) are widely used in visual object tracking because of their powerful discriminative learning capabilities. They are classified into self-learning and co-learning BDAMs: self-learning BDAMs drive the object/non-object classification task from a single source, while co-learning BDAMs exploit multi-source discriminative information for object detection [2]. BDAMs also take different strategies for visual representation, i.e. single-instance and multi-instance ones. Single-instance BDAMs require precise object localization; if a precise localization is not available, these tracking algorithms may use sub-optimal positive samples to update their object/non-object discriminative classifiers, which can lead to model drift. Moreover, object detection and tracking have an inherent ambiguity: precise object locations may be unknown even to human labelers. To deal with this ambiguity, multi-instance BDAMs represent the object by a set of image patches around the tracker location. Thus, BDAMs can be further classified into single-instance and multi-instance BDAMs.
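The multi-instance idea can be illustrated with the noisy-OR bag model commonly used in MIL-style formulations: a bag of patches is labeled positive if at least one patch contains the object, which sidesteps the need for a single precisely localized positive sample. A minimal sketch (the linear instance model and all names are illustrative):

```python
import math

def instance_prob(w, b, x):
    """Sigmoid score of a single image patch under a linear model."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-score))

def bag_prob(w, b, bag):
    """Noisy-OR bag probability.

    A bag of patches sampled around the tracker location is positive
    if at least one of its instances is the object, so the bag is
    negative only when every instance is negative.
    """
    p_all_negative = 1.0
    for x in bag:
        p_all_negative *= 1.0 - instance_prob(w, b, x)
    return 1.0 - p_all_negative
```

A bag containing one strongly positive patch scores near 1 even if the other patches are poor, which is why imprecise localization hurts multi-instance models less than single-instance ones.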
2.3 Randomized learning-based discriminative appearance models (RLDAMs)
In principle, randomized learning techniques can build a diverse classifier ensemble by performing random input selection and random feature selection. In contrast to boosting and SVMs, they are more computationally efficient and easier to extend to multi-class learning problems. However, their tracking performance is unstable across different scenes because of the random feature selection [2].
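As an illustration of random feature and threshold selection, a toy ensemble of randomized decision stumps with majority voting might look as follows (a simplified sketch, not any specific published tracker; all names are illustrative):

```python
import random

def train_random_ensemble(samples, labels, n_members=25, seed=0):
    """Ensemble of decision stumps, each built on a randomly chosen
    feature with a randomly chosen threshold (random input selection
    plus random feature selection)."""
    rng = random.Random(seed)
    n_features = len(samples[0])
    members = []
    for _ in range(n_members):
        f = rng.randrange(n_features)
        lo = min(s[f] for s in samples)
        hi = max(s[f] for s in samples)
        t = rng.uniform(lo, hi)
        # Orient the stump so it agrees with the training labels.
        matches = sum(1 for s, y in zip(samples, labels)
                      if (s[f] > t) == (y == 1))
        polarity = 1 if matches * 2 >= len(samples) else -1
        members.append((f, t, polarity))
    return members

def predict(members, x):
    """Majority vote over the randomized stumps."""
    score = sum(p if x[f] > t else -p for f, t, p in members)
    return 1 if score >= 0 else 0
```

Each stump is weak on its own; the diversity of the random feature/threshold choices is what makes the vote useful, and, as noted above, it is also what makes performance vary from scene to scene.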
2.4 Discriminant analysis-based discriminative appearance models (DADAMs)
Discriminant analysis is a powerful tool for supervised subspace learning. In principle, its goal is to find a low-dimensional subspace with high inter-class separability.
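For the two-class case, the classical Fisher criterion gives the projection direction w ∝ Sw⁻¹(m1 − m0), which maximizes between-class scatter relative to within-class scatter. A minimal 2-D sketch (illustrative only):

```python
def fisher_direction(class0, class1):
    """Two-class Fisher discriminant direction for 2-D features.

    Returns w proportional to Sw^{-1} (m1 - m0), where Sw is the
    pooled within-class scatter matrix and m0, m1 the class means.
    """
    def mean(xs):
        n = len(xs)
        return [sum(x[i] for x in xs) / n for i in range(len(xs[0]))]

    m0, m1 = mean(class0), mean(class1)
    # Pooled within-class scatter matrix (2x2).
    s = [[0.0, 0.0], [0.0, 0.0]]
    for cls, m in ((class0, m0), (class1, m1)):
        for x in cls:
            d = [x[0] - m[0], x[1] - m[1]]
            s[0][0] += d[0] * d[0]
            s[0][1] += d[0] * d[1]
            s[1][0] += d[1] * d[0]
            s[1][1] += d[1] * d[1]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    dm = [m1[0] - m0[0], m1[1] - m0[1]]
    # w = Sw^{-1} (m1 - m0), via the closed-form 2x2 inverse.
    return [( s[1][1] * dm[0] - s[0][1] * dm[1]) / det,
            (-s[1][0] * dm[0] + s[0][0] * dm[1]) / det]
```

Projecting samples onto w gives the one-dimensional axis along which the two classes (e.g. object vs. background patches) separate best.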
According to the learning scheme used, it can be split into two branches: conventional discriminant analysis and graph-driven discriminant analysis. In general, conventional DADAMs are formulated in a vector space, while graph-driven DADAMs utilize graphs for supervised subspace learning [2].

3. Motion model
The motion model is essentially a problem of feature matching, which is discussed briefly here. The main methods are the optical flow model and Bayesian filtering, the latter being more widely used.
3.1 Optical Flow
The optical flow method is based on the assumption of constant brightness across frames. This holds if the illumination does not change drastically or the frame rate is high.

3.2 Bayesian Filtering Framework
In the Bayesian filtering framework [10] (e.g. the Kalman filter, the particle filter), we want to recursively estimate the current target state vector each time a new observation is received. We use z_t and x_t to respectively represent the target's motion state and appearance (e.g. positive/negative sample) at time t. In [3], several methods are validated on a challenging sequence.
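The recursive estimation described above can be illustrated with a bootstrap particle filter over a one-dimensional state (a simplified sketch under assumed Gaussian motion noise; all names are illustrative):

```python
import random

def particle_filter_step(particles, weights, motion_std, likelihood, rng):
    """One resample-predict-update cycle of a bootstrap particle filter
    for a 1-D target state (e.g. horizontal position).

    likelihood(x) scores how well state x explains the new observation.
    """
    # Resample in proportion to the current weights.
    particles = rng.choices(particles, weights=weights, k=len(particles))
    # Predict: diffuse each particle with the motion model.
    particles = [x + rng.gauss(0.0, motion_std) for x in particles]
    # Update: reweight by the observation likelihood and normalize.
    weights = [likelihood(x) for x in particles]
    total = sum(weights)
    weights = [w / total for w in weights]
    return particles, weights

def estimate(particles, weights):
    """Posterior mean state."""
    return sum(x * w for x, w in zip(particles, weights))
```

With a likelihood peaked at the observed target position, the particle cloud contracts around the true state over a few steps; a Kalman filter would replace the sampled cloud with a single Gaussian.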
The prototype is a vision-guided mobile robot following a person. The primary challenges come from:
1) Back tracking: since the target person usually has their back to the camera, face or frontal information is not available.
2) Background distraction: the target person walks down a corridor wearing a white uniform similar to the white wall.
3) Scale variation: the scale changes between 120% and 50% of the initial size.
4) Pose and shape variation: when the person turns around at a corner, pose variation occurs.
5) Lighting variation: the indoor scene is generally not very bright, but there are flashing lights and strong sunlight through the window.
6) Occlusion by objects with similar appearance.

Conclusion
In this work, a survey of appearance models, image representations, and motion models is presented. In [2], the visual representations focus more on how to robustly describe the spatio-temporal characteristics of object appearance, while the statistical modeling schemes for tracking-by-detection put more emphasis on how to capture the generative appearance information of the object regions.
These modules are closely related and interleaved with each other. In practice, powerful appearance models depend not only on effective visual representations but also on robust statistical models. Optical-flow-based motion estimation suffers from mismatches of feature points.
In scenes with drastic illumination change, the hypothesis of constant brightness may not hold.

References
[1] Yilmaz, A., Javed, O., and Shah, M. 2006. Object tracking: A survey. ACM Comput. Surv. 38, 4, Article 13 (Dec. 2006), 45 pp.
[2] Li, X., Hu, W., Shen, C., Zhang, Z., Dick, A., and van den Hengel, A. 2013. A survey of appearance models in visual object tracking. Vol. 4, Issue 4 (Sep. 2013), Article 58.
[3] Cannons, K. 2008. A review of visual tracking. Dept. Comput. Sci. Eng., York Univ., Toronto, ON, Canada, Tech. Rep. CSE-2008-07.
[4] Smeulders, A. W. M., Chu, D. M., Cucchiara, R., Calderara, S., Dehghan, A., and Shah, M. 2014. Visual tracking: An experimental survey. IEEE Trans. Pattern Anal. Mach. Intell. 36, 7 (Jul. 2014), 1442–1468.
[5] Efros, B. 2002. Adaptive color space switching for face tracking in multi-colored lighting environments. In FG 2002.
[6] Nguyen, H., and Smeulders, A. 2006. Robust tracking using foreground-background texture discrimination. IJCV 69, 3, 277–293.
[7] Veenman, C., Reinders, M., and Backer, E. 2001. Resolving motion correspondence for densely moving points. IEEE Trans. Patt. Analy. Mach. Intell. 23, 1, 54–72.
[8] Serby, D., Koller-Meier, S., and Gool, L. V. 2004. Probabilistic object tracking using multiple features. In IEEE International Conference on Pattern Recognition (ICPR), 184–187.
[9] Comaniciu, D., Ramesh, V., and Meer, P. 2003. Kernel-based object tracking. IEEE Trans. Patt. 25, 564–575.
[10] Doucet, A., Godsill, S., and Andrieu, C. 2000. On sequential Monte Carlo sampling methods for Bayesian filtering. Statistics and Computing 10, 3, 197–208.