VISAPP 2011 Abstracts


Area 1 - Image Analysis

Full Papers
Paper Nr: 8
Title:

PALM SHAPE COMPARISON FOR PERSON RECOGNITION

Authors:

Irina Bakina

Abstract: The article presents a new method for palm comparison based on the alignment of palm shapes. The proposed approach allows comparison and recognition of palms whose fingers are stuck together. Existing methods do not work correctly in this case, although it occurs frequently in real environments (mostly among elderly people). The idea of the proposed method is to model the "posture" of the test palm on the reference palm. A flexible-object representation is used for palm modeling. This representation provides a convenient tool for applying palm transformations and allows them to be performed in real time. Low-resolution webcams can be used for palm image acquisition. The article also introduces a person recognition application based on the proposed comparison. Finally, the article addresses the problem of improving the recognition characteristics of the palm; in particular, it describes a bimodal approach that employs palm and voice features.

Paper Nr: 67
Title:

ELASTIC REGISTRATION OF EDGE SETS BY MEANS OF DIFFUSE SURFACES - With an Application to Embedding Purkinje Fiber Networks

Authors:

Stefan Fürtinger, Stephen Keeling, Gernot Plank and Anton J. Prassl

Abstract: In this work, edge sets are mapped one to the other by representing these zero area sets as diffuse images which have positive measure supports that can be registered elastically. The driving application for this work is to map a Purkinje fiber network in the epicardium of one heart to the epicardium of another heart. The approach is to register sufficiently accurate diffuse surface representations of two epicardia and then to apply the resulting transformation to the points of the Purkinje fiber network. To create a diffuse image from a given edge set, a region growing method is used to approximate diffusion of brightness from an edge set to a given point. To be minimized is the sum of squared differences of the registered diffuse images along with a linear elastic penalty for the registration. A Newton iteration is employed to solve the optimality system, and the degree of diffusion is larger in initial iterations while smaller in later iterations so that a desired local minimum is selected by means of vanishing diffusion. Favorable results are shown for registering highly detailed rabbit heart models.

Paper Nr: 91
Title:

GROUPWISE NON-RIGID IMAGE ALIGNMENT - Results of a Multiscale Iterative Best Edge Algorithm

Authors:

Bernard Tiddeman and David Hunter

Abstract: In this paper we present an algorithm for groupwise image alignment using an iterative best edge point algorithm. Neighbouring image edge points are matched for similarity in a two-directional fashion. The matches found are used to drive a regularised warp of the images into alignment. The algorithm works from low to high resolution, with the matches calculated across the set first at low resolution and then at progressively finer scales. The regularisation decreases across iterations, and the search area remains constant, so it covers a larger effective area in the low-resolution images. We also extend the method to 3D surfaces by combining the 2D image search with a 3D ICP algorithm. The results show that this gives a very efficient algorithm that can align many different sets of 2D images and 3D surfaces.

Paper Nr: 116
Title:

THE SPIRAL FACETS - A Unified Framework for the Analysis and Description of 3D Facial Mesh Surfaces

Authors:

Naoufel Werghi, Harish Bhaskar, Youssef Meguebli and Haykel Boukadida

Abstract: In this paper, we describe a framework for encoding 3D facial triangular mesh surface. We derive shape information from the triangular mesh surface by exploiting specific arrangements of facets in the model. We describe the foundations of the framework and adapt the framework for several original applications including: face landmark detection, frontal face extraction, face orientation and facial surface representation. We validate the framework through experimentation with raw 3D face mesh surfaces and demonstrate that the model allows simpler implementation, more compact representation and encompasses rich shape information that can be usefully deployed both locally and globally across the face in comparison to other standard representations.

Paper Nr: 121
Title:

TEXTURE REMOVAL IN COLOR IMAGES BY ANISOTROPIC DIFFUSION

Authors:

Baptiste Magnier, Philippe Montesinos and Daniel Diep

Abstract: In this paper, we present a new method for removing texture in color images using a smoothing rotating filter. From this filter, a bank of smoothed images provides, for each channel image, pixel signals that make it possible to classify a pixel as belonging to a texture region. We apply this classification in each channel image in order to compute two directions for the anisotropic diffusion. Then, we introduce a new method for vector anisotropic diffusion which accurately controls the diffusion near edge and corner points and diffuses isotropically inside textured regions. Several results on real images and a comparison with vector anisotropic diffusion methods show that our model is able to remove the texture and control the diffusion.

Paper Nr: 122
Title:

ADAPTIVE BACKGROUND SUBTRACTION IN H.264/AVC BITSTREAMS BASED ON MACROBLOCK SIZES

Authors:

Antoine Vacavant, Lionel Robinault, Serge Miguet, Chris Poppe and Rik van de Walle

Abstract: In this article, we propose a novel approach to detect moving objects in H.264 compressed bitstreams. More precisely, we describe a multi-modal background subtraction technique that uses the size of macroblocks in order to label them as belonging to the background of the observed scene or not. Here, we integrate an adaptive Gaussian mixture-based scheme to model the background. We evaluate our contribution using the PETS video dataset and a realistic synthetic video sequence rendered by a 3-D urban environment simulator. We compare two different background models, and we show that the Gaussian mixture-based model performs best and outperforms other techniques that use macroblock sizes.

Paper Nr: 142
Title:

METHOD OF EXTRACTING INTEREST POINTS BASED ON MULTI-SCALE DETECTOR AND LOCAL E-HOG DESCRIPTOR

Authors:

Manuel Grand-brochier, Christophe Tilmant and Michel Dhome

Abstract: This article proposes an approach to the extraction (detection and description) of interest points based on Fast-Hessian and E-HOG. SIFT and SURF are the two most used methods for this problem, and studying them allows us to understand their construction and extract their various advantages (invariances, speeds, repeatability). Our goal is, firstly, to couple these advantages to create a new system (detector, descriptor and matching) and, secondly, to determine the characteristic points for different applications (image transformation, 3D reconstruction...). Our system must also be as invariant as possible to image transformations (rotations, scales, viewpoints for example). Finally, we have to find a compromise between a good matching rate and the number of points matched. All the detector and descriptor parameters (orientations, thresholds, analysis shape) are also detailed in this article.

Paper Nr: 147
Title:

AN EFFECTIVE METHOD FOR COUNTING PEOPLE IN VIDEO-SURVEILLANCE APPLICATIONS

Authors:

D. Conte, P. Foggia, G. Percannella, F. Tufano and M. Vento

Abstract: This paper presents a method to count people for video surveillance applications. The proposed method adopts the indirect approach, according to which the number of persons in the scene is inferred from the value of some easily detectable scene features. In particular, the proposed method first detects the SURF interest points associated with moving people, then determines the number of persons in the scene by a weighted sum of the SURF points. In order to take into account the fact that, due to perspective, the number of points per person tends to decrease the farther the person is from the camera, the weight attributed to each point depends on its coordinates in the image plane. In the design of the method, particular attention has been paid to obtaining a system that can be easily deployed and configured. In the experimental evaluation, the method has been extensively compared with the algorithms by Albiol et al. and by Conte et al., which both adopt a similar approach. The experiments have been carried out on the PETS 2009 dataset and the results show that the proposed method obtains high accuracy.
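The weighted-sum idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the weight function, so the linear dependence on the image row, the function names `point_weight` and `estimate_people_count`, and the numeric weights are all assumptions made only for illustration.

```python
def point_weight(y, image_height, near_weight=0.02, far_weight=0.10):
    """Hypothetical weight (persons per interest point) for a point at image row y.

    Assumption: row 0 is the top of the image (far from the camera) and the last
    row is the bottom (near the camera). Far people yield fewer points, so each
    far point counts for more. The weight is interpolated linearly between rows.
    """
    t = y / (image_height - 1)
    return far_weight + t * (near_weight - far_weight)


def estimate_people_count(points, image_height):
    """Weighted sum over the (x, y) positions of points detected on moving people."""
    return sum(point_weight(y, image_height) for _, y in points)


# Usage: 10 points on a far person and 50 points on a near person.
far_person = [(100 + i, 0) for i in range(10)]
near_person = [(200 + i, 479) for i in range(50)]
count = estimate_people_count(far_person + near_person, image_height=480)
```

In a deployed system the weights would presumably be calibrated from training frames with known counts rather than fixed by hand as here.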

Paper Nr: 165
Title:

INTERPOLATION BETWEEN IMAGES BY CONSTRAINED OPTIMAL TRANSPORT

Authors:

Said Kerrache and Yasushi Nakauchi

Abstract: In this paper, the recently proposed technique of constrained optimal transport is used to interpolate between images under specified constraints. The intensity values in both images are considered as mass distributions, and a flow of minimum kinetic energy is computed to transport the initial distribution to the final one, while satisfying specified constraints on the intermediate mass as well as the velocity or the momentum field. As an application, the proposed method is used for interpolating between images under a constraint on the volume expansion and contraction. This is achieved by imposing bounds on the divergence of the velocity field of the flow. This constraint is discretized and then integrated into the problem Lagrangian using the augmented Lagrangian method. A variation of the solution is also presented, where the constraint is decoupled into two constraints coordinated by an additional Lagrange multiplier. This allows a considerable speedup, though numerical robustness decreases in certain cases. Constrained image interpolation by optimal transport has potential applications in image registration. In particular, the proposed method for controlling the volume change has potential application in the registration of images under volume change constraints, as is the case for medical images depicting muscle movements or those with contrast-enhancing structures.

Paper Nr: 166
Title:

COMPUTATIONAL SYMMETRY VIA PROTOTYPE DISTANCES FOR SYMMETRY GROUPS CLASSIFICATION

Authors:

M. Agustí-Melchor, Ángel Rodas-Jordá and J. M. Valiente-González

Abstract: Symmetry is an abstract concept that is easily noticed by humans and as a result designers make new creations based on its use, e.g. textile and tiles. Images of these designs belong to a more general group called wallpaper images, and these images exhibit a repetitive pattern on a 2D space. In this paper, we present a novel computational framework for the automatic classification into symmetry groups of images with repetitive patterns. The existing methods in the literature, based on rules and trees, have several drawbacks because of the use of thresholds and heuristics. Also, there is no way to give some measurement of the classification goodness-of-fit. As a consequence, these methods have shown low classification values when images exhibit imperfections due to the manufacturing process or hand made process. To deal with these problems, we propose a classification method that can obtain an automatic parameter estimation for symmetry analysis. Using this approach, the image classification is redefined as distance computation to the binary prototypes of a set of defined classes. Our experimental results improve the state of the art in symmetry group classification methods.

Short Papers
Paper Nr: 31
Title:

FAST DEPTH-INTEGRATED 3D MOTION ESTIMATION AND VISUALIZATION FOR AN ACTIVE VISION SYSTEM

Authors:

M. Salah E.-N. Shafik and Bärbel Mertsching

Abstract: In this paper, we present a fast 3D motion parameter estimation approach integrating the depth information acquired by a stereo camera head mounted on a mobile robot. Afterwards, the resulting 3D motion parameters are used to generate and accurately position motion vectors of the generated depth sequence in the 3D space using the geometrical information of the stereo camera head. The proposed approach has successfully detected and estimated predefined motion patterns such as motion in the Z direction and motion vectors pointing to the robot which is very important to overcome typical problems in autonomous mobile robotic vision such as collision detection and inhibition of the ego-motion defects of a moving camera head. The output of the algorithm is part of a multi-object segmentation approach implemented in an active vision system.

Paper Nr: 37
Title:

AUTOMATIC FACADE IMAGE RECTIFICATION AND EXTRACTION USING LINE SEGMENT FEATURES

Authors:

Chun Liu

Abstract: Recently, image-based facade modeling has attracted significant attention for 3D urban reconstruction because of the low cost of data acquisition and the large number of available image processing tools. Image-based facade modeling normally requires a rectified and segmented input image covering only the facade region. Yet this requirement involves heavy manual work on the perspective rectification and facade region extraction. In this paper, we propose an automatic rectification and segmentation process using line segment features. The raw input image is first rectified with the help of two vanishing points estimated from line segments in the image. Then, based on the spatial distribution of the line segments and the luminance feature, the facade region is extracted from the sky, the road and the nearby buildings. The experiments show that this method works successfully on Paris urban buildings.

Paper Nr: 47
Title:

MULTI-RESOLUTION VIRTUAL PLANE BASED 3D RECONSTRUCTION USING INERTIAL-VISUAL DATA FUSION

Authors:

Hadi Aliakbarpour and Jorge Dias

Abstract: In this paper, a novel 3D volumetric reconstruction method is proposed, based on the fusion of inertial and visual information and applying a quadtree-based compression algorithm. A network of cameras is used to observe the scene. Beside each camera, a fusion-based virtual camera is defined, and the transformations among the cameras are estimated. A set of horizontal virtual planes is then passed through the volumetric scene. The intersections of these virtual planes with the object within the scene, in other words the virtual registration layers, are obtained using the concept of homography. Quadtree-based decomposition is then applied to the registration layers, and the resulting 2D layers are stacked to produce the 3D reconstruction of the object. The proposed method can adjust the compactness or the resolution of the result, which can be defined with respect to the application or the storage resources, especially when the intention is to keep the sequence of 3D models in a dynamic scene.

Paper Nr: 58
Title:

PERSPECTIVE-THREE-POINT (P3P) BY DETERMINING THE SUPPORT PLANE

Authors:

Zhaozheng Hu and Takashi Matsuyama

Abstract: This paper presents a new approach to solving the classic perspective-three-point (P3P) problem. The basic idea is to determine the support plane, which is defined by the three control points. Computation of the plane normal is formulated as searching for the maximum likelihood on the Gaussian hemisphere by exploiting the geometric constraints of three known angles and length ratios from the control points. The distances of the control points are then computed from the normal and the calibration matrix by homography decomposition. The proposed algorithm has been tested with real image data. The computation errors for the plane normal and the distances are less than 0.35 degrees and 0.8 cm, respectively, at camera-to-plane distances of 1 to 2 m. The multiple solutions to the P3P problem are also illustrated.

Paper Nr: 86
Title:

ACCURATE EYE CENTRE LOCALISATION BY MEANS OF GRADIENTS

Authors:

Fabian Timm and Erhardt Barth

Abstract: The estimation of the eye centres is used in several computer vision applications such as face recognition or eye tracking. Especially for the latter, systems that are remote and rely on available light have become very popular and several methods for accurate eye centre localisation have been proposed. Nevertheless, these methods often fail to accurately estimate the eye centres in difficult scenarios, e.g. low resolution, low contrast, or occlusions. We therefore propose an approach for accurate and robust eye centre localisation by using image gradients. We derive a simple objective function, which only consists of dot products. The maximum of this function corresponds to the location where most gradient vectors intersect and thus to the eye’s centre. Although simple, our method is invariant to changes in scale, pose, contrast and variations in illumination. We extensively evaluate our method on the very challenging BioID database for eye centre and iris localisation. Moreover, we compare our method with a wide range of state of the art methods and demonstrate that our method yields a significant improvement regarding both accuracy and robustness.
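The dot-product objective described in this abstract can be sketched in Python as follows. This is a hedged illustration rather than the authors' code: squaring each dot product and averaging is an assumption about the exact form of the objective, and the names `centre_score`, `locate_centre` and `synthetic_circle` are hypothetical helpers. The synthetic data stands in for an iris boundary, whose gradients point radially away from the centre.

```python
import math


def centre_score(c, pixels):
    """Mean squared dot product between unit displacements (pixel - c) and unit gradients.

    pixels is a list of ((x, y), (gx, gy)) pairs with unit-length gradients.
    """
    cx, cy = c
    total, count = 0.0, 0
    for (x, y), (gx, gy) in pixels:
        dx, dy = x - cx, y - cy
        norm = math.hypot(dx, dy)
        if norm == 0:
            continue  # the candidate coincides with this pixel: no direction defined
        dot = (dx / norm) * gx + (dy / norm) * gy
        total += dot * dot
        count += 1
    return total / count if count else 0.0


def locate_centre(candidates, pixels):
    """Return the candidate position with the highest score."""
    return max(candidates, key=lambda c: centre_score(c, pixels))


def synthetic_circle(centre, radius, n=36):
    """Boundary points of a circle with outward-pointing unit gradients."""
    cx, cy = centre
    pixels = []
    for k in range(n):
        a = 2 * math.pi * k / n
        position = (cx + radius * math.cos(a), cy + radius * math.sin(a))
        pixels.append((position, (math.cos(a), math.sin(a))))
    return pixels


# Usage: exhaustive search over a small grid of candidate centres.
pixels = synthetic_circle((5, 7), radius=3)
grid = [(x, y) for x in range(11) for y in range(15)]
best = locate_centre(grid, pixels)
```

On a real image the gradients would be computed from the image itself and normalised before scoring; that step is omitted here.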

Paper Nr: 95
Title:

LEARNING OBJECT DETECTION USING MULTIPLE NEURAL NETWORKS

Authors:

Ignazio Gallo and Angelo Nodari

Abstract: Multiple neural network systems have become popular techniques for tackling complex tasks, often giving improved performance compared to a single network. In this study we propose an innovative detection algorithm in image analysis using a multiple neural network approach where many neural networks are jointly used to solve the object detection problem. We use a group of networks configured with different parameters and features, then combine them in order to obtain new networks. The topology of the set of neural networks is statically configured as a tree whose root node outputs the detection map. This work represents a preliminary study through which we want to move from detection to segmentation and recognition of objects of interest. We have compared our model with other detection algorithms using a standard dataset and the results are encouraging. The results highlight the advantages and problems that will guide the evolution of the proposed model.

Paper Nr: 108
Title:

SURFACE RECONSTRUCTION FOR GENERATING DIGITAL MODELS OF PROSTHESIS

Authors:

Luiz C. M. de Aquino, Diego A. T. Q. Leite, Gilson A. Giraldi, Jaime S. Cardoso, Paulo Sergio S. Rodrigues and Luiz A. P. Neves

Abstract: The restoration and recovery of a defective skull can be performed through operative techniques to implant a customized prosthesis. Recently, image processing and surface reconstruction methods have been used for digital prosthesis design. In this paper we present a framework for prosthesis modeling. Firstly, we take the computed tomography (CT) of the skull and perform bone segmentation by thresholding. The obtained binary volume is processed by morphological operators, frame-by-frame, to get the inner and outer boundaries of the bone. These curves are used to initialize a 2D deformable model that generates the prosthesis boundary in each CT frame. In this way, we can fill the prosthesis volume which is the input for a marching cubes technique that computes the digital model of the target geometry. In the experimental results we demonstrate the potential of our technique and compare it with a related one.

Paper Nr: 136
Title:

NOVEL ADAPTIVE EDGE DETECTION ALGORITHM USING HAAR-LIKE FEATURES

Authors:

Mircea Popa, Andras Majdik and Gheorghe Lazea

Abstract: This paper presents an adaptive method for the edge detection problem, based on the algorithm of Canny. It is designed for real-scene object recognition problems in cases where, because of the complexity of the environment’s structure and the time-varying illumination, regular edge detection algorithms fail to offer a good and stable response. The algorithm is based on the same principle as Canny’s method, but the hysteresis threshold values are adapted for each pixel considering the local approximation of the gradient value. The gradients are approximated by Haar-like features, computed with integral images in constant time. In terms of edge extraction, the proposed algorithm improves the performance obtained with the method of Canny in complex lighting conditions. It also gives the user more control over the detection process and ensures a more stable result with respect to the illumination conditions. The results of the proposed algorithm are compared with those obtained with the classic method of Canny for edge detection in real scenarios. Both implementations use the speed-optimized functions of the Open Computer Vision (OpenCV) Library.
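The constant-time evaluation of Haar-like features with integral images, mentioned in this abstract, can be illustrated with a short Python sketch. It covers only that building block; the per-pixel threshold adaptation is not specified in the abstract and is not reproduced. The function names `integral_image`, `box_sum` and `haar_gradient_x` are hypothetical, and the two-rectangle horizontal feature is just one common choice of Haar-like feature.

```python
def integral_image(img):
    """Summed-area table with one extra row and column of zeros.

    ii[y][x] holds the sum of img[r][c] for all r < y and c < x.
    """
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box_sum(ii, x0, y0, x1, y1):
    """Sum of img[y][x] for x0 <= x < x1 and y0 <= y < y1, using four lookups."""
    return ii[y1][x1] - ii[y0][x1] - ii[y1][x0] + ii[y0][x0]


def haar_gradient_x(ii, x, y, half):
    """Two-rectangle Haar-like feature: right block minus left block around (x, y).

    The caller must keep the window of size (2*half) x (2*half) inside the image.
    """
    left = box_sum(ii, x - half, y - half, x, y + half)
    right = box_sum(ii, x, y - half, x + half, y + half)
    return right - left


# Usage: a 4x4 image with a vertical step edge between columns 1 and 2.
step = [[0, 0, 10, 10] for _ in range(4)]
ii = integral_image(step)
response = haar_gradient_x(ii, 2, 2, 2)
```

Whatever the window size, each feature costs eight table lookups, which is what makes a per-pixel local gradient estimate affordable.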

Paper Nr: 150
Title:

FACE RECONSTRUCTION WITH STRUCTURED LIGHT

Authors:

John Congote, Iñigo Barandiaran, Javier Barandiaran, Marcos Nieto and Oscar Ruiz

Abstract: This article presents a methodology for the reconstruction of 3D faces which is based on stereoscopic images of the scene using active and passive surface reconstruction. A sequence of Gray patterns is generated, which are projected onto the scene and their projection recorded by a pair of stereo cameras. The images are rectified to make their epipolar planes coincide and so to generate a stereo map of the scene. An algorithm for stereo matching is applied, whose result is a bijective mapping between subsets of the pixels of the images. A particular connected subset of the images (e.g. the face) is selected by a segmentation algorithm. The stereo mapping is applied to such a subset and enables the triangulation of the two image readings, thereby yielding the (x, y, z) points of the face, which in turn allow the reconstruction of the triangular mesh of the face. Since the surface might have holes, bilateral filters are applied to fill them. The algorithms are tested in real conditions and we evaluate their performance with virtual datasets. Our results show a good reconstruction of the faces and an improvement over the results of passive systems.

Paper Nr: 152
Title:

PALMPRINT RECOGNITION BASED ON PRINCIPAL COMPONENT NEURAL NETWORK

Authors:

Azadeh Ghandehari

Abstract: Principal Component Analysis (PCA) has proven to be very successful in image recognition. PCA can be performed by a neural network called a Principal Component Neural Network (PCNN). A PCNN can extract the principal components in its weights by learning from the input data. This neural model does not need large memory matrices such as the covariance matrix or the entire data matrix. This feature makes PCNN more economical in CPU time and memory than conventional PCA. In this paper we propose a PCNN model for palmprint feature extraction and use the Euclidean distance metric for classification. The results of our experiments indicate that PCNN can extract the principal components effectively and further demonstrate that there is a high degree of similarity between PCNN and PCA classification performance.
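How a network can learn a principal component in its weights without forming a covariance matrix can be sketched with Oja's rule, a classical single-neuron learning rule. The abstract does not say which learning rule the PCNN uses, so Oja's rule is a swapped-in illustration, and the function name `oja_first_component`, the learning rate and the synthetic 2D data are assumptions.

```python
import math
import random


def oja_first_component(samples, learning_rate=0.05, epochs=20, seed=0):
    """Estimate the first principal direction of zero-mean samples with Oja's rule.

    Each update uses one sample only: w <- w + lr * y * (x - y * w), with y = w . x,
    so no covariance matrix is ever stored.
    """
    rng = random.Random(seed)
    dim = len(samples[0])
    w = [rng.uniform(-1, 1) for _ in range(dim)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    for _ in range(epochs):
        for x in samples:
            y = sum(wi * xi for wi, xi in zip(w, x))
            w = [wi + learning_rate * y * (xi - y * wi) for wi, xi in zip(w, x)]
    return w


# Usage: synthetic zero-mean data spread mainly along the (1, 1) direction.
data_rng = random.Random(1)
samples = []
for _ in range(200):
    t = data_rng.uniform(-1, 1)      # large spread along (1, 1)
    n = data_rng.uniform(-0.1, 0.1)  # small spread along (1, -1)
    samples.append((t + n, t - n))

w = oja_first_component(samples)
alignment = abs(w[0] + w[1]) / (math.sqrt(2) * math.hypot(w[0], w[1]))
```

Here `alignment` is the absolute cosine between the learned weight vector and the (1, 1) direction; a value close to 1 means the neuron has found the dominant direction of the data.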

Paper Nr: 162
Title:

A PROCEDURE FOR AUTOMATED REGISTRATION OF FINE ART IMAGES IN VISIBLE AND X-RAY SPECTRAL BANDS

Authors:

Dmitry Murashov

Abstract: This paper presents a two-step procedure for automated registration of photographs and roentgenograms of fine art paintings. Grayscale local maxima in blurred images are used as the control points. The coherent point drift (CPD) point set matching algorithm is combined with an iterative procedure for excluding false correspondences. A general projective transformation model is used for image registration. The precise step of the procedure reduces the registration error obtained at the coarse step.

Paper Nr: 163
Title:

SIMULTANEOUS ESTIMATION OF LIGHT SOURCES POSITIONS AND CAMERA ROTATION

Authors:

Masahiro Oida, Fumihiko Sakaue and Jun Sato

Abstract: For mixed reality and other applications, it is very important to achieve photometric and geometric consistency in image synthesis. This paper describes a method for calibrating the camera and light source simultaneously from photometric and geometric constraints. In general, feature points in a scene are used for computing camera positions and orientations. On the other hand, if the cameras and objects are fixed to each other and move together, the changes in the shading information of the objects in images also include useful information on geometric camera motions. In this paper, we show that if we use both shading information and feature point information, we can calibrate cameras from a smaller number of feature points than existing methods. Furthermore, it is shown that the proposed method can calibrate light sources as well as cameras. The accuracy of the proposed method is evaluated using real and synthetic images.

Paper Nr: 164
Title:

RAINDROP COMPLEMENT BASED ON EPIPOLAR GEOMETRY AND SPATIOTEMPORAL PATCHES

Authors:

Kyohei Nomoto, Fumihiko Sakaue and Jun Sato

Abstract: In this paper, we propose a method for detecting and complementing raindrops on the mirrors and windows of cars. Raindrops on windows and mirrors disturb the driver's view. In general, they are removed using wipers and special devices. However, these devices cannot be used in all cases. In our proposed method, images of mirrors and windows are taken by a camera and the raindrops are complemented virtually in the images. The method is based on auto-epipolar geometry and the concept of the spatiotemporal image. Using the method, we can observe clear windows and mirrors as long as we can capture them with a camera.

Paper Nr: 167
Title:

TWO-LEVEL STRATEGY FOR IMAGE BOUNDARY DETECTION

Authors:

Karin S. Komati, Evandro O. T. Salles and Mario Sarcinelli-Filho

Abstract: A new method for boundary detection in natural images is here proposed, consisting of two levels, or two-stage sequential processes: embedded integration and post-processing integration. In the embedded integration, two different methods to measure homogeneity in region-growing technique are integrated, based on a global statistical property: the shape of the power spectrum of the image being analyzed. One homogeneity measure is the J value (provided by the classical JSEG algorithm) and the second measure is a multifractal measurement. This first step provides a region extraction. In the second level, edge information is extracted by a classical method, and integrated with region information. This structure, called KSS, eliminates false boundaries in the region map, guided by the edge map, and the noise in edge map as well, now guided by the region map, thus taking the advantage of their complementary nature. Experiments on a large dataset of natural color images show that the result of such two-level strategy matches the human perception better than the individual methods, quantitatively and qualitatively speaking.

Paper Nr: 169
Title:

TEXTURE CLASSIFICATION USING SPARSE K-SVD TEXTON DICTIONARIES

Authors:

Muhammad Rushdi and Jeffrey Ho

Abstract: This paper addresses the problem of texture classification under unknown viewpoint and illumination variations. We propose an approach that combines sparse K-SVD and texton-based representations. Starting from an analytic or data-driven base dictionary, a sparse dictionary is iteratively estimated from the texture data using the doubly-sparse K-SVD algorithm. Then, for each texture image, K-SVD representations of pixel neighbourhoods are computed and used to assign the pixels to textons. Hence, the texture image is represented by the histogram of its texton map. Finally, a test image is classified by finding the closest texton histogram using the chi-squared distance. Initial experiments on the CUReT database show high classification rates that compare well with Varma-Zisserman MRF results.
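The final classification step in this abstract, nearest texton histogram under the chi-squared distance, can be sketched in Python as follows. The dictionary learning and texton assignment stages are not reproduced. The function names, the class labels and the histogram values are hypothetical, and the factor of 0.5 is one common convention for the chi-squared distance, not necessarily the one used in the paper.

```python
def chi_squared_distance(h1, h2):
    """Chi-squared distance between two histograms of equal length.

    Bins that are empty in both histograms contribute nothing.
    """
    total = 0.0
    for a, b in zip(h1, h2):
        if a + b > 0:
            total += (a - b) ** 2 / (a + b)
    return 0.5 * total


def classify(test_hist, labelled_hists):
    """Return the label of the training histogram closest to test_hist."""
    label, _ = min(labelled_hists,
                   key=lambda item: chi_squared_distance(test_hist, item[1]))
    return label


# Usage: two hypothetical texture classes described by 3-bin texton histograms.
models = [
    ("brick", [0.7, 0.2, 0.1]),
    ("grass", [0.1, 0.3, 0.6]),
]
predicted = classify([0.6, 0.3, 0.1], models)
```

Because each term is divided by the sum of the two bin values, the same absolute difference counts for more in sparsely populated bins than in heavily populated ones.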

Paper Nr: 49
Title:

AFFINE SPHARM REGISTRATION - Neural Estimation of Affine Transformation in Spherical Domain

Authors:

Valentina Pedoia, Ignazio Gallo and Elisabetta Binaghi

Abstract: In this work we propose an algorithm to perform affine 3D surface registration using shape modeling based on SPHerical HARMonics, called SPHARM. In existing SPHARM registration algorithms the alignment is obtained using the rotation properties, which allow the 3D surface rotation to be performed by transforming only the spherical coefficients. The major limitation is that this approach aligns the surface only by rotation. We propose a method that generalizes this solution without losing the advantage of performing the whole registration process in the spherical domain. An estimate of the coefficient transformation that guarantees an affinity in the spatial domain is obtained by regression, using a set of RBF networks. The description of the 3D surface with the spherical harmonic coefficients is brief but comprehensive and directly provides a metric of shape similarity. Therefore, the registration is obtained by aligning the SPHARM models through the minimization of the root mean square distance between the coefficient vectors. Many experiments are performed to test the affine SPHARM registration algorithm, which appears efficient and effective compared with a standard registration algorithm in the spatial domain.

Paper Nr: 92
Title:

EVOLUTIONARY SUPPORT VECTOR MACHINE FOR PARAMETERS OPTIMIZATION APPLIED TO MEDICAL DIAGNOSTIC

Authors:

Ahmed Kharrat, Nacéra Benamrane, Mohamed Ben Messaoud and Mohamed Abid

Abstract: Parameter selection is very important for successful modelling of the input–output relationship in a function classification model. In this study, a support vector machine (SVM) has been used as a function classification tool for accurate segregation, and a genetic algorithm (GA) has been utilised for optimisation of the parameters of the SVM model. Parameter optimisation for the SVM is applied with only five selected features as input. The five selected features are mean of contrast, mean of homogeneity, mean of sum average, mean of sum variance and range of autocorrelation. The performance of the proposed model has been compared with a statistical approach. Although the Grid algorithm requires less processing time, it does not seem to be efficient. Testing results show that the proposed GA–SVM model outperforms the statistical approach in terms of accuracy and computational efficiency.

Paper Nr: 94
Title:

REGION-BASED OBJECTIVE EVALUATION OF POLYGONAL MESH SEGMENTATION METHODS

Authors:

Amira Zguira, Narjes Doggaz and Ezzeddine Zagrouba

Abstract: In this paper, we propose a new region-based objective evaluation approach for polygonal mesh segmentation algorithms. This approach is derived from 2D image segmentation similarity measures. We quantify an evaluation criterion for each type of segmented mesh region, based on a method that classifies the mesh into convex, concave and planar regions. We apply this approach to eight well-selected existing algorithms, using a heterogeneous ground truth. We present and discuss the evaluation results of these techniques by taking into account the corresponding object classes in every type of region. This provides a better understanding of the strengths and weaknesses of each technique as a function of each mesh region type, with the aim of supporting a better choice of segmentation algorithm for different applications.

Paper Nr: 100
Title:

DETECTION OF POINTS OF INTEREST FOR GEODESIC CONTOURS - Application on Road Images for Crack Detection

Authors:

Sylvie Chambon

Abstract: A new algorithm for the automatic extraction of thin structures in textured images is introduced and, more specifically, applied to the detection of road cracks. The method is based on two steps: the first consists in detecting points of interest inside the thin structure, whereas the second connects the points with a geodesic contour process. The main contribution of this work is the study of automatic detection of points of interest inside thin structures in a highly textured background. The results are compared with a Markovian segmentation.
Download

Paper Nr: 112
Title:

RECTANGULAR EMPTY PARKING SPACE DETECTION USING SIFT BASED CLASSIFICATION

Authors:

Harish Bhaskar, Naoufel Werghi and Saeed Al-Mansoori

Abstract: In this paper, we describe a method of combining rectangle detection and scale invariant feature transform (SIFT) analysis for empty parking space detection. A parking space in a parking lot is represented as a rectangular region of pixels in an image captured from an aerial camera. Detecting rectangular parking spaces in a new image involves an alternating scheme of extracting peaks from the Radon transform for the whole image and filtering them against specific geometric and spatial constraints. We then compute SIFT descriptors from these detected rectangular parking spaces and further apply supervised classification methods for detecting empty parking spaces. We demonstrate the performance of our model on several synthetic and real datasets.
Download

Paper Nr: 151
Title:

AUTOMATIC MESH SEGMENTATION USING ATLAS PROJECTION AND THIN PLATE SPLINE - Application for a Segmentation of Skull Ossicles

Authors:

Makram Mestiri, Sami Bourouis and Kamel Hamrouni

Abstract: Mesh segmentation has become a crucial step in many computer graphics applications. This paper presents a new method for three-dimensional atlas-based mesh segmentation using a thin plate spline (TPS) approach and a new FNN algorithm. The method consists of three steps. First, we apply a rigid registration between two meshes: the atlas and the mesh to segment. The second step is the application of an elastic registration using the thin plate spline method. The last step is the identification of the different regions to segment the mesh using our FNN algorithm. We tested the performance of our method on synthetic images and on a real human skull and found that the preliminary results obtained are satisfactory.
Download

Paper Nr: 180
Title:

NON-LINEAR LOW-LEVEL IMAGE PROCESSING IMPROVEMENT BY A PURPOSELY INJECTION OF NOISE

Authors:

A. Histace

Abstract: It is progressively realized that noise can play a constructive role in nonlinear formation processes. The starting point of the investigation of such useful noise effects has been the study of the Stochastic Resonance (SR) effect. The goal of this article is to propose a direct application of the SR phenomenon in image processing, since interest in SR in that domain is growing. As a continuation of previous work already presented in the literature by the author, we propose to show quantitatively that a purposeful injection of Gaussian noise into a classical nonlinear image process, such as image binarization, can play a constructive role. This work can also be interpreted as a first step towards a better understanding of SR in image processing, relating it to classical results obtained in a nonlinear signal processing framework for classical low-level image processing tools.
Download

Area 2 - Image Understanding

Full Papers
Paper Nr: 24
Title:

ACTIVE OBJECT CATEGORIZATION ON A HUMANOID ROBOT

Authors:

Vignesh Ramanathan and Axel Pinz

Abstract: We present a Bag of Words-based active object categorization technique implemented and tested on a humanoid robot. The robot is trained to categorize objects that are handed to it by a human operator. The robot uses hand and head motions to actively acquire a number of different views. A view planning scheme using entropy minimization reduces the number of views needed to achieve a valid decision. Categorization results are significantly improved by active elimination of background features using robot arm motion. Our experiments cover both categorization when the object is handed to the robot in a fixed pose at training and testing, and object pose independent categorization. Results on a 4-class object database demonstrate the classification efficiency, a significant gain from multi-view compared to single-view classification, and the advantage of view planning. We conclude that humanoid robotic systems can be successfully applied to actively categorize objects - a task with many potential applications ranging from edutainment to active surveillance.
Download
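
The entropy-minimizing view planning mentioned in the abstract of Paper Nr: 24 can be illustrated with a minimal sketch. It is not the authors' implementation: the discrete observation model `view_models`, the view names and the function names are assumptions made for illustration. Each candidate view is scored by the expected Shannon entropy of the category posterior after looking from that view, and the view with the lowest expected entropy is chosen.

```python
import math


def entropy(dist):
    """Shannon entropy (in bits) of a discrete distribution."""
    return -sum(p * math.log2(p) for p in dist if p > 0)


def posterior(prior, likelihood_row):
    """Bayes update; likelihood_row[c] = P(observation | category c)."""
    joint = [p * l for p, l in zip(prior, likelihood_row)]
    total = sum(joint)
    return [j / total for j in joint]


def expected_entropy(prior, view_model):
    """Expected posterior entropy for one view.

    view_model[o][c] = P(observation o | category c) for this view
    (a hypothetical discrete observation model).
    """
    result = 0.0
    for row in view_model:
        p_obs = sum(p * l for p, l in zip(prior, row))
        if p_obs > 0:
            result += p_obs * entropy(posterior(prior, row))
    return result


def next_best_view(prior, view_models):
    """Pick the view whose observation is expected to reduce uncertainty most."""
    return min(view_models, key=lambda v: expected_entropy(prior, view_models[v]))


if __name__ == "__main__":
    prior = [0.5, 0.5]
    view_models = {
        "front": [[0.5, 0.5], [0.5, 0.5]],  # uninformative view
        "side": [[0.9, 0.1], [0.1, 0.9]],   # discriminative view
    }
    print(next_best_view(prior, view_models))
```

In this toy setting the uninformative view leaves the posterior unchanged, so the planner prefers the discriminative one.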

Paper Nr: 33
Title:

EVENT DETECTION IN A SMART HOME ENVIRONMENT USING VITERBI FILTERING AND GRAPH CUTS IN A 3D VOXEL OCCUPANCY GRID

Authors:

Martin Hofmann, Moritz Kaiser, Nico Lehment and Gerhard Rigoll

Abstract: In this paper we present a system for detecting unusual events in smart home environments. A primary application of this is to prolong independent living for elderly people at their homes. We show how to effectively combine information from multiple heterogeneous sensors which are typically present in a smart home scenario. Data fusion is done in a 3D voxel occupancy grid. Graph Cuts are used to accurately reconstruct people in the scene. Additionally we present a joint multi object Viterbi tracking framework, which allows tracking of all people, and simultaneously detecting critical events such as fallen persons.
Download

Paper Nr: 43
Title:

AN ADAPTIVE INTERFACE FOR ACTIVE LOCALIZATION

Authors:

Kenji Okuma, Eric Brochu, David G. Lowe and James J. Little

Abstract: Thanks to large-scale image repositories, vast amounts of data for object recognition are now easily available. However, acquiring training labels for arbitrary objects still requires tedious and expensive human effort. This is particularly true for localization, where humans must not only provide labels, but also training windows in an image. We present an approach for reducing the number of labelled training instances required to train an object classifier and for assisting the user in specifying optimal object location windows. As part of this process, the algorithm performs localization to find bounding windows for training examples that are best aligned with the current classification function, which optimizes learning and reduces human effort. To test this approach, we introduce an active learning extension to a latent SVM learning algorithm. Our user interface for training object detectors employs real-time interaction with a human user. Our active learning system provides a mean performance improvement of 4.5% in the average precision over a state of the art detector on the PASCAL Visual Object Classes Challenge 2007 with an average of just 40 minutes of human labelling effort per class.
Download

Paper Nr: 44
Title:

3D OBJECT CATEGORIZATION WITH PROBABILISTIC CONTOUR MODELS - Gaussian Mixture Models for 3D Shape Representation

Authors:

Kerstin Pötsch and Axel Pinz

Abstract: We present a probabilistic framework for learning 3D contour-based category models represented by Gaussian Mixture Models. This idea is motivated by the fact that even small sets of contour fragments can carry enough information for a categorization by a human. Our approach represents an extension of 2D shape based approaches towards 3D to obtain a pose-invariant 3D category model. We reconstruct 3D contour fragments and generate what we call ‘3D contour clouds’ for specific objects. The contours are modeled by probability densities, which are described by Gaussian Mixture Models. Thus, we obtain a probabilistic 3D contour description for each object. We introduce a similarity measure between two probability densities which is based on the probability of intra-class deformations. We show that a probabilistic model allows for flexible modeling of shape by local and global features. Our experimental results show that even with small inter-class difference it is possible to learn one 3D Category Model against another category and thus demonstrate the feasibility of 3D contour-based categorization.
Download

Paper Nr: 45
Title:

MONOCULAR RECTANGLE RECONSTRUCTION - Based on Direct Linear Transformation

Authors:

Cornelius Wefelscheid, Tilman Wekel and Olaf Hellwich

Abstract: 3D reconstruction is an important field in computer vision. Many approaches are based on multiple images of a given scene. Using only a single image is far more challenging. Monocular image reconstruction can still be achieved by using regular and symmetric structures, which often appear in human environments. In this work we derive two schemes to recover 3D rectangles based on their 2D projections. The first method improves a commonly known standard geometric derivation, while the second is a new algebraic solution based on direct linear transformation (DLT). In a second step, the solutions obtained by both methods serve as seeding points for an iterative linear least squares optimization technique. The robustness of the reconstruction to noise is shown. An insightful thought experiment investigates the ambiguity of the rectangle identification. The presented methods have various potential applications which cover a wide range of computer vision topics such as single image based reconstruction, image registration or camera path estimation.
Download

Paper Nr: 74
Title:

HUMAN ACTION RECOGNITION USING DIRECTION AND MAGNITUDE MODELS OF MOTION

Authors:

Yassine Benabbas, Samir Amir, Adel Lablack and Chabane Djeraba

Abstract: This paper proposes an approach that uses direction and magnitude models to perform human action recognition from videos captured using monocular cameras. A mixture distribution is computed over the motion orientations and magnitudes of optical flow vectors at each spatial location of the video sequence. This mixture is estimated using an online k-means clustering algorithm. Thus, a sequence model which is composed of a direction model and a magnitude model is created by circular and non-circular clustering. Human actions are recognized via a metric based on the Bhattacharyya distance that compares the model of a query sequence with the models created from the training sequences. The proposed approach is validated using two public datasets in both indoor and outdoor environments with low and high resolution videos.
Download
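
As an illustration of the matching step described in the abstract of Paper Nr: 74, the following sketch compares histograms with the standard Bhattacharyya distance and assigns a query to the nearest training model. It is a simplification under stated assumptions: the paper's metric is only said to be "based on" this distance, and the plain orientation histograms, the labels and the function names used here are hypothetical.

```python
import math


def bhattacharyya_distance(p, q):
    """Bhattacharyya distance between two non-negative histograms.

    Both histograms are normalised to sum to 1 before comparison.
    """
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    sum_p, sum_q = sum(p), sum(q)
    if sum_p <= 0 or sum_q <= 0:
        raise ValueError("histograms must have positive mass")
    # Bhattacharyya coefficient: overlap between the two distributions.
    bc = sum(math.sqrt((a / sum_p) * (b / sum_q)) for a, b in zip(p, q))
    bc = min(bc, 1.0)  # guard against floating-point overshoot
    if bc == 0.0:
        return math.inf  # no overlap at all
    return -math.log(bc)


def classify(query, models):
    """Return the label of the model histogram closest to the query."""
    return min(models, key=lambda label: bhattacharyya_distance(query, models[label]))


if __name__ == "__main__":
    models = {"walk": [8, 1, 1], "wave": [1, 1, 8]}  # toy direction histograms
    print(classify([7, 2, 1], models))
```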

Paper Nr: 105
Title:

TEMPORAL BAG-OF-WORDS - A Generative Model for Visual Place Recognition using Temporal Integration

Authors:

Hervè Guillaume, Mathieu Dubois, Emmanuelle Frenoux and Philippe Tarroux

Abstract: This paper presents an original approach for visual place recognition and categorization. The simple idea behind our model is that, for a mobile robot, using the previous frames, and not only the current one, can ease recognition. We present an algorithm for integrating the answers from different images. In this perspective, scenes are encoded by a global signature (the context of a scene) and then classified in an unsupervised way with a Self-Organizing Map. The prototypes form a visual dictionary which can roughly describe the environment. A place can then be learnt and represented through the frequency of the prototypes. This approach is a variant of the Bag-of-Words approaches used in the domain of scene classification, with the major difference that the different “words” are not taken from the same image but from temporally ordered images. Temporal integration allows us to use Bag-of-Words together with a global characterization of scenes. We evaluate our system with the COLD database. We perform a place recognition task and a place categorization task. Despite its simplicity, thanks to temporal integration of visual cues, our system achieves state-of-the-art performance.
Download

Short Papers
Paper Nr: 18
Title:

BAG OF WORDS FOR LARGE SCALE OBJECT RECOGNITION - Properties and Benchmark

Authors:

Mohamed Aly, Mario Munich and Pietro Perona

Abstract: Object Recognition in a large scale collection of images has become an important application of widespread use. In this setting, the goal is to find the matching image in the collection given a probe image containing the same object. In this work we explore the different possible parameters of the bag of words (BoW) approach in terms of their recognition performance and computational cost. We make the following contributions: 1) we provide a comprehensive benchmark of the two leading methods for BoW: inverted file and min-hash; and 2) we explore the effect of the different parameters on their recognition performance and run time, using four diverse real world datasets.
Download
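
Min-hash, one of the two BoW methods benchmarked in Paper Nr: 18, can be sketched as follows. This is a generic textbook-style illustration, not the benchmark code: the linear hash family, the parameter names and the toy visual-word IDs are assumptions. Each image is treated as a set of visual-word IDs, and the fraction of matching signature entries estimates the Jaccard similarity of two sets.

```python
import random

PRIME = 2_147_483_647  # Mersenne prime 2**31 - 1, larger than any word ID used here


def make_hash_params(num_hashes, seed=0):
    """Draw (a, b) pairs for hash functions h(w) = (a * w + b) mod PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME))
            for _ in range(num_hashes)]


def minhash_signature(word_ids, params):
    """Signature = minimum hash value of the (non-empty) set under each hash."""
    return [min((a * w + b) % PRIME for w in word_ids) for a, b in params]


def estimated_jaccard(sig_a, sig_b):
    """Fraction of signature positions on which the two images agree."""
    matches = sum(x == y for x, y in zip(sig_a, sig_b))
    return matches / len(sig_a)


def jaccard(a, b):
    """Exact Jaccard similarity, for comparison with the estimate."""
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    params = make_hash_params(200)
    probe = {3, 17, 42, 99, 256, 1024}
    candidate = {3, 17, 42, 99, 300, 2048}
    est = estimated_jaccard(minhash_signature(probe, params),
                            minhash_signature(candidate, params))
    print(f"exact={jaccard(probe, candidate):.2f} estimated={est:.2f}")
```

Signatures are much shorter than the full word sets, which is what makes this kind of scheme attractive for large collections; the estimate becomes more reliable as the number of hash functions grows.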

Paper Nr: 35
Title:

MULTI-CAMERA PEDESTRIAN DETECTION BY MEANS OF TRACK-TO-TRACK FUSION AND CAR2CAR COMMUNICATION

Authors:

Anselm Haselhoff, Lars Hoehmann, Anton Kummert, Christian Nunn, Mirko Meuter and Stefan Müller-Schneiders

Abstract: In this paper a system for fusion of pedestrian detections from multiple vehicles is presented. The application area is narrowed down to driver assistance systems, where single cameras are mounted in the moving vehicles. The main contribution of this paper is a comparison of three fusion algorithms based on real image data. The methods under review include Covariance Fusion, Covariance Intersection, and Covariance Union. An experimental setup is presented, with known ground truth positions of the detected objects. This information can be incorporated for the evaluation of the fusion methods. The system setup consists of two vehicles equipped with LANCOM® wireless access points, cameras, inertial measurement units (IMU) and IMU enhanced GPS receivers. Each vehicle detects pedestrians by means of the camera and an AdaBoost detection algorithm. The results are tracked and transmitted to the other vehicle in appropriate coordinates. Afterwards each vehicle is responsible for reasonable treatment or fusion of the detection data.
Download
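
Of the three fusion rules compared in Paper Nr: 35, Covariance Intersection has a compact closed form: the fused information matrix is a convex combination of the two input information matrices. The sketch below shows it for 2D position estimates. It is an illustration only: the weight is chosen by a simple grid search that minimises the trace of the fused covariance, which is one common choice and not necessarily the one used in the paper, and all names and numbers are hypothetical.

```python
def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def covariance_intersection(xa, Pa, xb, Pb, w):
    """Fuse two estimates with unknown cross-correlation for a weight w in [0, 1].

    P^-1 = w * Pa^-1 + (1 - w) * Pb^-1
    x    = P * (w * Pa^-1 * xa + (1 - w) * Pb^-1 * xb)
    """
    Ia, Ib = inv2(Pa), inv2(Pb)
    info = [[w * Ia[i][j] + (1 - w) * Ib[i][j] for j in range(2)]
            for i in range(2)]
    P = inv2(info)
    v = [w * (Ia[i][0] * xa[0] + Ia[i][1] * xa[1])
         + (1 - w) * (Ib[i][0] * xb[0] + Ib[i][1] * xb[1])
         for i in range(2)]
    x = [P[i][0] * v[0] + P[i][1] * v[1] for i in range(2)]
    return x, P


def fuse(xa, Pa, xb, Pb, steps=100):
    """Grid-search the weight that minimises the trace of the fused covariance."""
    best = None
    for k in range(steps + 1):
        w = k / steps
        x, P = covariance_intersection(xa, Pa, xb, Pb, w)
        trace = P[0][0] + P[1][1]
        if best is None or trace < best[0]:
            best = (trace, x, P, w)
    return best[1], best[2], best[3]


if __name__ == "__main__":
    # Two vehicles report the same pedestrian with complementary uncertainty.
    x, P, w = fuse([0.0, 0.0], [[1.0, 0.0], [0.0, 4.0]],
                   [2.0, 2.0], [[4.0, 0.0], [0.0, 1.0]])
    print(x, P, w)
```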

Paper Nr: 41
Title:

AN INDIVIDUAL PERSPECTIVE - Perceptually Realistic Depiction of Human Figures

Authors:

Martin Zavesky, Jan Wojdziak, Kerstin Kusch, Daniel Wuttig, Ingmar S. Franke and Rainer Groh

Abstract: Projection of three-dimensional space onto a two-dimensional surface relies on the computer graphics camera based in design on the camera obscura. Geometrical limitations of this model lead to perspective distortions in wide-angle projections. Including the camera model, our approach is to involve the human perception in order to create a realistic spatial impression by a two-dimensional image. The aim is to provide human-centered interfaces for an efficient and coherent communication of spatial information in virtual worlds to support avatar-mediated interaction with its need for correct depiction of human figures concerning proportion and orientation. To this end, we explain an object-based and introduce a camera-based computer graphics procedure to prevent projective distortions and misalignments.
Download

Paper Nr: 46
Title:

PALMPRINT RECOGNITION BASED ON REGIONS SELECTION

Authors:

Salma Ben Jemaa and Mohamed Hammami

Abstract: Palmprint recognition, as a reliable personal identification method, has received increasing attention and has become an area of intense research in recent years. In this paper, we propose a generic biometric system, usable with or without contact depending on the capture system, to ensure public security based on palmprint identification. This system is based on a new global approach that focuses only on the areas of the image having the most discriminating features for recognition. Experiments on two large databases, namely “CASIA-Palmprint” and “PolyU-Palmprint”, show promising results and demonstrate the effectiveness of the proposed approach.
Download

Paper Nr: 54
Title:

ROBUST FACE DETECTION IN PATIENT TRIAGE IMAGES

Authors:

Niyati Chhaya and Tim Oates

Abstract: Disaster events like the attack on the World Trade Center in New York City in 2001 and the earthquake in Haiti in 2010 result in a desperate need for family and friends to obtain information about missing people. This can be facilitated by automatically extracting textual descriptors of patients from images taken at emergency triage centers. We show that existing face detection algorithms, a necessary precursor to extracting such descriptors, perform poorly on triage images taken during disaster simulations, and propose and demonstrate the superiority of an ensemble-based face detection method that is a combination of robust skin detection and simple pattern matching face detection techniques. The addition of a template-based eye detection method further improves the accuracy of our face detection method.
Download

Paper Nr: 75
Title:

ON THE IMPORTANCE OF THE GRID SIZE FOR GENDER RECOGNITION USING FULL BODY STATIC IMAGES

Authors:

Carlos Serra-Toro, V. Javier Traver, Raúl Montoliu and José M. Sotoca

Abstract: In this paper we present a study on the importance of the grid configuration in gender recognition from whole body static images. By using a simple classifier (AdaBoost) and the well-known Histogram of Oriented Gradients features we test several grid configurations. Compared with previous approaches, which use more complicated classifiers or feature extractors, our approach outperforms them in the case of the frontal view recognition and almost equals them in the case of the mixed view (i.e. frontal and back views combined without distinction).
Download

Paper Nr: 109
Title:

WHAT DO YOU MEAN BY “SALIENCE”?

Authors:

V. Javier Traver

Abstract: In the field of computer vision, the term “salience” is being used with different or ambiguous meanings in a variety of contexts. This abuse of terminology contributes to create some confusion or misunderstanding among practitioners in computer vision, a situation which is particularly inconvenient to less experienced researchers in the field. The contribution of this paper is twofold. On the one hand, by providing a categorization of some different usages of the concept of salience, its possible meanings will be clarified. On the other hand, by providing a common framework to understand and conceptualize those different meanings, the commonalities emerge and these analogies might serve to relate the ideas and techniques across usages.
Download

Paper Nr: 114
Title:

EYE STATE ANALYSIS USING IRIS DETECTION TO EXTRACT DRIVER’S MICRO-SLEEP PERIODS

Authors:

Nawal Alioua, Aouatif Amine, Driss Aboutajdine and Mohammed Rziza

Abstract: Eye state analysis is a critical step for drowsiness detection. In this paper, we propose a robust algorithm for eye state analysis, which we incorporate into a system for driver drowsiness detection to extract micro-sleep periods. The proposed system begins with face extraction using a Support Vector Machine (SVM) face detector; then a new approach for eye state analysis based on the Circular Hough Transform (CHT) is applied to the extracted eye regions. Finally, we proceed to the drowsiness decision. This new system requires no training data at any step and no special cameras. Tests performed to evaluate the proposed driver drowsiness detection system, using real video sequences acquired by a low-cost webcam, show that the algorithm provides good results and can work in real time.
Download

Paper Nr: 125
Title:

EVALUATION OF FEATURES AND COMBINATION APPROACHES FOR THE CLASSIFICATION OF EMOTIONAL SEMANTICS IN IMAGES

Authors:

Ningning Liu, Emmanuel Dellandréa, Liming Chen and Bruno Tellez

Abstract: Recognition of emotional semantics in images is a new and very challenging research direction that is gaining more and more attention in the research community. As this is an emerging topic, publications remain relatively rare and numerous issues need to be addressed. In this paper, we investigate the efficiency of different types of features, including low-level features and proposed semantic features, for the classification of emotional semantics in images. Moreover, we propose a new approach that combines different classifiers based on Dempster-Shafer’s theory of evidence, which has the ability to handle ambiguous and uncertain knowledge such as the properties of emotions. Experiments conducted on the International Affective Picture System (IAPS) image database, which is a common stimulus set frequently used in emotion psychology research, demonstrated that the proposed approach can achieve promising results.
Download
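
The classifier combination in Paper Nr: 125 relies on Dempster-Shafer theory of evidence. A minimal sketch of Dempster's rule of combination is given below; how the paper derives mass functions from classifier outputs is not stated in the abstract, so the two mass functions and the two-class frame used here are purely hypothetical.

```python
def combine(m1, m2):
    """Dempster's rule of combination.

    m1, m2: dicts mapping frozenset focal elements to masses that sum to 1.
    Mass assigned to empty intersections (conflict) is discarded and the
    remaining masses are renormalised.
    """
    combined = {}
    conflict = 0.0
    for a, mass_a in m1.items():
        for b, mass_b in m2.items():
            intersection = a & b
            if intersection:
                combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
            else:
                conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; combination undefined")
    return {focal: mass / (1.0 - conflict) for focal, mass in combined.items()}


if __name__ == "__main__":
    POS, NEG = frozenset({"positive"}), frozenset({"negative"})
    EITHER = POS | NEG  # mass on the whole frame models ignorance
    colour_classifier = {POS: 0.6, EITHER: 0.4}
    texture_classifier = {POS: 0.5, NEG: 0.3, EITHER: 0.2}
    for focal, mass in combine(colour_classifier, texture_classifier).items():
        print(sorted(focal), round(mass, 4))
```

Assigning mass to the whole frame lets a classifier express that it is undecided, which is the property the abstract highlights for ambiguous emotional content.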

Paper Nr: 137
Title:

A PROBABILISTIC FRAMEWORK FOR PATCH BASED VEHICLE TYPE RECOGNITION

Authors:

M. S. Sarfraz and M. H. Khan

Abstract: The performance of automatic vehicle identification systems based on automatic number plate recognition can be improved radically with the addition of vehicle make and model recognition systems. Current approaches recognize make and model when the query image contains a properly segmented vehicle with no background clutter. In this paper, we present a new probabilistic patch-based framework to determine the make and model of a vehicle in the presence of background clutter and strong appearance variations, e.g. in illumination and scale. We propose a novel patch selection criterion that automatically learns a discriminative patch set corresponding to vehicle regions for each vehicle class during the training phase. In contrast to previous attempts, we intend to recognize the make and model of a vehicle in highly cluttered background images. Therefore, we have introduced a new challenging dataset of cars with cluttered backgrounds and high in-class appearance variations obtained under non-ideal conditions. Work on the proposed approach is in progress; it has shown highly competitive results on a segmented car dataset and promising results on the introduced dataset of cars.

Paper Nr: 138
Title:

A CORTICAL FRAMEWORK FOR SCENE CATEGORISATION

Authors:

J. M. F. Rodrigues and J. M. H. du Buf

Abstract: Human observers can very rapidly and accurately categorise scenes. This is context or gist vision. In this paper we present a biologically plausible scheme for gist vision which can be integrated into a complete cortical vision architecture. The model is strictly bottom-up, employing state-of-the-art models for feature extraction. It combines five cortical feature sets: multiscale lines and edges and their dominant orientations, the density of multiscale keypoints, the number of consistent multiscale regions, dominant colours in the double-opponent colour channels, and significant saliency in covert attention regions. These feature sets are processed in a hierarchical set of layers with grouping cells, which serve to characterise five image regions: left, right, top, bottom and centre. Final scene classification is obtained by a trained decision tree.
Download

Paper Nr: 156
Title:

A COMPARATIVE STUDY OF DIFFERENT CLASSIFIER SELECTION AND FUSION ALGORITHMS IN A MULTIMODAL IDENTITY VERIFICATION SYSTEM

Authors:

Vahid Sedighi, Mohammad T. Sadeghi and Josef Kittler

Abstract: A comparative study of a set of classifier selection and fusion algorithms for combining face and speech modalities in a multimodal biometric system is performed. Our text independent speaker verification system uses MFCC features along with a GMM method for speaker modelling and the GMM-UBM classifier for matching. The face verification system uses different similarity measures in two appearance based representation spaces namely PCA and LDA. The adopted authentication systems are fused at the matching score level to realise a system which can meet more challenging and varying requirements. The utility of two classifier selection methods, namely the sequential search method and Particle Swarm Optimisation (PSO) algorithm, along with two matching score fusion rules, the weighted averaging rule and SVM, is experimentally studied. In our extensive experimentation on the BANCA database, we demonstrate that by combining the PSO classifier selection method and the SVM fusion rule, the performance of the resulting decision making system is consistently superior.

Paper Nr: 170
Title:

ACTION RECOGNITION BASED ON MULTI-LEVEL REPRESENTATION OF 3D SHAPE

Authors:

Binu M. Nair and Vijayan K. Asari

Abstract: A novel algorithm is proposed for recognizing human actions using a combination of shape descriptors. Every human action is considered as a 3D space time shape, and shape descriptors based on the 3D Euclidean distance transform and the Radon transform are used for its representation. The Euclidean distance transform gives a suitable internal representation where the interior values correspond to the action performed. By taking the gradient of this distance transform, the space time shape is divided into a number of levels, with each level representing a coarser version of the original space time shape. Then, at each level and at each frame, the Radon transform is applied, from which action features are extracted. The action features are the R-Transform feature set, which gives the posture variations of the human body with respect to time, and the R-Translation vector set, which gives the translatory variation. The combination of these action features is used in a nearest neighbour classifier for action recognition.
Download

Paper Nr: 177
Title:

THRESHOLD CORRECTION OF DOCUMENT IMAGE BINARIZATION FOR TEXT EXTRACTION

Authors:

Hiroshi Tanaka, Yusaku Fujii and Yoshinobu Hotta

Abstract: In this paper, a simple threshold correction method for document image binarization for text extraction is presented. This method enhances the binary image of characters, which is often adversely influenced by neighboring strong pixels or background noise. The threshold correction method is based on a similar method applied to ruled-line extraction presented by the author, and is claimed to be effective for text extraction. The author also reveals the relationship between the effectiveness of the method and the image resolution.
Download

Paper Nr: 10
Title:

A NEW OBJECT RECOGNITION SYSTEM

Authors:

Nikolai Sergeev and Guenther Palm

Abstract: This paper presents a new 2D object recognition system. The object representation used by the system is rotation, translation, scaling and reflection invariant. The system is highly robust to partial occlusion, deformation and perspective change. The last makes it applicable to 3D tasks. Color information can be ignored as well as combined with form representation. The boundary of an object to be recognized doesn’t need to be path-connected. The time demand to learn a new object doesn’t depend on the number of objects already learned. No object segmentation prior to recognition is needed. To evaluate the system the 3D object library COIL-100 was used.
Download

Paper Nr: 60
Title:

NEW OPTICAL ILLUSION BY ANIMATING JUDD ILLUSION USING SCALABLE VECTOR GRAPHICS

Authors:

Teluhiko Hilano and Kazuhisa Yanaka

Abstract: We present a method of creating a new illusion with motion, accomplished by gradually changing the parameters of an existing still-image illusion. We applied this method to the well-known Judd Illusion, gradually changing the angle of the tail of the arrow with SVG. We thereby found a new animated optical illusion in which volunteers perceived that the main segment was moving horizontally, although it actually remained stationary.
Download

Paper Nr: 66
Title:

AUTOMATIC SHAKE TO ENHANCE FRASER-WILCOX ILLUSIONS

Authors:

Kazuhisa Yanaka, Ryuto Mitsuhashi and Teluhiko Hilano

Abstract: The Fraser-Wilcox illusion, which is an optical illusion first found by Fraser and Wilcox in 1979 and later classified into the peripheral drift illusion that was presented in 1997, is the illusion that a disk drawn on a still image looks as if it is rotating. Recently, Kitaoka proposed an optimized Fraser-Wilcox illusion type V in which a stronger illusion can be perceived. This proposal has attracted a great deal of attention, but not everyone can see the illusion. It is well known that the effect of some existing illusions is reinforced by shaking the image by hand, and we therefore developed a system in which a still image displayed on the screen of an ordinary PC can be shaken automatically by using our software. Experimental results demonstrated that the strength of some types of Fraser-Wilcox illusions can be enhanced considerably by using the proposed system.
Download

Paper Nr: 89
Title:

REAL TIME FALL DETECTION AND POSE RECOGNITION IN HOME ENVIRONMENTS

Authors:

Jerry Aertssen, Maja Rudinac and Pieter P. Jonker

Abstract: Falls are one of the major obstacles to independent living for elderly people; their impact can be severely reduced by introducing home monitoring systems that raise the alarm in case of emergency. In this paper we present an inexpensive and fast system for fall detection and dangerous action monitoring in home environments. Our system is equipped only with a single camera placed on the ceiling, and it performs room monitoring based on motion information. After background subtraction, motion information is extracted using the method of Motion History Images and analysed to detect important actions. We propose to model actions as the shape deformations of the motion history image over time. Every action is defined by specific shape parameters taken at several moments in time. Model shapes are extracted in offline analysis and used for comparison in room monitoring. For testing, we designed a special room in which we monitored, in various environmental conditions, a total of four different actions that are dangerous for elderly people: “walking”, “falling”, “bending” and “collapsing”. The results obtained show that our system can detect dangerous actions in real time with high recognition rates and achieves better performance compared to state-of-the-art methods that use similar techniques. The results encourage us to implement and test this system in real hospital environments.
Download
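
The Motion History Image used in Paper Nr: 89 can be sketched with the commonly used update rule: a pixel that moves in the current frame is set to a maximum value `tau`, and every other pixel decays by one per frame. The sketch below is an illustration only. It detects motion by simple frame differencing, whereas the paper uses background subtraction, and the threshold, `tau` and the toy frames are assumptions.

```python
def motion_mask(prev_frame, curr_frame, threshold):
    """Mark pixels whose grey value changed by more than threshold.

    Frame differencing is a stand-in here for the background subtraction
    step mentioned in the abstract.
    """
    return [[abs(c - p) > threshold for p, c in zip(prev_row, curr_row)]
            for prev_row, curr_row in zip(prev_frame, curr_frame)]


def update_mhi(mhi, mask, tau):
    """One Motion History Image update step.

    Moving pixels are reset to tau; all others decay by 1 down to 0, so
    brighter values correspond to more recent motion.
    """
    return [[tau if moving else max(0, h - 1) for h, moving in zip(h_row, m_row)]
            for h_row, m_row in zip(mhi, mask)]


if __name__ == "__main__":
    frames = [
        [[0, 0, 0]],
        [[255, 0, 0]],    # something appears at the left pixel
        [[255, 255, 0]],  # then at the middle pixel
    ]
    mhi = [[0, 0, 0]]
    for prev_frame, curr_frame in zip(frames, frames[1:]):
        mhi = update_mhi(mhi, motion_mask(prev_frame, curr_frame, 30), tau=3)
    print(mhi)
```

The resulting image encodes both where and how recently motion occurred, which is what makes its shape usable as an action descriptor.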

Paper Nr: 96
Title:

CAMERA LOCALIZATION USING INCOMPLETE CHESSBOARD PATTERN

Authors:

Marek Solony, Pavel Zak, Vitezslav Beran and Michal Spanel

Abstract: This paper introduces an approach for real-time camera localization by capturing the plane of a chessboard pattern. This task has already been solved by several different approaches, but we present a novel method of chessboard reconstruction from its incomplete image, which enables successful camera localization even if the captured chessboard plane is partially covered by an unknown object. During the processing of the video sequence, the camera position and orientation are tracked with a Kalman filter, which also enables correct localization in close-up views of the pattern.
Download

Paper Nr: 107
Title:

IMPROVING GEOMETRIC HASHING BY MEANS OF FEATURE DESCRIPTORS

Authors:

Federico Tombari and Luigi Di Stefano

Abstract: Geometric Hashing is a well-known technique for object recognition. This paper proposes a novel method aimed at improving the performance of Geometric Hashing in terms of robustness toward occlusion and clutter. To this purpose, it employs feature descriptors to notably decrease the amount of false positives that generally arise under these conditions. An additional advantage of the proposed technique with respect to the original method is the reduction of the computation requirements, which becomes significant with increasing number of features.
Download

Paper Nr: 115
Title:

ALGORITHMS FOR BINARIZING, ALIGNING AND RECOGNITION OF FINGERPRINTS

Authors:

A. Pillai, S. Mil'shtein and M. Baier

Abstract: Minutiae based algorithms are widely used today for fingerprint authentication. In this study, we report the use of the Fast Fourier Transform (FFT) as a base principle for our recognition method, and have also developed image normalization methods. We also developed a novel method to align fingerprints to a common reference orientation based on the Fourier Mellin Transform. Two methods for image recognition are described. The first method uses image subtraction techniques in conjunction with a thresholding scheme. The second method, which is currently in development, utilizes multiple neural networks running in parallel. This technique is expected to be able to run image comparisons on large databases in real time through the use of modern parallel processing technology. In this study we analyzed 720 fingerprints generated by wet-ink, flat digital scanners, and by a novel touchless fingerprinting scanner. For the image subtraction method comparing high quality fingerprints (prints taken in a touchless way), the rate of success is 97%. For poorer quality prints (those taken with wet ink), the rate of success dropped to 93%. Recognition statistics are not currently available for the neural network based image recognition method, as it is still in development.
Download

Paper Nr: 134
Title:

TEXT DETECTION IN STILL IMAGES AND CCTV VIDEOS USING LOCAL ENERGY BASED SHAPE HISTOGRAM FEATURES (LESH)

Authors:

M. Fraz and M. S. Sarfraz

Abstract: Text data present in images and videos provide highly useful information for understanding image content, automatic annotation, indexing, and structuring of images and videos. In this paper, we present a method for detection and extraction of text from natural scene still images and CCTV videos by using local energy based shape histogram features. A set of 128 local energy based shape histogram features (LESH) is capable of enhancing various types of text information in grey level natural scene images. We make use of a simple classification strategy to enhance detection efficiency with less execution time. The major aim of deploying a simple classification strategy is to make this system capable of working not only for images but for real time videos as well. The paper also presents a novel text dataset that is compiled to train and test the detection system. The database contains more than 4200 text and non text samples of different orientation to support research in text related applications.

Paper Nr: 139
Title:

THE EXTENDED BOYER-MOORE-HORSPOOL ALGORITHM FOR LOCALITY-SENSITIVE PSEUDO-CODE

Authors:

Kengo Terasawa, Toshio Kawashima and Yuzuru Tanaka

Abstract: The Boyer-Moore-Horspool (BMH) algorithm is known as a very efficient algorithm that finds a place where a certain string specified by the user appears within a longer text string. In this study, we propose the Extended Boyer-Moore-Horspool algorithm that can retrieve a pattern in a sequence of real vectors, rather than in a sequence of characters. We adapted the BMH algorithm to sequences of real vectors by transforming the vectors into a pseudo-code expression that consists of multiple integers and by introducing a novel binary relation called ‘semiequivalent.’ We confirmed the practical utility of our algorithm by applying it to the string matching problem of the images from “Minutes of the Imperial Diet,” on which optical character recognition does not work well.
Download
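
As an illustration for the entry above (Paper Nr: 139), the following is a minimal sketch of the classic Boyer-Moore-Horspool search on ordinary strings. It is not the authors' extended algorithm: the pseudo-code transformation of real vectors and the 'semiequivalent' relation are not specified in the abstract and are not reproduced here. The function name `bmh_search` is chosen for this example only.

```python
def bmh_search(text, pattern):
    """Return the index of the first occurrence of pattern in text, or -1."""
    m, n = len(pattern), len(text)
    if m == 0:
        return 0
    if m > n:
        return -1

    # Shift table: for every symbol of the pattern except the last one,
    # the distance from its last occurrence to the end of the pattern.
    shift = {}
    for i in range(m - 1):
        shift[pattern[i]] = m - 1 - i

    pos = 0
    while pos <= n - m:
        # Compare the current window with the pattern from right to left.
        j = m - 1
        while j >= 0 and text[pos + j] == pattern[j]:
            j -= 1
        if j < 0:
            return pos
        # Slide by the shift of the text symbol aligned with the pattern's
        # last position; symbols absent from the table allow a full jump.
        pos += shift.get(text[pos + m - 1], m)
    return -1
```

The exact equality test `text[pos + j] == pattern[j]` is the step that an extension to real vectors would presumably have to replace with a tolerant comparison.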

Paper Nr: 179
Title:

LARGE-SCALE-INVARIANT TEXTURE RECOGNITION

Authors:

Muhammad Rushdi and Jeffrey Ho

Abstract: This paper addresses the problem of texture recognition across large scale variations. Most of the existing methods for texture recognition handle only small-scale variations in test images. We propose using microscopic-scale textures to classify texture images at any coarser scale without prior knowledge of the relative scale. In particular, given a test camera image, we compute the average error of approximating the test texture with patches of the microscopic texture for a given category and scaling factor. Recognition is made by selecting the category with the minimum average error over all categories and scaling factors. Experiments on camera and low-magnification microscopic images show the validity of the proposed method.
Download

Area 3 - Motion, Tracking and Stereo Vision

Full Papers
Paper Nr: 13
Title:

ROBUST AND UNBIASED FOREGROUND / BACKGROUND ENERGY FOR MULTI-VIEW STEREO

Authors:

Zhihu Chen and Kwan-Yee K. Wong

Abstract: This paper revisits the graph-cuts based approach for solving the multi-view stereo problem, and proposes a novel foreground / background energy which is shown to be unbiased and robust against noisy depth maps. Unlike most existing works, which focus on deriving a robust photo-consistency energy, this paper aims at deriving a robust and unbiased foreground / background energy. By introducing a novel data-dependent foreground / background energy, we show that it is possible to recover the object surface from noisy depth maps even in the absence of the photo-consistency energy. This demonstrates that the foreground / background energy is as important as the photo-consistency energy in graph-cuts based methods. Experiments on real data sequences further show that high quality reconstructions can be achieved using our proposed foreground / background energy with a very simple photo-consistency energy.
Download

Paper Nr: 20
Title:

REAL-TIME 3D MODELING OF VEHICLES IN LOW-COST MONOCAMERA SYSTEMS

Authors:

M. Nieto, L. Unzueta, A. Cortés, J. Barandiaran, O. Otaegui and P. Sánchez

Abstract: A new method for 3D vehicle modeling in low-cost monocamera surveillance systems is introduced in this paper. The proposed algorithm aims to resolve the projective ambiguity of 2D image observations by means of the integration of temporal information and model priors within a Markov Chain Monte Carlo (MCMC) method. The method is specially designed to work in challenging scenarios, with noisy and blurred 2D observations, where traditional edge-fitting or feature-based methods fail. Tests have shown excellent estimation results for traffic-flow video surveillance applications, that can be used to classify vehicles according to their length, width and height.
Download

Paper Nr: 22
Title:

BIOINFORMATICS INSPIRED ALGORITHM FOR STEREO CORRESPONDENCE

Authors:

Romain Dieny, Jerome Thevenon, Jesus Martinez-del-Rincon and Jean-Christophe Nebel

Abstract: In this paper, we exploit the analogy between protein sequence alignment and image pair correspondence to design a bioinformatics-inspired framework for stereo matching based on dynamic programming. This approach also led to the creation of a meaningfulness graph, which helps to predict matching validity according to image overlap and pixel similarity. Finally, we propose a procedure to estimate all matching parameters automatically. This work is evaluated qualitatively and quantitatively using a standard benchmarking dataset and by conducting stereo matching experiments between images captured at different resolutions. Results confirm the validity of the computer vision/bioinformatics analogy to develop a versatile and accurate low complexity stereo matching algorithm.
Download
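
To make the sequence-alignment analogy in the entry above (Paper Nr: 22) concrete, here is a minimal sketch of a global alignment by dynamic programming applied to two image scanlines. It is an illustration under stated assumptions, not the authors' framework: the cost (absolute intensity difference), the fixed `gap_penalty` standing in for occluded pixels, and the name `align_scanlines` are all choices made for this example.

```python
def align_scanlines(left, right, gap_penalty=10.0):
    """Align two scanlines of intensities; return, for each left pixel,
    the index of its matched right pixel, or None if it is left unmatched."""
    n, m = len(left), len(right)

    # cost[i][j]: minimal cost of aligning left[:i] with right[:j].
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i * gap_penalty
    for j in range(1, m + 1):
        cost[0][j] = j * gap_penalty

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            match = cost[i - 1][j - 1] + abs(left[i - 1] - right[j - 1])
            skip_left = cost[i - 1][j] + gap_penalty    # left pixel unmatched
            skip_right = cost[i][j - 1] + gap_penalty   # right pixel unmatched
            cost[i][j] = min(match, skip_left, skip_right)

    # Trace back from the bottom-right corner to recover the matches.
    matches = [None] * n
    i, j = n, m
    while i > 0 and j > 0:
        if cost[i][j] == cost[i - 1][j - 1] + abs(left[i - 1] - right[j - 1]):
            matches[i - 1] = j - 1
            i, j = i - 1, j - 1
        elif cost[i][j] == cost[i - 1][j] + gap_penalty:
            i -= 1
        else:
            j -= 1
    return matches
```

For a matched left pixel at index `i` with partner `j`, the difference `i - j` plays the role of the disparity in this toy setting.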

Paper Nr: 39
Title:

MULTI-CAMERA PEOPLE TRACKING WITH HIERARCHICAL LIKELIHOOD GRIDS

Authors:

Lili Chen, Giorgio Panin and Alois Knoll

Abstract: In this paper, we present a grid-based tracking by detection methodology, applied to 3D people tracking for multi-camera video surveillance. In particular, frame-by-frame detection is performed by means of hierarchical likelihood grids, using edge matching through the oriented distance transform on each camera view and a simple person model, followed by likelihood grids clustering in state-space. Subsequently, the tracking module performs a global nearest neighbor data association, in order to initiate, maintain and terminate tracks automatically. The proposed system can easily include additional features, such as color or background subtraction; it can be scaled to more camera views, and it can be used to track other items as well. We demonstrate it through experiments in indoor sequences, using a calibrated multi-camera setup.
Download
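
The global nearest neighbor data association mentioned in the entry above (Paper Nr: 39) can be sketched as follows. This is a brute-force illustration for a handful of targets, assuming Euclidean distance between predicted track positions and detections, at most as many tracks as detections, and a hypothetical `gate` distance beyond which a pairing is rejected. The abstract does not describe the authors' cost function or solver, and a practical system would likely use an assignment algorithm rather than enumeration.

```python
from itertools import permutations
from math import dist


def gnn_associate(tracks, detections, gate=float("inf")):
    """Assign each track to a distinct detection so that the total distance
    is minimal. Returns (pairs, unmatched_detections)."""
    if len(tracks) > len(detections):
        raise ValueError("this sketch assumes len(tracks) <= len(detections)")

    best_cost, best_perm = float("inf"), ()
    # Enumerate every way of giving each track its own detection.
    for perm in permutations(range(len(detections)), len(tracks)):
        cost = sum(dist(tracks[t], detections[d]) for t, d in enumerate(perm))
        if cost < best_cost:
            best_cost, best_perm = cost, perm

    # Reject pairings that are farther apart than the gate.
    pairs = {
        t: d
        for t, d in enumerate(best_perm)
        if dist(tracks[t], detections[d]) <= gate
    }
    # Detections left over could be used to initiate new tracks.
    unmatched = [d for d in range(len(detections)) if d not in pairs.values()]
    return pairs, unmatched
```

In a tracking loop, tracks missing from `pairs` for several frames would be candidates for termination, and `unmatched` detections would be candidates for new tracks.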

Paper Nr: 110
Title:

NEW DYNAMIC ESTIMATION OF DEPTH FROM FOCUS IN ACTIVE VISION SYSTEMS - Data Acquisition, LPV Observer Design, Analysis and Test

Authors:

Tiago Gaspar and Paulo Oliveira

Abstract: In this paper, new methodologies for the estimation of the depth of a generic moving target with unknown dimensions, based upon depth from focus strategies, are proposed. A set of measurements, extracted from real time images acquired with a single pan and tilt camera, is used. These measurements are obtained resorting to the minimization of a new functional, deeply rooted in optical characteristics of the lens system, and combined with additional information extracted from images to provide estimates for the depth of the target. This integration is performed by a Linear Parameter Varying (LPV) observer, whose synthesis and analysis are also detailed. To assess the performance of the proposed system, a series of indoor experimental tests, with a real target mounted on a robotic platform, for a range of operation of up to ten meters, were carried out. A centimetric accuracy was obtained under realistic conditions.
Download

Paper Nr: 124
Title:

LARGE SCALE LOCALIZATION - For Mobile Outdoor Augmented Reality Applications

Authors:

I. M. Zendjebil, F. Ababsa, J-Y. Didier and M. Mallem

Abstract: In this paper, we present an original localization system for large scale outdoor environments which uses a markerless vision-based approach to estimate the camera pose. It relies on natural feature points extracted from images. Since this type of method is sensitive to brightness changes, occlusions and sudden motions which are likely to occur in outdoor environment, we use two more sensors to assist the vision process. In our work, we would like to demonstrate the feasibility of an assistance scheme in large scale outdoor environment. The intent is to provide a fallback system for the vision in case of failure as well as to reinitialize the vision system when needed. The complete localization system aims to be autonomous and adaptable to different situations. We present here an overview of our system, its performance and some results obtained from experiments performed in an outdoor environment under real conditions.
Download

Paper Nr: 133
Title:

REAL-TIME IMAGE BASED VISUAL SERVOING ARCHITECTURE FOR MANIPULATOR ROBOTS

Authors:

Adrian Burlacu, Copot Cosmin, Andrei Panainte, Carlos Pascal and Corneliu Lazar

Abstract: The necessity of designing flexible and versatile systems is one of the most current trends in robotic research. Including visual servoing techniques in an existing robotic system is a very challenging task. In this paper a solution for extending the capabilities of a 6 d.o.f. manipulator robot, for visual servoing system development, is presented. An image-based control architecture is designed and a real-time implementation on an ABB robot is developed. The image acquisition and processing, together with the computing of the image-based control law, were implemented in Matlab. A new type of robot driving interface that links the robot's controller with the Matlab environment is proposed. The robustness and stability of the feature point based control laws are tested in multiple experiments. Experimental results revealed very good performance for the real-time visual servoing system.
Download

Paper Nr: 135
Title:

MULTI-MODAL PERSON DETECTION AND TRACKING FROM A MOBILE ROBOT IN A CROWDED ENVIRONMENT

Authors:

A. A. Mekonnen, F. Lerasle and I. Zuriarrain

Abstract: This paper addresses multi-modal person detection and tracking using a 2D SICK Laser Range Finder and a visual camera from a mobile robot in a crowded and cluttered environment. A sequential approach is proposed in which the laser data is segmented to filter human-leg-like structures and generate person hypotheses, which are further refined by a state-of-the-art parts-based visual person detector for final detection. Based on this detection routine, a Markov Chain Monte Carlo (MCMC) particle filtering strategy is utilized to track multiple persons around the robot. Integration of the implemented multi-modal person detector and tracker in our robotic platform and associated experiments are presented. Results obtained from all tests carried out are reported and show that the multi-modal approach outperforms its single sensor counterparts, taking detection, subsequent use, computation time, and precision into account. The work presented here will be used to define navigational control laws for passer-by avoidance during a service robot’s person following activity.
Download

Paper Nr: 145
Title:

FAST LEARNABLE OBJECT TRACKING AND DETECTION IN HIGH-RESOLUTION OMNIDIRECTIONAL IMAGES

Authors:

David Hurych, Karel Zimmermann and Tomáš Svoboda

Abstract: This paper addresses object detection and tracking in high-resolution omnidirectional images. The foreseen application is a visual subsystem of a rescue robot equipped with an omnidirectional camera, which demands real time efficiency and robustness against changing viewpoint. Object detectors typically do not guarantee a specific frame rate. The detection time may depend heavily on scene complexity and image resolution. An adapted tracker can often help to overcome the situation where the appearance of the object is far from the training set. On the other hand, once a tracker is lost, it almost never finds the object again. We propose a combined solution where a very efficient tracker (based on sequential linear predictors) incrementally accommodates varying appearance and speeds up the whole process. We experimentally show that the performance of the combined algorithm, measured by a ratio between false positives and false negatives, outperforms both individual algorithms. The tracker allows the expensive detector to be run only sparsely, enabling the combined solution to run in real-time on 12 MPx images from a high resolution omnidirectional camera (Ladybug3).
Download

Paper Nr: 146
Title:

PARTICLE SMOOTHING FOR SOLVING AMBIGUITY PROBLEMS IN ONE-SHOT STRUCTURED LIGHT SYSTEMS

Authors:

F. van der Heijden, F. F. Berendsen, L. J. Spreeuwers and E. Schippers

Abstract: One-shot structured light systems for 3D depth reconstruction often use a periodic illumination pattern. Finding corresponding points in the image and projector plane, needed for a triangulation, boils down to phase estimation. The 2πN ambiguities in the phase cause ambiguities in the reconstructed depth. This paper solves these ambiguities by constraining the solution space to scenes that only contain objects with flat surfaces, i.e. polyhedrons. We develop a new particle filter that estimates the depth and solves the ambiguity problem. A state model is proposed for piecewise continuous signals. This state model is worked out to find the optimal proposal density of the particle filter. The approach is validated with a demonstration.
Download

Paper Nr: 159
Title:

SCALABLE OPTICAL TRACKING - A Practical Low-cost Solution for Large Virtual Environments

Authors:

Steven Maesen and Philippe Bekaert

Abstract: Navigation in large virtual reality applications is often done by unnatural input devices like keyboard, mouse, gamepad and similar devices. A more natural approach would be letting the user walk through the virtual world as if it was a physical place. This involves tracking the position and orientation of the participant over a large area. We propose a pure optical tracking system that only uses off-the-shelf components like cameras and LED ropes. The construction of the scene doesn’t require any off-line calibration or difficult positioning, which makes it easy to build and indefinitely scalable in both size and users. The proposed algorithms have been implemented and tested in a virtual and a room-sized lab set-up. The first results from our tracker are promising and can compete with many (expensive) commercial trackers.
Download

Short Papers
Paper Nr: 12
Title:

UNSUPERVISED LEARNING FOR TEMPORAL SEARCH SPACE REDUCTION IN THREE-DIMENSIONAL SCENE RECOVERY

Authors:

Tom Warsop and Sameer Singh

Abstract: Methods for three-dimensional scene recovery traverse scene spaces (typically along epipolar lines) to compute two-dimensional image feature correspondences. These methods ignore potentially useful temporal information presented by previously processed frames, which can be used to decrease search space traversal. In this work, we present a general framework which models relationships between image information and recovered scene information specifically for the purpose of improving efficiency of three-dimensional scene recovery. We further present three different methods implementing this framework using either a naive Nearest Neighbour approach or a more sophisticated collection of associated Gaussians. Whilst all three methods provide a decrease in search space traversal, it is the Gaussian-based method which performs best, as the other methods are subject to the (demonstrated) unwanted behaviours of convergence and oscillation.
Download

Paper Nr: 34
Title:

HAND GESTURE RECOGNITION THROUGH ON-LINE SKELETONIZATION - Application of Continuous Skeleton to Real-time Shape Analysis

Authors:

Alexey Kurakin and Leonid Mestetskiy

Abstract: A new method for palm shape analysis and hand gesture recognition with the help of continuous skeletons is presented in the paper. The continuous skeleton makes it possible to develop fast and simple methods for palm shape analysis and to measure many of its features. In particular, it is possible to develop efficient methods for segmentation and analysis of the object's topological structure, for measuring the relative location of object parts, and for measuring the width of the object at an arbitrary place. Applied to palm shape analysis, the skeleton provides a way to determine the number of visible fingers, estimate their thickness and location, and perform efficient palm shape comparison with a sample. Moreover, the proposed approach allows measuring all the mentioned features regardless of palm orientation in the frame. Efficient algorithms for skeleton construction allow shape analysis to be performed at high speed in real-time applications.
Download

Paper Nr: 40
Title:

DETECTING AND TRACKING PEOPLE IN MOTION - A Hybrid Approach Combining 3D Reconstruction and 2D Description

Authors:

Peter Holzer, Chunming Li and Axel Pinz

Abstract: We analyze the most difficult case of visual surveillance, when people in motion are observed by a moving camera. Our solution to this problem is a hybrid system that combines the online 3D reconstruction of stationary background structure, camera trajectory, and moving foreground objects with more established techniques in the 2D domain. Once this 3D part has succeeded in focusing the attention on a particular moving foreground object, we continue in the 2D image domain using a state-of-the-art shape-based person detector and meanshift-based object tracking. Our results show various benefits of this hybrid approach beyond improved detection rate and reduced false alarms. In particular, each individual algorithmic component can benefit from the results of the other components, by gathering a richer foreground description, improved self-diagnosis capabilities, and by an explicit use of the available 3D information.
Download

Paper Nr: 42
Title:

ROBUST MOBILE OBJECT TRACKING BASED ON MULTIPLE FEATURE SIMILARITY AND TRAJECTORY FILTERING

Authors:

Duc Phu Chau, François Bremond, Monique Thonnat and Etienne Corvee

Abstract: This paper presents a new algorithm to track mobile objects in different scene conditions. The main idea of the proposed tracker includes estimation, multi-feature similarity measures and trajectory filtering. A feature set (distance, area, shape ratio, color histogram) is defined for each tracked object to search for the best matching object. Its best matching object and its state estimated by the Kalman filter are combined to update position and size of the tracked object. However, the mobile object trajectories are usually fragmented because of occlusions and misdetections. Therefore, we also propose a trajectory filter, named global tracker, that aims at removing the noisy trajectories and fusing the fragmented trajectories belonging to the same mobile object. The method has been tested with five videos of different scene conditions. Three of them are provided by the ETISEO benchmarking project (http://www-sop.inria.fr/orion/ETISEO), in which the proposed tracker performance has been compared with seven other tracking algorithms. The advantages of our approach over the existing state of the art ones are: (i) no prior knowledge information is required (e.g. no calibration and no contextual models are needed), (ii) the tracker is more reliable by combining multiple feature similarities, (iii) the tracker can perform in different scene conditions: single/several mobile objects, weak/strong illumination, indoor/outdoor scenes, (iv) a trajectory filtering is defined and applied to improve the tracker performance, (v) the tracker performance outperforms many algorithms of the state of the art.
Download
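
The entry above (Paper Nr: 42) relies on a Kalman filter for state estimation. As a reminder of the predict/update cycle, here is a minimal scalar sketch that assumes a random-walk state model for a single coordinate. The noise variances, the initial state and the name `kalman_1d` are assumptions of this example; the abstract does not give the authors' state model, which also covers object size.

```python
def kalman_1d(measurements, process_var=1e-2, meas_var=1.0, x0=0.0, p0=1.0):
    """Filter a sequence of scalar measurements; return the state estimates."""
    x, p = x0, p0          # state estimate and its variance
    estimates = []
    for z in measurements:
        # Predict: the state is assumed unchanged, uncertainty grows.
        p = p + process_var
        # Update: blend prediction and measurement using the Kalman gain.
        k = p / (p + meas_var)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates
```

In a tracker of this kind, the predicted value would typically be compared or combined with the position of the best matching detection before the update step.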

Paper Nr: 63
Title:

OPTIMIZATION OF BACKGROUND SUBTRACTION PARAMETERS USING BIG BANG OPTIMIZATION

Authors:

Jai Prakash, Shivesh Bajpai and Subrata Bhattacharya

Abstract: Background Subtraction is a widely used method for Motion Detection. The quality of the output, however, depends largely on proper initialization of the background subtraction parameters. We use the Big Bang Optimization technique to optimize these parameters in order to minimize the errors. The Big Bang Optimization has also been compared to Particle Swarm Optimization. It is found that the former reaches a lower error level and does so much more quickly than the latter.

Paper Nr: 69
Title:

FAST REAL-TIME SEGMENTATION AND TRACKING OF MULTIPLE SUBJECTS BY TIME-OF-FLIGHT CAMERA - A New Approach for Real-time Multimedia Applications with 3D Camera Sensor

Authors:

Piercarlo Dondi and Luca Lombardi

Abstract: Time-of-Flight cameras are a new kind of sensor that uses near-infrared light to provide distance measures of an environment. In this paper we present a very fast method for real-time segmentation and tracking that exploits the peculiar characteristics of these devices. The foreground segmentation is achieved by dynamic thresholding and region growing: an appropriate correction based on flexible intensity thresholding and mathematical morphology is used to partially compensate for one of the most common problems of TOF cameras, the noise generated by sunlight. By using a Kalman filter for tracking the retrieved objects, the system is able to correctly handle occlusions and to follow multiple objects placed at different distances. The proposed system is our basic step for complex multimedia applications, such as augmented reality. An example of mixed reality that includes the integration of color information supplied by a webcam is shown in the experimental results.
Download

Paper Nr: 93
Title:

MEAN SHIFT OBJECT TRACKING USING A 4D KERNEL AND LINEAR PREDICTION

Authors:

Katharina Quast, Christof Kobylko and André Kaup

Abstract: A new mean shift tracker which tracks not only the position but also the size and orientation of an object is presented. By using a four-dimensional kernel, the mean shift iterations are performed in a four-dimensional search space consisting of the image coordinates, a scale and an orientation dimension. Thus, the enhanced mean shift tracker tracks the position, size and orientation of an object simultaneously. To increase the tracking performance by using the information about the position, size and orientation of the object in the previous frames, a linear prediction is also integrated into the 4D kernel tracker. The tracking performance is further improved by considering the gradient norm as an additional object feature.
Download

Paper Nr: 98
Title:

IDENTIFICATION AND RECONSTRUCTION OF COMPLETE GAIT CYCLES FOR PERSON IDENTIFICATION IN CROWDED SCENES

Authors:

Martin Hofmann, Daniel Wolf and Gerhard Rigoll

Abstract: This paper addresses the problem of gait recognition in the presence of occlusions. Recognition of people using their gait has been an active research area and many successful algorithms have been presented. However, to this point none of the methods addresses the problem of occlusion. Most of the current algorithms need a full gait cycle for recognition. In this paper we present a scheme for reconstruction of full gait cycles, which can be used as a preprocessing step for any state-of-the-art gait recognition method. We test this on the TUM-IITKGP gait recognition database and show a significant performance gain in the case of occlusions.
Download

Paper Nr: 101
Title:

COMBINATION OF CORRELATION MEASURES FOR DENSE STEREO MATCHING

Authors:

Sylvie Chambon and Alain Crouzil

Abstract: In the context of dense stereo matching of pixels, we study the combination of different correlation measures. Building on previous work on correlation measures, we use the most significant measures from five families, based on: cross-correlation, classic statistics, image derivatives, nonparametric statistics and robust statistics. More precisely, this study validates the possible improvement of stereo-matching by combining complementary correlation measures and it also highlights the two measures that can be combined in order to take advantage of the different methods: Gradient Correlation measure (GC) and Smooth Median Absolute Deviation measure (SMAD). Finally, we introduce a fusion algorithm that combines correlation measures automatically.
Download

Paper Nr: 103
Title:

OBJECT TRACKING BASED ON PARTICLE FILTERING WITH MULTIPLE APPEARANCE MODELS

Authors:

Nicolas Widynski, Emanuel Aldea, Séverine Dubuisson and Isabelle Bloch

Abstract: In this paper, we propose a novel method to track an object whose appearance is evolving in time. The tracking procedure is performed by a particle filter algorithm in which all possible appearance models are explicitly considered using a mixture decomposition of the likelihood. Then, the component weights of this mixture are conditioned by both the state and the current observation. Moreover, the use of the current observation makes the estimation process more robust and allows handling complementary features, such as color and shape information. In the proposed approach, these estimated component weights are computed using a Support Vector Machine. Tests on a mouth tracking problem show that the multiple appearance model outperforms classical single appearance likelihood.
Download

Paper Nr: 117
Title:

REAL-TIME LOCALIZATION OF AN UNMANNED GROUND VEHICLE USING A 360 DEGREE RANGE SENSOR

Authors:

Soon-Yong Park and Sung-In Choi

Abstract: A computer vision technique for the localization of an unmanned ground vehicle (UGV) is presented. The proposed technique is based on 3D registration of a sequence of 360 degree range data and a digital surface model (DSM). 3D registration of a sequence of dense range data requires a large computation time. For real time localization, we propose projection-based registration and uniform arc length sampling (UALS) techniques. UALS reduces the number of 3D sample points while maintaining their uniformity over range data in terms of ground sample distance. The projection-based registration technique reduces the time of 3D correspondence search. Experimental results from two real navigation paths are shown to verify the performance of the proposed method.
Download

Paper Nr: 118
Title:

FACIAL POSE AND ACTION TRACKING USING SIFT

Authors:

B. H. Pawan Prasad and R. Aravind

Abstract: In this paper, a robust method to estimate the head pose and facial actions in uncalibrated monocular video sequences is described. Unlike most other methods, we do not assume knowledge of the camera parameters. The face is modelled in 3D using the Candide-3 face model. A simple graphical user interface is designed to initialize the tracking algorithm. Tracking of facial feature points is achieved using a novel SIFT-based point tracking algorithm. The head pose is estimated using the POSIT algorithm in a RANSAC framework. The animation parameter vector is computed in an optimization procedure. The proposed algorithm is tested on two standard data sets. The qualitative and quantitative analysis is similar to the analysis of competing methods reported in the literature. Experimental results validate that the proposed system accurately estimates the pose and the facial actions. The proposed system can also be used for facial expression classification and facial animation.
Download

Paper Nr: 119
Title:

HEAD DETECTION IN STEREO DATA FOR PEOPLE COUNTING AND SEGMENTATION

Authors:

Tim van Oosterhout, Sander Bakkes and Ben Kröse

Abstract: In this paper we propose a head detection method using range data from a stereo camera. The method is based on a technique that has been introduced in the domain of voxel data. For application in stereo cameras, the technique is extended (1) to be applicable to stereo data, and (2) to be robust with regard to noise and variation in environmental settings. The method consists of foreground selection, head detection, and blob separation, and, to improve results in case of misdetections, incorporates a means for people tracking. It is tested in experiments with actual stereo data, gathered from three distinct real-life scenarios. Experimental results show that the proposed method performs well in terms of both precision and recall. In addition, the method was shown to perform well in highly crowded situations. From our results, we may conclude that the proposed method provides a strong basis for head detection in applications that utilise stereo cameras.
Download

Paper Nr: 126
Title:

A FRAMEWORK FOR WEBCAM-BASED HAND REHABILITATION EXERCISES

Authors:

Rui Liu, Burkhard C. Wünsche, Christof Lutteroth and Patrice Delmas

Abstract: Applications for home-based care are rapidly increasing in importance due to spiraling health and elderly care costs. An important aspect of home-based care is exercises for rehabilitation and improving general health. In this paper we present a framework for demonstrating and monitoring hand exercises. The three main components are a 3D hand model, a high-level animation framework which facilitates the task of specifying hand exercises via skeletal animation, and a hand tracking program to monitor and evaluate users’ performance. Our hand tracking solution has no calibration stage and is easy to set up. Segmentation is performed using a perception-based colour space, and hand tracking and motion estimates are obtained using novel variations of the CAMSHIFT and contour analysis algorithms. The results indicate that the robust tracking along with the demonstration and reconstruction of hand exercises provide an effective platform for hand rehabilitation.
Download

Paper Nr: 143
Title:

ROADGUARD - Highway Control and Management System

Authors:

Salma Kammoun Jarraya, Adam Ghorbel, Ahmed Chaouachi and Mohamed Hammami

Abstract: In this paper, we propose a new approach, called RoadGuard, for a Highway Control and Management System. RoadGuard is based on counting and tracking moving vehicles robustly. Our system copes with challenges that arise in the processing steps of such applications, like shadows, ghosts and occlusions. A new algorithm is proposed to detect and remove cast shadows. The occlusion and ghost problems are resolved by the adopted tracking technique. A comparative study by quantitative evaluations shows that the proposed approach can detect vehicles robustly and accurately from highway videos recorded by a static camera which include several constraints. In fact, our system has the ability to control highways by detecting unusual events such as vehicles stopping suddenly on the road, vehicles parked in emergency zones or even illegal conduct such as leaving the road. Moreover, RoadGuard is capable of managing highways by saving information about the date and time of overloaded roads.
Download

Paper Nr: 161
Title:

VIGNETTING CORRECTION FOR PAN-TILT SURVEILLANCE CAMERAS

Authors:

Ricardo Galego, Alexandre Bernardino and José Gaspar

Abstract: It is a well-known result that the geometry of pan and tilt (perspective) cameras can be auto-calibrated using just the image information. However, applications based on panoramic background representations must also compensate for radiometric effects due to camera motion. In this paper we propose a methodology for calibrating the radiometric effects inherent in the operation of pan-tilt cameras, with applications to visual surveillance in a cube (mosaicked) visual field representation. The radiometric calibration is based on the estimation of vignetting image distortion using the pan and tilt degrees of freedom instead of color calibrating patterns. Experiments with real images show that radiometric calibration reduces the variance in the background representation, allowing for more effective event detection in background-subtraction-based algorithms.
Download

Paper Nr: 174
Title:

BIOLOGICALLY INSPIRED ROBOT NAVIGATION BY EXPLOITING OPTICAL FLOW PATTERNS

Authors:

Sotirios Ch. Diamantas, Anastasios Oikonomidis and Richard M. Crowder

Abstract: In this paper a novel biologically inspired method is presented for the robot homing problem, in which a robot returns to its home position after having explored an a priori unknown environment. The method exploits the optical flow patterns of the landmarks and, based on a training data set, infers a probability between the current snapshot and the snapshots stored in memory. Optical flow, which is not a property of landmarks like color, shape, and size but a property of the camera motion, is used for navigating a robot back to its home position. In addition, optical flow is the only information provided to the system, while parameters like the position and velocity of the robot are not known. Our method proves to be effective even when the snapshots of the landmarks have been taken from varying distances and at varying velocities.
Download

Paper Nr: 175
Title:

A NEW DEPTH-BASED FUNCTION FOR 3D HAND MOTION TRACKING

Authors:

Ouissem Ben-Henia and Saida Bouakaz

Abstract: Model-based methods for tracking an articulated hand in a video sequence generally use a cost function to compare the hand pose with a parametric three-dimensional (3D) hand model. This comparison allows the hand model parameters to be adapted, which makes it possible to reproduce the hand gestures. Many proposed cost functions exploit either silhouette or edge features. Unfortunately, these functions cannot deal with the tracking of complex hand motion. This paper presents a new depth-based function to track complex hand motion such as opening and closing the hand. Our proposed function compares 3D point clouds stemming from depth maps. Each hand point cloud is compared with several point clouds which correspond to different model poses in order to obtain the model pose that is close to that of the hand. To reduce the computational burden, we propose to compute a volume of voxels from a hand point cloud, where each voxel is characterized by its distance to that cloud. Once a model point cloud is placed inside this volume of voxels, its distance to the hand point cloud can be computed quickly. Compared with other well-known functions such as the directed Hausdorff distance (Huttenlocher et al., 1993), our proposed function is better suited to the hand tracking problem and is faster to compute.
Download
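The voxel-volume idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`build_distance_volume`, `cloud_cost`), the cubic grid, the brute-force nearest-point search, and the use of a mean over model points as the final cost are all assumptions made for illustration. The sketch only shows why a precomputed per-voxel distance turns each model-pose evaluation into simple lookups.

```python
import math
from itertools import product


def build_distance_volume(hand_points, grid_size, voxel_size=1.0):
    """Precompute, for every voxel centre, the distance to the nearest hand point.

    Brute force is used here for clarity; the grid layout is an assumption.
    """
    volume = {}
    for idx in product(range(grid_size), repeat=3):
        centre = tuple((i + 0.5) * voxel_size for i in idx)
        volume[idx] = min(math.dist(centre, p) for p in hand_points)
    return volume


def cloud_cost(model_points, volume, grid_size, voxel_size=1.0):
    """Mean looked-up distance from the model points to the hand cloud.

    Averaging is an illustrative choice, not taken from the abstract.
    """
    total = 0.0
    for p in model_points:
        # Map the point to its voxel index, clamping to the grid bounds.
        idx = tuple(min(grid_size - 1, max(0, int(c // voxel_size))) for c in p)
        total += volume[idx]
    return total / len(model_points)


if __name__ == "__main__":
    hand = [(0.5, 0.5, 0.5), (1.5, 0.5, 0.5)]
    volume = build_distance_volume(hand, grid_size=4)
    close_pose = [(0.5, 0.5, 0.5), (1.5, 0.5, 0.5)]
    far_pose = [(3.5, 3.5, 3.5)]
    print(cloud_cost(close_pose, volume, 4), cloud_cost(far_pose, volume, 4))
```

The volume is built once per hand point cloud, so comparing many candidate model poses costs one lookup per model point rather than a nearest-neighbour search per point.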

Paper Nr: 32
Title:

COLOR PHOTOMETRIC STEREO WITH SHAPE ENHANCING FILTERING

Authors:

Osamu Ikeda

Abstract: In shape reconstruction from images, inter-reflections between object parts are unavoidable, and the objects of interest may often have colors on them. The former may blur images even for objects with diffuse reflection characteristics, resulting in a blurred reconstructed shape. In this paper, first, shape reconstruction from surface tilts using the Fourier transform is extended to color images. Then, a space-invariant filter is introduced in the spatial frequency domain to compensate for the blurred spectrum. In its implementation, the images are modified in gamma and the two parameters of the spectral enhancing filter are adjusted so that the depths of the resulting shape agree with the measured ones obtained with stereopsis. The method is examined in experiments.

Paper Nr: 55
Title:

STEREO VISION HEAD VERGENCE USING GPU CEPSTRAL FILTERING

Authors:

Luis Almeida, Paulo Menezes and Jorge Dias

Abstract: Vergence ability is an important visual behavior observed in living creatures when they use vision to interact with the environment. The notion of an active observer is equally useful for robotic vision systems in tasks like object tracking, fixation and 3D environment structure recovery. Humanoid robots are a potential playground for such behaviors. This paper describes the implementation of a real-time binocular vergence behavior using cepstral filtering to estimate stereo disparities. By implementing the cepstral filter on a graphics processing unit (GPU) using Compute Unified Device Architecture (CUDA), we demonstrate that robust parallel algorithms that used to require dedicated hardware are now available on common computers. The GPU implementation of the cepstral filtering algorithm runs more than sixteen times faster than on a current CPU. The overall system is implemented in the binocular vision system IMPEP (Integrated Multimodal Perception Experimental Platform) to illustrate the system performance experimentally.
Download

Paper Nr: 81
Title:

GENERATION OF FACIAL IMAGE SAMPLES FOR BOOSTING THE PERFORMANCE OF FACE RECOGNITION SYSTEMS

Authors:

Zhibo Ni and C. H. Leung

Abstract: We tackle the problem of insufficient training samples, which often leads to degraded performance in face recognition systems. First, we propose an efficient method for matching two facial images that does not require 3D information. We then apply the proposed face matching algorithm to morph a source image into a target image, thereby generating a large number of facial images with expressions or lighting conditions in between those of the source and target images. These generated images are used to greatly expand the set of training samples in a face recognition system. Experiments show that by incorporating this large number of generated facial images in the training process, the recognition rate for test samples is boosted by a large margin.
Download

Paper Nr: 99
Title:

3D OPTICAL FLOW FROM DOPPLER AND WINDPROFILER RADAR DATA

Authors:

Yong Zhang, John L. Barron and Robert E. Mercer

Abstract: We describe an application that combines velocity data from a Windprofiler radar and a NEXRADII Doppler radar to compute 3D optical flow of moving severe weather. Windprofiler data improves the recovery of the velocity component in the upwards direction in the optical flow, where Windprofiler data is believed to be more accurate. We demonstrate this quantitatively using synthetic radar data and qualitatively using real radar data from Detroit NCDC Doppler and Harrow Windprofiler radars.
Download

Paper Nr: 111
Title:

LINE SEGMENT BASED STRUCTURE AND MOTION FROM TWO VIEWS - A Practical Issue

Authors:

Saleh Mosaddegh, David Fofi and Pascal Vasseur

Abstract: We present an efficient measure of overlap between two co-linear segments which considerably decreases the overall computational time of an existing segment-based motion estimation and reconstruction algorithm from the literature. We also discuss the special cases where sparse sampling of the motion space for initialization of the algorithm does not result in a good solution, and suggest using dense sampling instead to overcome the problem. Finally, we demonstrate our work on two real data sets.
Download

Paper Nr: 132
Title:

REAL TIME LOCALIZATION, TRACKING AND RECOGNITION OF VEHICLE LICENSE PLATE

Authors:

A. Shahzad, M. Fraz, M. A. Elahi and M. S. Sarfraz

Abstract: Real-time license plate localization, tracking and recognition are reasonably tackled problems with many successful solutions. Although most of these solutions are plausibly fast and efficient, almost all existing real-time systems either deal with only a single problem at a time (detection, tracking, or recognition) or are not efficient enough to work well on low-quality surveillance videos. The aim of this paper is to address all three tasks for low-quality videos in real time. A novel approach is proposed for efficient localization of license plates in video sequences, and slightly adapted existing techniques are applied for tracking and recognition. The implemented system is intelligent enough to automatically adjust to varying camera distances and diverse lighting conditions.

Paper Nr: 140
Title:

SPARSE WINDOW LOCAL STEREO MATCHING

Authors:

Sanja Damjanović, Luuk J. Spreeuwers and Ferdinand van der Heijden

Abstract: We propose a new local algorithm for dense stereo matching of gray images. This algorithm is a hybrid of the pixel-based and the window-based matching approaches; it uses a subset of pixels from a large window for matching. Our algorithm does not suffer from the common pitfalls of window-based matching. It successfully recovers disparities of thin objects and preserves disparity discontinuities. The only criterion for pixel selection is the intensity difference with the central pixel. The subset contains only pixels which lie within a fixed threshold of the central gray value. As a consequence of the fixed threshold, low-textured windows will use a larger percentage of pixels for matching, while textured windows can use just a few. In this manner, the approach also reduces memory consumption. The cost is calculated as the sum of squared differences normalized by the number of pixels used. The algorithm's performance is demonstrated on the test images from the Middlebury stereo evaluation framework.
Download
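The matching cost described in the abstract above can be sketched as follows. This is a hedged illustration, not the authors' code: the function names (`sparse_window_cost`, `best_disparity`), the choice of the left image as the reference for pixel selection, the winner-takes-all disparity choice, and the handling of image borders are assumptions. Images are plain nested lists of gray values.

```python
def sparse_window_cost(left, right, row, col, disparity, half_window, threshold):
    """SSD over the window pixels whose left-image intensity lies within
    `threshold` of the central pixel, normalized by the number of pixels used."""
    centre = left[row][col]
    height, width = len(left), len(left[0])
    total, used = 0.0, 0
    for r in range(row - half_window, row + half_window + 1):
        for c in range(col - half_window, col + half_window + 1):
            rc = c - disparity  # matching column in the right image
            if not (0 <= r < height and 0 <= c < width and 0 <= rc < width):
                continue  # skip pixels that fall outside either image
            if abs(left[r][c] - centre) > threshold:
                continue  # pixel not selected for the sparse window
            diff = left[r][c] - right[r][rc]
            total += diff * diff
            used += 1
    return total / used if used else float("inf")


def best_disparity(left, right, row, col, max_disparity, half_window=1, threshold=10):
    """Winner-takes-all: pick the disparity with the lowest sparse-window cost."""
    costs = [
        sparse_window_cost(left, right, row, col, d, half_window, threshold)
        for d in range(max_disparity + 1)
    ]
    return min(range(len(costs)), key=costs.__getitem__)


if __name__ == "__main__":
    # Synthetic pair: the left image is the right image shifted by 2 columns.
    right = [[(c * 37 + r * 11) % 256 for c in range(8)] for r in range(5)]
    left = [[right[r][c - 2] if c >= 2 else 0 for c in range(8)] for r in range(5)]
    print(best_disparity(left, right, row=2, col=4, max_disparity=4))
```

Because the central pixel always passes its own threshold test, at least one pixel contributes whenever the central pixel has a valid match position, so the normalization never divides by zero in that case.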

Paper Nr: 149
Title:

GLOBAL MULTI-VIEW TRACKING UTILIZING COLOR AND TOF CAMERAS BY COMBINING VOLUMETRIC AND PHOTOMETRIC MEASURES

Authors:

Benjamin Langmann, Klaus Hartmann and Otmar Loffeld

Abstract: This paper discusses a tracking approach that combines photometric tracking with volumetric tracking and is designed to utilize multiple cameras with optional depth information, e.g., ToF cameras, structured light cameras and stereo or multi-camera setups. It is able to work with any number and type of cameras. In order to achieve this objective, the tracked object is modeled in 3D with an ellipsoid. To make use of the depth information, the density of the observed space is modeled with a set of Gaussian kernels for each line of sight. A proposed target configuration is then evaluated by projecting each observed color image onto the ellipsoid and comparing this projection to the expected appearance. Additionally, the density of the space occupied by the ellipsoid is estimated and compared to the expected density. It is demonstrated that by utilizing the depth information in this way, ambiguities due to color similarities can be overcome reliably.

Paper Nr: 155
Title:

BALL ROTATION DETECTION BASED ON ARBITRARY FEATURES

Authors:

Alexander Szép

Abstract: This work presents an objective method to detect ball rotation in image sequences. We apply this method to objectively classify racket sports equipment. To this end, we observe the ball impact on a racket and compare the rotation detected prior to and after the impact. The method combines ball center tracking with surface corner tracking to calculate ball rotation. Because our method's application has real-time constraints, our rotation detection is fully automatic. The bottom line: our experimental results enable racket classifications. Athletes and sports federations are therefore our stakeholders.
Download

Paper Nr: 160
Title:

A NON-LINEAR QUANTITATIVE EVALUATION APPROACH FOR DISPARITY ESTIMATION - Pareto Dominance Applied in Stereo Vision

Authors:

Ivan Cabezas and Maria Trujillo

Abstract: Performance evaluation of vision algorithms is a necessary step during a research process. It may support inter- and intra-technique comparisons. A fair evaluation process requires a methodology. Disparity estimation evaluation involves multiple aspects. However, conventional approaches rely on the use of a single value as an indicator of comparative performance. In this paper a non-linear quantitative evaluation approach for disparity estimation is introduced. It is supported by the concepts of Pareto dominance and the Pareto optimal set. The proposed approach allows different evaluation scenarios, and offers advantages over traditional evaluation approaches. The experimental validation is conducted using ground truth data. Innovative results obtained by applying the proposed approach are presented and discussed.
Download
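The Pareto dominance and Pareto optimal set concepts named in the abstract above follow a standard definition, which the sketch below illustrates. The algorithm names, the error vectors, and the convention that lower values are better are hypothetical and chosen only for the example; the paper's actual error measures and evaluation scenarios are not reproduced here.

```python
def dominates(a, b):
    """Return True if error vector `a` Pareto-dominates `b`:
    no worse in every measure and strictly better in at least one
    (lower is assumed to be better)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_optimal_set(scores):
    """Return the names of the entries that no other entry dominates."""
    return {
        name
        for name, errors in scores.items()
        if not any(
            dominates(other_errors, errors)
            for other, other_errors in scores.items()
            if other != name
        )
    }


if __name__ == "__main__":
    # Hypothetical algorithms with two hypothetical error measures each.
    scores = {"A": (1.0, 5.0), "B": (2.0, 3.0), "C": (3.0, 6.0)}
    print(sorted(pareto_optimal_set(scores)))
```

In this example A and B are incomparable (each is better in one measure), while C is worse than both in every measure, so only A and B remain in the Pareto optimal set. This is what a single-value ranking cannot express.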

Paper Nr: 168
Title:

PATHOLOGY CLASSIFICATION OF GAIT HUMAN GESTURES

Authors:

Fabio Martínez, Juan Carlos León and Eduardo Romero

Abstract: Gait patterns may be distorted in a large set of pathologies. In clinical practice, gait is studied using a set of measurements which allows identification of pathological disorders, thereby facilitating diagnosis, treatment and follow-up. These measurements are obtained from a set of markers, carefully placed at specific anatomical locations. This conventional procedure is invasive and alters the natural movement gestures, a great drawback for diagnosis and management of the early disease stages, when accuracy is crucial. Instead, markerless approaches attempt to capture the very nature of the movement with practically no intervention on the movement patterns. These techniques remain limited in their clinical applications since they do not segment the human silhouette with sufficient precision. This article introduces a novel markerless strategy for classifying normal and pathological gaits, using a temporal-spatial characterization of the subject from two different views. The feature vector is constructed by associating the spatial information obtained with SURF and the temporal information from a ∑-∆ operator. The strategy was evaluated on three groups of patients (normal, musculoskeletal disorders, and Parkinson's disease), obtaining a precision and a recall of about 60%.
Download