ISBN 9781601322548

IP, Computer Vision, and Pattern Recognition: The 2013 WorldComp International Conference Proceedings


New research by international contributors on IP, Computer Vision, and Pattern Recognition

List price: $99.95

Your Price: $79.96

You Save: 20%


177 Chapters


Session - Stereo, 3D, Depth Algorithms, 3D Image Data Structures, and Applications


 

Reconfigurable Computing Architecture for Accurate Disparity Map Calculation in Real-Time Stereo Vision


Int'l Conf. IP, Comp. Vision, and Pattern Recognition | IPCV'13 |


P. Zicari, H. Lam, and A. George

NSF Center for High-Performance Reconfigurable Computing (CHREC)

Dept. of Electrical and Computer Engineering, University of Florida

Gainesville FL, USA 32611

Abstract - This paper presents a novel hardware architecture using FPGA-based reconfigurable computing (RC) for accurate calculation of dense disparity maps in real-time stereo-vision systems. Recent stereo-vision hardware solutions have proposed local-area approaches. Although parallelism can easily be exploited in local methods by replicating the window-based image operations, accuracy is limited because the disparity result is optimized by locally searching for the minimum value of a cost function. Global methods improve the quality of stereo-vision disparity maps at the expense of increased computational complexity, making real-time application not viable on conventional computing.
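The local window-based matching that such hardware solutions parallelize can be sketched in NumPy as follows. This is a generic sum-of-absolute-differences (SAD) illustration of the local approach the abstract describes, not the paper's FPGA architecture; `max_disp` and `win` are assumed parameters.

```python
import numpy as np

def sad_disparity(left, right, max_disp=16, win=3):
    """Local stereo matching: for each pixel, choose the disparity that
    minimizes the sum of absolute differences (SAD) over a small window."""
    h, w = left.shape
    half = win // 2
    disp = np.zeros((h, w), dtype=np.int32)
    for y in range(half, h - half):
        for x in range(half, w - half):
            lw = left[y - half:y + half + 1, x - half:x + half + 1]
            best_cost, best_d = np.inf, 0
            for d in range(min(max_disp, x - half) + 1):
                rw = right[y - half:y + half + 1, x - d - half:x - d + half + 1]
                cost = np.abs(lw.astype(np.int64) - rw.astype(np.int64)).sum()
                if cost < best_cost:
                    best_cost, best_d = cost, d
            disp[y, x] = best_d
    return disp
```

Because each pixel's search is independent, the nested loops map naturally onto replicated hardware units, which is the parallelism the abstract refers to; the accuracy limitation is equally visible, since each pixel minimizes its own cost in isolation.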

 

Collision Prediction from Binocular Optical Flows


F. Mori1 and N. Sugano2

1 Brain Science Institute, Tamagawa University, Tokyo, Japan

2 Faculty of Engineering, Tamagawa University, Tokyo, Japan


Abstract - In the field of motion perception, the "aperture problem" refers to the fact that the true motion direction of a point on a straight-line edge in a retinal image cannot be determined by analysis of a local area alone. Many straight-line edges are found in real environments. In this paper, we show theoretically that the aperture problem can be solved from binocular apparent optical flows by analysis of a local area alone when objects and the observer (ego) move on a plane, but that it is not solved completely when they move in an arbitrary 3D direction. The solution is applied to the prediction of collision location and collision time for objects and an observer moving on a horizontal plane, which is an important human function. A fairly precise real-time solution can be obtained with a system composed of a commercially available stereo camera and a simple robot.
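Once object and ego velocities on the plane have been recovered from the binocular flows, collision time and location follow from elementary relative kinematics. The sketch below covers only that final prediction step (the flow analysis itself is the paper's contribution and is not reproduced here); `radius` is an assumed collision threshold.

```python
import numpy as np

def predict_collision(p_obj, v_obj, p_ego, v_ego, radius=0.5):
    """Collision time/location for an object and an observer moving with
    constant velocity on a plane: find the time of closest approach of the
    relative motion and test it against a collision radius."""
    p_obj, v_obj = np.asarray(p_obj, float), np.asarray(v_obj, float)
    p_ego, v_ego = np.asarray(p_ego, float), np.asarray(v_ego, float)
    r = p_obj - p_ego          # relative position
    u = v_obj - v_ego          # relative velocity
    uu = u @ u
    if uu == 0.0:
        return None            # no relative motion
    t = -(r @ u) / uu          # time of closest approach
    if t < 0 or np.linalg.norm(r + t * u) > radius:
        return None            # separating, or passes too far away
    return t, p_obj + t * v_obj
```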

 

3D Active Shape Models Integrating Robust Edge Identification and Statistical Shape Models


Brent C. Munsell1 , Martin Styner2,3 , Heather Hazlett3 , and Song Wang4

1 Department of Mathematics and Computer Science, Claflin University, Orangeburg, SC, USA

2 Department of Computer Science, University of North Carolina, Chapel Hill, NC, USA

3 Department of Psychiatry, University of North Carolina, Chapel Hill, NC, USA

4 Department of Computer Science, University of South Carolina, Columbia, SC, USA

Abstract— Based on the Point Distribution Model (PDM), the Active Shape Model (ASM) is an iterative algorithm used to detect structures of interest in images. However, current ASM methods are sensitive to image noise, which may trap the ASM at false edges and/or lead to a structure not within the shape space defined by the PDM. Such problems are particularly serious when segmenting 3D anatomical surface structures from 3D medical images. In this paper we propose two strategies to improve the performance of 3D ASM: (a) developing a robust edge-identification algorithm to reduce the risk of detecting false edges, and (b) integrating the edge-fitting error and the statistical shape model defined by a PDM into a unified cost function. We apply the proposed ASM to the challenging tasks of detecting the left hippocampus and caudate surfaces from a subset of 3D pediatric MR images and compare its performance with a recently reported atlas-based method.
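The PDM constraint that keeps an evolving shape "legal" can be sketched as follows: project the candidate shape onto the learned modes and clamp each coefficient. This is a minimal NumPy illustration of the standard ASM shape-space step, not the authors' unified cost function; `k` is the usual three-standard-deviation bound.

```python
import numpy as np

def constrain_to_shape_space(x, mean, P, eigvals, k=3.0):
    """Project a candidate shape vector onto the PDM modes and clamp each
    mode coefficient to +/- k standard deviations, so the result stays
    inside the shape space learned from training data."""
    b = P.T @ (x - mean)          # mode coefficients of the candidate
    lim = k * np.sqrt(eigvals)    # per-mode limits from the eigenvalues
    b = np.clip(b, -lim, lim)
    return mean + P @ b           # nearest legal shape
```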

 

Robust People Tracking Using A Light Coding Depth Sensor


Xun Changqing1 , Yang Shuqiang2 , and Zhang Chunyuan1

1 College of Computer, National University of Defence Technology, ChangSha, China

2 College of Electronic Science and Engineering, National University of Defence Technology, ChangSha, China

Abstract— People tracking plays a key role in real-time automated surveillance systems. Occlusion handling is facilitated by using depth sensors; however, when persons touch, it is still hard to identify them, even in a depth image. When one person touches another, their heads usually remain distant from each other, and the shape of a head in top view is very stable. We therefore treat people tracking as multiple-head tracking in a top-view image, and present a directed semicircular-template-based multiple-head tracking method using a novel light coding depth sensor. We implement our method to run in real time on a PC. Our experiments show that the method can track heads reliably regardless of the positions and poses of the persons and collisions between them.
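A much-simplified flavour of top-view head finding can be sketched as a local-maximum search on the height map. This omits the paper's directed semicircular template and the tracking itself; `min_height` and the neighbourhood size are assumed parameters.

```python
import numpy as np

def head_candidates(height, min_height=1.2, nbhd=1):
    """Find head candidates in a top-view height map as strict local maxima
    above a minimum person height (a crude stand-in for template matching)."""
    peaks = []
    h, w = height.shape
    for y in range(nbhd, h - nbhd):
        for x in range(nbhd, w - nbhd):
            patch = height[y - nbhd:y + nbhd + 1, x - nbhd:x + nbhd + 1]
            is_strict_max = (height[y, x] == patch.max()
                             and (patch == height[y, x]).sum() == 1)
            if height[y, x] >= min_height and is_strict_max:
                peaks.append((y, x))
    return peaks
```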

 

Enhanced Pre-conditioning Algorithm for the Accurate Alignment of 3D Range Scans


Shane Transue and Min-Hyung Choi

Department of Computer Science and Engineering

University of Colorado Denver

Denver, CO, United States

Abstract – The process of accurately aligning 3D range scans to reconstruct a virtual model is a complex task in generic circumstances. Yet by exploiting the data characteristics common to many mobile 3D scanning devices, we propose a two-phase alignment solution that improves the alignment provided by the iterative closest point (ICP) algorithm. Current approaches target how the ICP algorithm aligns two range scans by modifying minimization functions, sampling functions, and point-correspondence techniques. However, while these approaches have provided subtle improvements in the alignment process, the ICP algorithm is still incapable of aligning low-resolution range scans with very little overlap. Based on our proposed algorithm, we are able to increase the accuracy of the alignment provided by the ICP algorithm by 40% on low-resolution scan pairs, and we demonstrate the versatility of this approach by accurately aligning a variety of scan pairs with small overlap regions.
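For reference, one iteration of the baseline point-to-point ICP that such pre-conditioning feeds into can be sketched as follows (brute-force nearest neighbours plus the Kabsch/SVD rigid fit). This is the textbook algorithm, not the authors' two-phase method.

```python
import numpy as np

def icp_step(src, dst):
    """One point-to-point ICP iteration: match each source point to its
    nearest destination point, then solve for the optimal rigid transform
    with the Kabsch/SVD method and apply it to the source."""
    # brute-force nearest-neighbour correspondences
    d2 = ((src[:, None, :] - dst[None, :, :]) ** 2).sum(-1)
    matched = dst[d2.argmin(axis=1)]
    # optimal rotation and translation aligning src onto matched
    cs, cm = src.mean(0), matched.mean(0)
    H = (src - cs).T @ (matched - cm)
    U, _, Vt = np.linalg.svd(H)
    D = np.eye(src.shape[1])
    D[-1, -1] = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against reflection
    R = Vt.T @ D @ U.T
    t = cm - R @ cs
    return src @ R.T + t, R, t
```

Iterating this step to convergence is the classical ICP loop; the difficulty the paper targets is that with small overlap many source points have no true counterpart, so the nearest-neighbour matches become wrong and the fit is pulled off target.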

 

Voxel-based object representation by means of edging trees


L. A. Martínez1, E. Bribiesca2, and A. Guzmán3

1 Instituto de Astronomía, Universidad Nacional Autónoma de México, México, D. F., México

2 Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas, Universidad Nacional Autónoma de México, México, D. F., México

3 Centro de Investigación en Computación, Instituto Politécnico Nacional, México, D. F., México

Abstract— A method is described for representing voxel-based objects (VBOs) by means of edging trees (EdTs). Given a VBO, an EdT is a tree which traces the borders of the object. The vertices of the EdT correspond to the vertices of the enclosing surface, where some of them have been conveniently hidden in order to obtain a 1D representation. The computed EdT is represented by a base-five digit chain-code descriptor suitably combined by means of parentheses. The EdT notation is invariant under rotation and translation; using this notation it is possible to obtain the mirror image of any VBO with ease. The EdT notation preserves the shape of

 

On Real-Time LIDAR Data Segmentation and Classification


Dmitriy Korchev1, Shinko Cheng2, Yuri Owechko1, and Kyungnam (Ken) Kim1

1 Information Systems Sciences Lab., HRL Laboratories, LLC, Malibu, CA, USA

2 Social, Google Inc., Mountain View, CA, USA

Abstract - We present algorithms for fast segmentation and classification of sparse 3D point clouds from rotating LIDAR sensors used for real-time applications such as autonomous mobile systems. Such systems must continuously process large amounts of data, with update rates as high as 10 frames per second, which makes the complexity and performance of the algorithms very critical. Our approach to the segmentation of large and sparse point clouds is efficient and accurate, which frees system resources for other, more demanding tasks such as classification. Segmentation is the emphasis of this paper, as a necessary and important first step for subsequent classification and further processing. We propose methods for segmenting sparse point clouds from rotating
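A common first stage of such pipelines, ground removal over an XY occupancy grid, can be sketched as follows. This is a generic illustration rather than the paper's specific algorithm; `cell` and `height_thresh` are assumed parameters.

```python
import numpy as np

def remove_ground(points, cell=0.5, height_thresh=0.3):
    """Grid-based ground removal: bin points into an XY grid, treat the
    lowest z in each cell as local ground, and keep only points
    sufficiently above it (obstacle candidates for clustering)."""
    ij = np.floor(points[:, :2] / cell).astype(int)
    keys = [tuple(k) for k in ij]
    ground = {}
    for k, z in zip(keys, points[:, 2]):
        ground[k] = min(ground.get(k, z), z)   # per-cell ground estimate
    mask = np.array([z - ground[k] > height_thresh
                     for k, z in zip(keys, points[:, 2])])
    return points[mask]
```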

 

Denoising Time-of-Flight Depth Maps Using Temporal Median Filter


Fang-Yu Lin1, Yi-Leh Wu1,†, Wei-Chih Hung1

1 Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taiwan

† E-mail: ywu@csie.ntust.edu.tw

ABSTRACT

Among the many types of 3D cameras, Time-of-Flight (TOF) cameras have the advantages of simplicity of use and a lower price for the general public, and they can obtain depth maps at video speed. However, TOF cameras suffer from low resolution and high random noise. In this paper, we propose methods to reduce the random noise in depth maps captured by TOF cameras. For each point in the noisy TOF depth map, we substitute the depth value with the median depth value of its corresponding points in temporally consecutive depth maps captured by the TOF cameras. The proposed methods require only the depth data captured by the TOF cameras, without any extra information such as illumination, geometric shape, or complex parameters. Experimental results suggest that the proposed temporal denoising methods can effectively reduce the noise in TOF depth maps by up to 44 percent.
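For a static scene, the per-pixel temporal median described above reduces to a median over a stack of consecutive frames; a minimal NumPy sketch:

```python
import numpy as np

def temporal_median(depth_frames):
    """Replace each pixel's depth with the median of its values across
    temporally consecutive TOF depth maps, suppressing random noise
    without any extra scene information."""
    return np.median(np.stack(depth_frames, axis=0), axis=0)
```

For moving content the corresponding points are no longer at the same pixel coordinates, which is why the abstract speaks of corresponding points rather than a fixed-pixel median; this sketch assumes a static camera and scene.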

 

View Synthesis Based on Depth Information and Graph Cuts for 3DTV


Anh Tu Tran, Koichi Harada

Graduate School of Engineering, Hiroshima University, Hiroshima, Japan

Abstract - This paper presents a novel method that synthesizes a free-viewpoint image from multiple textures and depth maps in a multi-view camera configuration. The method solves the cracks-and-holes problem due to sampling rate by performing an inverse warping to retrieve texture images; this step allows simple and accurate resampling of synthetic pixels. To enforce the spatial consistency of color and remove pixels warped incorrectly because of inaccurate depth maps, we propose several processing steps. The warped depth and warped texture images are used to classify pixels as stable, unstable, or disoccluded. The stable pixels are used to create an initial new view by weighted interpolation. To refine the new view, graph cuts are used to select the best candidates for each unstable pixel. Finally, the remaining disoccluded regions are filled by our inpainting method based on depth information and neighboring texture pixel values. Our experiments on several multi-view data sets are encouraging in both subjective and objective results.
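The stable/unstable/disoccluded classification can be illustrated on two warped views as follows. This is a simplified sketch in which validity masks mark the pixels each warp covered and `tol` is an assumed colour-agreement threshold; the paper derives the classes from warped depth as well as texture.

```python
import numpy as np

def classify_pixels(c1, c2, valid1, valid2, tol=10):
    """Classify pixels of two warped views before blending: 'stable' where
    both views agree in colour, 'unstable' where coverage or colour is in
    doubt, 'disoccluded' where neither view saw the pixel."""
    both = valid1 & valid2
    agree = np.abs(c1.astype(int) - c2.astype(int)) <= tol
    labels = np.full(c1.shape, 'disoccluded', dtype=object)
    labels[valid1 ^ valid2] = 'unstable'     # covered by only one view
    labels[both & ~agree] = 'unstable'       # views disagree in colour
    labels[both & agree] = 'stable'
    return labels
```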

 

Spatially Important Point Identification: A New Technique for Detail-Preserving Reduced-Complexity Representation of 3D Point Clouds


Rohit Sant1, Ninad Kulkarni1, Kratarth Goel2, Salil Kapur2 and Ainesh Bakshi3

1 Department of EEE&I, BITS-Pilani K.K. Birla Goa Campus, Goa, India

2 Department of CS/IS, BITS-Pilani K.K. Birla Goa Campus, Goa, India

3 Department of Chemical Engineering, BITS-Pilani K.K. Birla Goa Campus, Goa, India

Abstract - This paper describes a technique for reducing the inherent complexity of range data without discarding any essential information. Our technique implements this by searching for 'important' points in the range data and discarding intervening points, all of which may be regenerated to a good approximation by linear interpolation. The implementation uses a metric based on the 3D geometry of the scene to assign each point an 'importance' value. We define Spatially Important Points, which are obtained by comparing this importance value with a customizable template and with the importance values of its neighbours. The algorithm has been tested on various datasets and has been found to give, on average, a 78% reduction in complexity while retaining almost all points of significance, as shown by reconstructing the dataset. Results are tabulated at the end of the paper.
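The core idea, keeping only points that linear interpolation of their neighbours cannot regenerate, can be sketched on a 1D range profile. The deviation-from-midpoint metric and `thresh` here are simplified assumptions, not the paper's exact 3D importance metric or its template comparison.

```python
def important_points(scan, thresh=0.1):
    """Keep the indices of points that linear interpolation of their
    neighbours cannot regenerate: a point's 'importance' is its deviation
    from the midpoint of its two neighbours."""
    keep = [0]                            # always keep the endpoints
    for i in range(1, len(scan) - 1):
        predicted = (scan[i - 1] + scan[i + 1]) / 2.0
        if abs(scan[i] - predicted) > thresh:
            keep.append(i)                # not regenerable -> important
    keep.append(len(scan) - 1)
    return keep
```

On a ramp with a single spike, only the endpoints and the points around the spike survive; the dropped points are exactly those a linear fill-in would restore.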

 

A dynamic background subtraction method for detecting walkers using mobile stereo-camera


Masaki Kasahara1 and Hiroshi Hanaizumi1

1 Hosei University Graduate School of Computer and Information Sciences, Tokyo, Japan

Abstract - A method called "Dynamic Background Subtraction" (DBS) is proposed for detecting walkers running out into the street, using a stereo camera. The method is based on the fact that the front street scene expands from a point as the automobile moves. By analyzing these scene expansions, the current scene is precisely predicted from the previous one. The difference between the two scenes indicates walkers' movements on the street within a video frame interval. The stereo camera provides depth information both for the scene prediction and for the distance to a walker. The proposed method is characterized by its simplicity in principle and its high potential for realizing the system easily at low cost. In this paper, the principle and procedures of the method are described, and some experimental results are shown.

 

Performance Evaluation of Depth Map Upsampling on 3D Perception of Stereoscopic Images


Jong In Gil and Manbae Kim

Dept. of Computer and Communications Engineering, Kangwon National University

Chunchon, Republic of Korea

E-mail: manbae@kangwon.ac.kr

Abstract – Depth map upsampling has gained much interest following the release of TOF cameras and range sensors. Most research has focused on comparing an upsampled depth map with its original depth map; a frequently used objective measurement is PSNR, with which authors examine the performance of their methods. In the 3D stereoscopic field, the depth map plays an important role in the performance of 3D perception. The quality of the depth map is related to 3D perception, and it is therefore important to investigate the mutual relation between the upsampled depth map and 3D perception. This important subject has often been ignored by most depth map upsampling works. In this paper, we implement diverse upsampling methods and examine the relation between 3D perception and objective measurement tools such as PSNR, sharpness degree, and blur metric. Experimental results demonstrate that edge PSNR is more important to 3D perception than sharpness degree or blur metric.
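The objective measurement the abstract questions is ordinary PSNR between the reference depth map and the upsampled one; for concreteness, assuming 8-bit depth values:

```python
import numpy as np

def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between a reference depth map and
    an upsampled one; higher means numerically closer to the reference."""
    mse = np.mean((ref.astype(np.float64) - test.astype(np.float64)) ** 2)
    if mse == 0:
        return np.inf
    return 10.0 * np.log10(peak ** 2 / mse)
```

The point at issue is precisely that a higher PSNR does not by itself guarantee better 3D perception, which is why the paper compares it against sharpness and blur metrics.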

 

Relative Depth Estimation using a Rotating Camera System


Pallav Garg, Suresh Yerva and Krishnan Kutty

Centre for Research in Engineering Sciences and Technology (CREST)

KPIT Cummins Infosystems Ltd.

Pune, India

{Pallav.Garg, Suresh.Yerva and Krishnan.Kutty}@kpitcummins.com

Abstract— In the proposed method, the relative depth of the objects present in a 3D scene is calculated using 2D images captured from a camera placed on a rotating platform. Unlike a conventional stereo imaging system, the proposed method captures multiple views of a 3D scene, each view taken at a different camera position while on the rotating platform. The approach adopted calculates the disparity between the corresponding pixels present in both views to obtain the relative depth of the objects, which can be calculated for objects in the common FOV (field of view) of both captured views. By virtue of the rotating platform, the proposed system is capable of creating a depth map over a full 360-degree field of view, unlike its traditional counterpart. The approach presented is thereby also a cost-effective way to estimate depth, since only one camera is used.
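The disparity-to-relative-depth step is the same as in a conventional stereo pair: depth is inversely proportional to disparity, so even without focal length or baseline one can recover a normalized depth ordering. A minimal sketch, where `eps` is an assumed guard against zero disparity:

```python
import numpy as np

def relative_depth(disparity, eps=1e-6):
    """Relative depth from per-pixel disparity: depth ~ 1/disparity, so
    normalising the inverse to [0, 1] gives a unitless depth ordering
    (0 = nearest, 1 = farthest)."""
    inv = 1.0 / np.maximum(disparity, eps)
    inv -= inv.min()
    return inv / inv.max() if inv.max() > 0 else inv
```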

 

Session - Surveillance, Safety, Security Applications, and Related Methods and Systems


 

A Secure ID Card based Authentication System using Watermarking


Peyman Rahmati1, Thomas Tran2, and Andy Adler1

1 Dept. of Systems and Computer Eng., Carleton University, Ottawa, ON, Canada

2 School of Electrical Eng. and Computer Science, University of Ottawa, Ottawa, ON, Canada

Abstract - This paper presents a watermark-based approach to protect digital identity documents against a print-scan (PS) attack. We propose a secure ID card authentication system based on watermarking. For authentication purposes, a user/customer is asked to upload a scanned picture of a passport or ID card through the internet to fulfill a transaction online. To provide security in online ID card submission, we need to robustly encode the personal information of the ID card's holder into the card itself, and then extract the hidden information correctly in a decoder after the PS operation. The PS operation imposes several distortions on the watermark location, such as geometric, rotational, and histogram distortion, which may cause loss of information in the watermark. An online secure authentication system needs first to eliminate the distortion of the PS operation before decoding the hidden data. This study proposes five preprocessing blocks to remove the distortions of the PS operation: filtering, localization, binarization, undoing rotation, and cropping. Experimental results with 100 ID cards showed that the proposed online ID card authentication system has an average accuracy of 99% in detecting hidden information inside ID cards after the PS process. The innovations of this study are the implementation of an online watermark-based authentication system which uses a scanned
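For intuition only, the embed/extract round trip at the heart of any watermarking system can be sketched with fragile least-significant-bit coding. This toy scheme would not survive a print-scan attack; the paper's contribution is precisely the robust encoding and the five preprocessing blocks that let extraction survive PS distortions.

```python
import numpy as np

def embed_bits(image, bits):
    """Fragile LSB embedding (illustration only): write one payload bit
    into the least significant bit of each leading pixel."""
    out = image.copy().ravel()
    n = len(bits)
    out[:n] = (out[:n] & 0xFE) | np.asarray(bits)   # clear LSB, set payload
    return out.reshape(image.shape)

def extract_bits(image, n):
    """Read back the first n payload bits from the pixel LSBs."""
    return (image.ravel()[:n] & 1).tolist()
```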

 

Smoke Detection in Video Surveillance Using Optical Flow and Green's Theorem


Melih Altun and Mehmet Celenk

School of Electrical Engineering and Computer Science

Stocker Center, Ohio University

Athens, OH 45701 USA

{ma231709, celenk}@ohio.edu http://www.ohio.edu

Abstract - Finding smoke in surveillance videos can be crucial for early detection of fire emergencies. Such early detection improves damage prevention and control by enabling the authorities to take the necessary precautionary steps. This paper describes a smoke detection technique developed for videos taken in the visual band. The method makes use of optical flow and color filtering to detect smoke-covered regions and the associated smoke sources. Next, it extracts dynamic smoke features, such as average upward motion above the source and divergence around the source, via Green's theorem, to determine whether the selected region contains smoke. In turn, the extracted dynamic characteristics of the smoke pattern greatly improve the detection accuracy of the method and produce highly robust results, as demonstrated in the experiments.
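The divergence cue can be computed via Green's theorem as the outward flux of the optical-flow field through a contour around the candidate source: the line integral of F·n over the boundary equals the integrated divergence inside. A discrete sketch on a rectangular contour with unit grid spacing (image y grows downward); the contour coordinates are assumed inputs, not the paper's region-selection logic:

```python
import numpy as np

def outward_flux(u, v, x0, y0, x1, y1):
    """Outward flux of a flow field (u, v) through the rectangle
    [x0, x1] x [y0, y1]: a discrete Green's-theorem line integral.
    Positive flux indicates expanding motion, e.g. spreading smoke."""
    top    = -v[y0, x0:x1 + 1].sum()   # outward normal (0, -1)
    bottom =  v[y1, x0:x1 + 1].sum()   # outward normal (0, +1)
    left   = -u[y0:y1 + 1, x0].sum()   # outward normal (-1, 0)
    right  =  u[y0:y1 + 1, x1].sum()   # outward normal (+1, 0)
    return top + bottom + left + right
```

A radially expanding field yields a positive flux, while a uniform translation (no divergence) sums to zero, which is what lets this feature separate drifting smoke from rigidly moving objects.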

 

A Walkthrough System to Display Video Corresponding to the Viewer's Face Orientation


T. Watanabe1, C. Liu2, and S. Shibusawa2

1 Graduate School of Science and Engineering, Ibaraki University, Hitachi, Ibaraki, Japan

2 Department of Computer and Information Sciences, Ibaraki University, Hitachi, Ibaraki, Japan

Abstract - Walkthrough systems have hitherto been developed with the ability to produce a virtual display of a person’s field of view. If the actual position and orientation of a person’s face can be ascertained, then walkthrough images could be created to match this person’s face direction, thereby recreating video corresponding to the view from this person’s face direction. A possible application of this system is in store surveillance systems, where the direction in which someone is looking can be estimated to visualize what they are looking at in order to identify suspicious behavior. Therefore, we aim to produce a walkthrough system that uses image processing to acquire a person’s position and face direction, and displays video corresponding to this person’s face direction. In this system, multiple omnidirectional cameras are used to estimate the positions of people based on the bearings of moving bodies acquired by each camera. People’s face orientations are estimated from the skin regions of their faces. We have created a system that uses a person’s position and face orientation obtained in this way to provide an observer with walkthrough video corresponding to the person’s face orientation. To evaluate this system, we conducted experiments on the accuracy of human positions and face orientations and the accuracy of the output video, and we confirmed that the system can display video images corresponding to a person’s face direction.

 



Details


Format name: PDF
Encrypted: No
Sku: B000000030940
Isbn: 9781601322548
File size: 69.7 MB
Printing: Allowed
Copying: Allowed
Read aloud: Allowed