Thesis No: 401671
Title: Generation and analysis of segmentation trees for natural images
Author: EMRE AKBAŞ
Advisor: PROF. NARENDRA AHUJA
Institution: University of Illinois at Urbana-Champaign / Foreign Institution / Department of Electrical and Computer Engineering
Subject: Computer Engineering and Computer Science and Control; Electrical and Electronics Engineering
Index:
Status: Approved
Degree: PhD
Language: English
Year: 2011
Pages: 161
Abstract:
This dissertation is about extracting, as well as making use of, the structure and hierarchy present in images. We develop a new low-level, multiscale, hierarchical image segmentation algorithm designed to detect image regions regardless of their shapes, sizes, and levels of interior homogeneity. We model a region as a connected set of pixels surrounded by ramp edge discontinuities whose magnitude is large compared to the variation inside the region. Each region is associated with a scale that depends on the magnitude of the weakest part of its boundary. Traversing the range of all possible scales, we obtain all regions present in the image. Regions strictly merge as the scale increases; hence a tree is formed in which the root node corresponds to the whole image, nodes close to the root along a path are large, and their children are smaller and capture embedded details.

To evaluate the accuracy and precision of our algorithm, and to compare it with existing algorithms, we develop a new benchmark dataset for low-level image segmentation. In this benchmark, small patches of many images are hand-segmented by human subjects. We provide evaluation methods for both the boundary-based and the region-based performance of algorithms, and we show that our algorithm outperforms existing low-level segmentation algorithms on this benchmark.

Next, we investigate the segmentation-based statistics of natural images. Such statistics capture geometric and topological properties of images that cannot be obtained with pixel-, patch-, or subband-based methods. We compile segmentation statistics from a large number of images and propose a Markov random field based model for estimating them. Our estimates confirm some previously reported statistical properties of natural images and yield new ones. To demonstrate the value of these statistics, we successfully use them as priors in image classification and semantic image segmentation.

We also investigate the importance of different visual cues for describing image regions when solving the region correspondence problem. We design and develop psychophysical experiments to learn the weights of these cues by evaluating their impact on binocular fusibility in human subjects. Using a head-mounted display, we show a set of elliptical regions to one eye and slightly different versions of the same regions to the other eye, and ask the subjects whether the ellipses fuse. By systematically varying the parameters of the elliptical shapes and testing for fusion, we learn a perceptual distance function between two elliptical regions, which we evaluate on ground-truth stereo image pairs.

Finally, we propose a novel multiple instance learning (MIL) method. In MIL, in contrast to classical supervised learning, the entities to be classified are called bags, each of which contains an arbitrary number of elements called instances. We propose an additive model for bag classification that exploits the idea of searching for discriminative instances, which we call prototypes. We show that our bag classifier can be learned in a boosting framework, leading to an iterative algorithm that learns prototype-based weak learners and combines them linearly. At each iteration, we search for a new prototype that maximally discriminates between the positive and negative bags, which are themselves weighted according to how well they were discriminated in earlier iterations. Unlike previous instance-selection based MIL methods, we do not restrict the prototypes to a discrete set of training instances but allow them to take arbitrary values in the instance feature space. We also do not restrict the total number of prototypes or the number of selected instances per bag; these quantities are entirely data-driven. We show that our method outperforms state-of-the-art MIL methods on a number of benchmark datasets. We also apply the method to large-scale image classification, where the automatically selected prototypes map to visually meaningful image regions.
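The prototype-based boosting procedure summarized in the final paragraph can be made concrete with a small sketch. The Python code below is only an illustrative approximation, not the algorithm developed in the thesis: the Gaussian closest-instance bag response, the 0.5 decision threshold, the AdaBoost-style weight update, and the Nelder-Mead refinement of the prototype are all assumptions made for the example.

```python
import numpy as np
from scipy.optimize import minimize


def bag_response(bag, prototype, sigma=1.0):
    # Gaussian similarity between the prototype and the bag's closest instance
    # (the instance "selected" by this prototype); value lies in (0, 1].
    d = np.linalg.norm(np.asarray(bag) - prototype, axis=1)
    return np.exp(-(d.min() ** 2) / (2.0 * sigma ** 2))


def fit_prototype_booster(bags, labels, n_rounds=10, sigma=1.0):
    # bags: list of (n_i, d) arrays of instance features; labels: +1 / -1 per bag.
    labels = np.asarray(labels)
    n = len(bags)
    w = np.full(n, 1.0 / n)            # bag weights, updated every round
    seeds = np.vstack(bags)            # training instances, used only to seed the search
    model = []                         # list of (alpha, prototype) weak learners

    for _ in range(n_rounds):
        # Weighted exponential loss of the margin produced by a single prototype.
        def loss(p):
            r = np.array([bag_response(b, p, sigma) for b in bags])
            return float(np.sum(w * np.exp(-labels * (2.0 * r - 1.0))))

        seed = min(seeds, key=loss)                           # best training instance
        proto = minimize(loss, seed, method="Nelder-Mead").x  # refine off the training grid

        r = np.array([bag_response(b, proto, sigma) for b in bags])
        pred = np.where(r > 0.5, 1, -1)
        err = float(np.clip(np.sum(w * (pred != labels)), 1e-10, 1.0 - 1e-10))
        alpha = 0.5 * np.log((1.0 - err) / err)               # weak-learner weight
        w = w * np.exp(-alpha * labels * pred)                # emphasise poorly discriminated bags
        w = w / w.sum()
        model.append((alpha, proto))
    return model


def predict_bag(model, bag, sigma=1.0):
    # Additive (linear) combination of the prototype-based weak learners.
    score = sum(a * (1.0 if bag_response(bag, p, sigma) > 0.5 else -1.0)
                for a, p in model)
    return 1 if score > 0 else -1
```

The key design choice reflected here is that each round first picks the best existing training instance as a seed and then continues the search in the continuous feature space, so the final prototype need not coincide with any training instance, matching the claim that prototypes may take arbitrary values.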
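The nesting property of the segmentation tree described in the first paragraph (regions only merge as the scale increases, so every finer region is contained in exactly one coarser region) can also be sketched as a short tree-building routine. The RegionNode layout and the subset-based containment test below are hypothetical and only illustrate the hierarchy; they do not reproduce the thesis's boundary-driven region detection.

```python
from dataclasses import dataclass, field


@dataclass
class RegionNode:
    # One detected region: the scale at which it appears (governed by the
    # magnitude of the weakest part of its boundary) and its pixel support.
    scale: float
    pixels: frozenset                      # set of (row, col) coordinates
    children: list = field(default_factory=list)


def build_segmentation_tree(regions):
    # Sort regions from smallest to largest; because regions only merge as the
    # scale increases, each region's parent is the smallest strictly larger
    # region that contains it. The largest region (the whole image) is the root.
    regions = sorted(regions, key=lambda r: (len(r.pixels), r.scale))
    for i, child in enumerate(regions):
        for candidate in regions[i + 1:]:
            if child.pixels < candidate.pixels:      # proper containment
                candidate.children.append(child)
                break                                # smallest container is the parent
    return regions[-1]
```

In this toy setting, an image split into two regions at a fine scale plus the whole-image region at the coarsest scale yields a root node with two children, mirroring the tree structure described in the abstract.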