Ulusal Tez Merkezi (National Thesis Center)

Thesis No	Download	Thesis Record	Status
232599		Vision based sign language recognition: Modeling and recognizing isolated signs with manual and non-manual components / Video tabanlı işaret dili tanıma: El ve el dışı hareketler içeren ayrık işaretlerin modellenmesi ve tanınması Author: OYA ARAN Advisor: PROF. LALE AKARUN Location: Boğaziçi University / Institute for Graduate Studies in Science and Engineering / Department of Computer Engineering / Computer Engineering Division Subject: Computer Engineering and Computer Science and Control Index:	Approved Doctorate English 2008 169 p.

This thesis studies the problem of camera-based sign language recognition and concentrates on three subproblems: (1) markerless hand tracking, (2) multimodal fusion, and (3) recognition. For these subproblems, techniques more advanced than those presented in the literature are proposed, and comparative analyses are carried out. In sign language, the hands may occlude each other or the face, so a tracking algorithm that remains robust in such situations is needed. In this work, we propose a joint particle filter based method that tracks multiple objects robustly even during contact and occlusion. Tests showed that the proposed method is robust to contact and occlusion and performs better than existing methods. Sign language is a visual language that is based primarily on hand movements and hand shape but also uses facial expressions and head and body movements. We take this multimodal structure of signs into account and develop a belief-based recognition system using a sequential fusion method. The results show that the proposed method outperforms the other fusion methods in the literature. Another method proposed in this work concerns combining generative and discriminative models for sign recognition. To combine the generative models that are widely used in sign recognition with the classification power of discriminative models, we use Fisher kernels and propose a multi-class classification method. The experiments show that this method improves classification accuracy by bringing the strengths of generative and discriminative models together in a single model. In addition, two applications that use the methods and ideas proposed in this work were developed: a sign language tutor and an automatic sign language dictionary.

This thesis addresses the problem of vision-based sign language recognition and focuses on three main tasks, designing improved techniques that increase the performance of sign language recognition systems. We first address markerless tracking during natural, unrestricted signing in less constrained environments. We propose a joint particle filter approach for tracking multiple identical objects, in our case the two hands and the face, that is robust to fast movement, interactions, and occlusions. Our experiments show that the proposed approach tracks robustly in these challenging situations and, thanks to its fast recovery, is suitable for tracking long signing sequences. Second, we address the recognition of signs that include both manual (hand gestures) and non-manual (head/body gestures) components. We investigate multi-modal fusion techniques to model the different temporal characteristics of the two modalities and propose a two-step sequential belief-based fusion strategy. An evaluation against other state-of-the-art fusion approaches shows that our method models the two modalities better and achieves higher classification rates. Finally, we propose a strategy that combines generative and discriminative models to increase sign classification accuracy. We apply the Fisher kernel method and propose a multi-class classification strategy for gesture and sign sequences. The experimental results show that, with a suitable multi-class strategy, the classification power of discriminative models and the modeling power of generative models are effectively combined. We also present two applications, a sign language tutor and an automatic sign dictionary, developed from the ideas and methods presented in this thesis.
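
As a rough illustration of the Fisher kernel idea mentioned in the abstract, the sketch below maps variable-length feature sequences to fixed-length Fisher score vectors, which a discriminative classifier could then consume. It is not the thesis's implementation: the generative model here is a single diagonal Gaussian chosen only for brevity (an assumption, where sequence models such as HMMs would be more typical for signs), the Fisher information matrix is approximated by the identity, and the names `fisher_score` and `fisher_kernel` are hypothetical.

```python
import numpy as np

def fisher_score(seq, mean, var):
    """Gradient of the sequence log-likelihood w.r.t. the Gaussian mean.

    seq:  (T, D) per-frame features; T may differ between sequences.
    mean: (D,) mean of a diagonal Gaussian (stand-in generative model).
    var:  (D,) per-dimension variance.
    """
    seq = np.asarray(seq, dtype=float)
    # d/d(mean) of sum_t log N(x_t | mean, var) = sum_t (x_t - mean) / var
    return ((seq - mean) / var).sum(axis=0)

def fisher_kernel(seq_a, seq_b, mean, var):
    """Inner product of Fisher scores, with the Fisher information
    approximated by the identity matrix (a simplifying assumption)."""
    return float(fisher_score(seq_a, mean, var) @ fisher_score(seq_b, mean, var))
```

Because every sequence yields a score vector of the same dimension regardless of its length, the resulting kernel can be passed to a standard kernel classifier; how the binary decisions are combined into a multi-class one is a separate design choice that this sketch does not cover.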