Ulusal Tez Merkezi

Thesis No	Download	Thesis Record	Status
763833		Türk dilinde derin öğrenme ile dudak okuma / Lip reading with deep learning in Turkish language Author: HADI Advisor: PROF. DR. ÜSTÜN ÖZEN Location: ATATÜRK ÜNİVERSİTESİ / SOSYAL BİLİMLER ENSTİTÜSÜ / YÖNETİM BİLİŞİM SİSTEMLERİ ANABİLİM DALI / Yönetim Bilişim Sistemleri Bilim Dalı Subject: Bilgisayar Mühendisliği Bilimleri-Bilgisayar ve Kontrol = Computer Engineering and Computer Science and Control ; Bilim ve Teknoloji = Science and Technology Keywords: Akıllı sistemler = Intelligent systems ; Bilgisayarlı görüntüleme = Computer imaging ; Derin öğrenme = Deep learning ; Görüntü işleme-bilgisayarlı = Image processing-computer assisted ; Yapay zeka = Artificial intelligence ; İnsan bilgisayar etkileşimi dersi = Human-computer interaction course ; İnsan-bilgisayar etkileşimi = Human-computer interaction	Approved Doctoral Turkish 2022 95 pp.

Bilgisayarla görme alanındaki en önemli çalışma alanlarından biri insan hareketini tanımaktır. Son yıllarda, dudak okuma ise insan hareketini tanımanın en önemli çalışma konularından biri haline gelmiştir. Dudak okuma işlemi çeşitli uygulamalarda kullanılmaya başlamış ve ses akışı olmayan veya gürültülü ortamlarda önem kazanmıştır. Ayrıca, işitme engelli kişilere yardımcı olabilme boyutu da dudak okumayı son derece önemli hale getirmiştir. Dudak okuma işlemi alfabe, kelime ve cümle düzeyinde uygulanır. Dudak okuma alanı için farklı dillerde farklı veri setleri bulunmaktadır. Ancak Türkçe için herhangi bir veri seti bulunmadığı için bu çalışmanın veri seti yazar tarafından oluşturulmuştur. Bu çalışmanın veri setinde 20 sayı, 19 sıfat, 33 isim ve 19 fiil olmak üzere toplam 91 kelime bulunmaktadır. Çalışma kapsamında 72 kişiden video verisi toplanmıştır. Veri setinin az olması nedeniyle video veriler Camtasia uygulaması ile çoğaltılmıştır. Oluşturulan verilerin 20'si test, 25'i doğrulama ve geri kalanı eğitim seti için kullanılmış olup test veri setinde bulunan veriler daha önce kullanılmamış ve sisteme verilmemiştir. Çalışmanın modeli tasarlandıktan sonra ilk olarak sayılar veri seti üzerinde eğitilerek test edilmiş ve %56.25 başarı elde edilmiştir. Sayılar veri setinden sonra model sıfatlar veri seti üzerinde eğitilerek test edilmiş ve %75 başarı ve daha sonra isimler veri seti üzerinde eğitilerek test edilmiş ve %71.88 başarıya ulaşılmıştır. Fiiller veri setinde ise %79.69 başarı seviyesine ulaşılmıştır.

Human action recognition is one of the most important research areas in computer vision, and in recent years lip reading has become one of its most prominent topics. Lip reading is now used in a range of applications and is particularly valuable where there is no audio stream or the environment is noisy. Its potential to assist hearing-impaired people adds to its importance. Lip reading can be performed at the letter, word, or sentence level. Datasets for lip reading exist in several languages; however, because none was available for Turkish, the author created the dataset used in this study. The dataset contains 91 words in total: 20 numbers, 19 adjectives, 33 nouns, and 19 verbs. Video data were collected from 72 people. Because the dataset was small, the videos were augmented using the Camtasia application. Of the resulting data, 20 were used for testing, 25 for validation, and the rest for training; the test data were held out and had not previously been shown to the system. Once the model had been designed, it was trained and tested on each word category in turn. It achieved a success rate of 56.25% on the numbers dataset, 75% on the adjectives dataset, 71.88% on the nouns dataset, and 79.69% on the verbs dataset.