Thesis No: 201828
Title: A classification system for the problem of protein subcellular localization (Turkish title: Proteinlerin hücre içi yerleşimlerini bulmak için bir sınıflandırma sistemi)
Author: GÖKÇEN ALAY
Advisors: PROF. DR. VOLKAN ATALAY; ASST. PROF. DR. TOLGA CAN
Institution: Middle East Technical University (Orta Doğu Teknik Üniversitesi) / Graduate School of Natural and Applied Sciences / Department of Computer Engineering
Subject: Computer Engineering and Computer Science and Control
Status: Approved
Degree: Master's
Language: English
Year: 2007
Pages: 102
The focus of this study is on predicting the subcellular localization of a protein. Subcellular localization information is important for protein function annotation, which is a fundamental problem in computational biology. For this problem, a classification system is built that has two main parts: a predictor based on a feature mapping technique that extracts biologically meaningful information from protein sequences, and a client/server architecture for searching and predicting subcellular localizations. In the first part of the thesis, we describe a feature mapping technique based on frequent patterns. In this technique, frequent patterns in a protein sequence dataset are identified using a search technique based on the Apriori property, and the distribution of these patterns over a new sample is used as a feature vector for classification. The effect of a number of feature selection methods on the classification performance is investigated and the best one is applied. The method is assessed on the subcellular localization prediction problem with 4 compartments (endoplasmic reticulum (ER) targeted, cytosolic, mitochondrial, and nuclear), using the same dataset as P2SL. Our method raises the overall accuracy to 91.71%, compared with 81.96% for P2SL. In the second part of the thesis, a client/server architecture based on Simple Object Access Protocol (SOAP) technology is designed and implemented; it provides a user-friendly interface for accessing the protein subcellular localization predictions. The client is a Cytoscape plug-in that is used for functional enrichment of biological networks. Rather than using subcellular localization information for individual proteins in isolation, this plug-in lets biologists analyze a set of genes/proteins from a system-level view.
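The frequent-pattern feature mapping described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation. It assumes that patterns are contiguous substrings, that support is the number of training sequences containing a pattern, and that the feature value is the overlapping occurrence count of each pattern in a new sequence. The function names, the `min_support` and `max_len` parameters, and the toy sequences are hypothetical.

```python
from collections import Counter


def frequent_patterns(sequences, min_support, max_len):
    """Level-wise search for substrings found in >= min_support sequences.

    Assumption: patterns are contiguous substrings; the thesis may use a
    different pattern definition.
    """
    frequent = []
    prev = set()  # frequent patterns of length k - 1
    for k in range(1, max_len + 1):
        counts = Counter()
        for seq in sequences:
            seen = set()  # count each pattern at most once per sequence
            for i in range(len(seq) - k + 1):
                sub = seq[i:i + k]
                # Apriori pruning: a pattern can only be frequent if both
                # of its (k - 1)-length sub-patterns are frequent.
                if k > 1 and (sub[:-1] not in prev or sub[1:] not in prev):
                    continue
                seen.add(sub)
            counts.update(seen)
        prev = {p for p, c in counts.items() if c >= min_support}
        if not prev:
            break
        frequent.extend(sorted(prev))
    return frequent


def feature_vector(sequence, patterns):
    """Map a sequence to overlapping occurrence counts of each pattern."""
    return [
        sum(1 for i in range(len(sequence) - len(p) + 1)
            if sequence.startswith(p, i))
        for p in patterns
    ]


if __name__ == "__main__":
    # Hypothetical toy sequences, not taken from the P2SL dataset.
    training = ["MKVLA", "MKVQA", "AMKVL"]
    patterns = frequent_patterns(training, min_support=3, max_len=3)
    print(patterns)
    print(feature_vector("MKVMKV", patterns))
```

The resulting vectors could then be passed to any standard classifier, with a feature selection step applied beforehand; the abstract does not specify which classifier or selection method was ultimately used, so none is assumed here.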