Thesis No. Thesis Record Status
400724
Inferring the binding preferences of RNA-binding proteins
Author: HİLAL KAZAN
Advisor: DR. QUAID MORRIS
Location: University of Toronto / Foreign Institute / Department of Computer Science
Subject: Computer Engineering and Computer Science and Control
Index:
Approved
Doctorate
English
2012
137 pp.
Post-transcriptional regulation is carried out by RNA-binding proteins (RBPs) that bind to specific RNA molecules and control their processing, localization, stability and degradation. Experimental studies have successfully identified RNA targets associated with specific RBPs. However, because the locations of the binding sites within the targets are unknown and because RBPs recognize both sequence and structure elements in their binding sites, identification of RBP binding preferences from these data remains challenging. The unifying theme of this thesis is to identify RBP binding preferences from experimental data. First, we propose a protocol to design a complex RNA pool that represents diverse sets of sequence and structure elements to be used in an in vitro assay to efficiently measure RBP binding preferences. This design has been implemented in the RNAcompete method, and applied genome-wide to human and Drosophila RBPs. We show that RNAcompete-derived motifs are consistent with established binding preferences. We developed two computational models to learn binding preferences of RBPs from large-scale data. Our first model, RNAcontext, uses a novel representation of secondary structure to infer both sequence and structure preferences of RBPs, and is optimized for use with in vitro binding data on short RNA sequences. We show that including structure information improves the prediction accuracy significantly. Our second model, MaLaRKey, extends RNAcontext to fit motif models to sequences of arbitrary length, and to incorporate a richer set of structure features to better model in vivo RNA secondary structure. We demonstrate that MaLaRKey infers detailed binding models that accurately predict binding of full-length transcripts.