Ulusal Tez Merkezi

Tez No	İndirme	Tez Künye	Durumu
58591	Bu tezin, veri tabanı üzerinden yayınlanma izni bulunmamaktadır. Yayınlanma izni olmayan tezlerin basılı kopyalarına Üniversite kütüphaneniz aracılığıyla (TÜBESS üzerinden) erişebilirsiniz.	Pararllel rendering algorithms for distributed-memory multicomputers / Çok işlemcili dağıtık hafızalı bilgisayarlarda paralel görüntüleme algoritmaları Yazar:TAHSİN MERTEFE KURÇ Danışman: DOÇ. DR. CEVDET AYKANAT Yer Bilgisi: İhsan Doğramacı Bilkent Üniversitesi / Mühendislik ve Fen Bilimleri Enstitüsü / Bilgisayar Mühendisliği Ana Bilim Dalı Konu:Bilgisayar Mühendisliği Bilimleri-Bilgisayar ve Kontrol = Computer Engineering and Computer Science and Control Dizin:Algoritmalar = Algorithms ; Bilgisayar grafikleri = Computer graphics ; Görüntüleme = Imaging ; Paralel bilgisayarlar = Parallel computers ; Paralel işlemciler = Parallel processors	Onaylandı Doktora İngilizce 1997 196 s.

ÖZET çok İşlemcili dagitik-hafizali bilgisayarlarda paralel görüntüleme algoritmaları Tahsin Mertefe Kurç Bilgisayar ve Enformatik Mühendisliği Doktora Tez Yöneticileri: Doç. Dr. Cevdet Aykanat and Prof. Bülent Özgüç Haziran 1997 Bu tezde dağıtık hafızalı çok işlemcili bilgisayarların ışıma yönteminin toplama metodunda, poligon görüntülemede ve hacim görüntülemede kullanımı araştırılmıştır. Toplama metodunda ele alman temel konular durum- katsayı matrisinin hesaplanması ve çözüm adımının hiperküp bağlantılı çoklu bilgisayarlarda paralel olarak yapılmasıdır. Durum-katsayı matrisinin hesaplanmasında işlemciler arası veri aktarımı her işlemcideki hafızanın durum-katsayı matrisi ve ışıma metoduyla görüntülenen ortamı oluşturan ver iler arasında paylaştırılması ile azaltılmıştır. İşlemcilerin daha verimli kullanılabilmesi için dinamik paylaştırma yöntemi uygulanmıştır. Çözüm aşamasında Scaled Conjugate- Gradient metodu başarılı bir şekilde uygulanmıştır. Gauss-Jacobi ve Scaled Conjugate- Gradient metodları için verimli paralel algoritmalar geliştirilmiştir. Durum-katsayı matrisinin hesaplanmasından sonra her işlemcide kalan sıfırdan farklı durum-katsayı değerlerinin işlemciler arasında tekrar dağıtılması ile hemen hemen ideal yük dağılımı sağlanmıştır.İ Poligon görüntüleme konusunda yapılan çalışmalarda parça uzayında paralelleştirme yaklaşımı ele alınmıştır. Parça uzayında paralelleştirmede ortamı oluşturan parçalar işlemciler arasında dağıtılır. Her işlemci kendi parçalarının üzerinde görüntüleme algo ritmalarını çalıştırır. Daha sonra her işlemcideki resimler birleştirilerek son resim ortaya çıkarılır. Bu çalışmada hiperküp bilgisayarında parça uzayında paralelleştirme algoritmaları geliştirilmiştir. Resimlerin birleştirilmesi sırasında işlemciler arasında iletişim hacmini azaltan verimli algoritmalar önerilmiştir, işlemciler arasındaki mesajların kopuk kopuk olmasını önlemek için değiştirilmiş bir görüntüleme algoritması önerilmiştir. Hacim görüntülemede ise ekran uzayında paralelleştirme yaklaşımı araştırılmıştır. Bu yaklaşımda ekran uzayı işlemciler arasında bölünür. Her işlemci kendisine ait olan ekran parçası üzerinde görüntüleme algoritmasını çalıştırtır. Ekranın bölünmesine göre hacim elemanlarıda işlemciler arasında dağıtılır. Bu çalışmada, çeşitli ekran uzayında bölme yöntemleri incelendi ve geliştirildi. Bu yöntemler ekranı hacim elemanlarının ekrandaki dağılımlarına göre bölerek daha iyi yük dağılımı sağlar. Bu yöntemlerden çizge parçalamaya dayalı bölme ve Hubert eğrisine dayalı bölme yeni yöntemlerdir. Bu yöntemler deneysel olarak karşılaştırılmıştır. Ayrıca, bu çalışmada incelenen ve geliştirilen yöntemler poligon görüntülemeye dayalı bir hacim görüntüleme algoritmasına başarı ile uygulanmışladır.

ABSTRACT PARALLEL RENDERING ALGORITHMS FOR DISTRIBUTED-MEMORY MULTICOMPUTERS Tahsin Mertefe Kurç Ph.D. in Computer Engineering and Information Science Supervisors: Assoc. Prof. Cevdet Ay kanat and Prof. Bülent Özgüç June 1997 In this thesis, utilization of distributed memory multicomputers in gathering radios- ity, polygon rendering and volume rendering is investigated. In parallel gathering radiosity, the target issues are the parallelization of the com putation of the form-factor matrix and solution phases on hypercube-connected multi- computers. Interprocessor communication in matrix computation phase is decreased by sharing the memory space between matrix elements and the scene data. A demand- driven algorithm is proposed for better computational load balance during calculation of form- factors. Gauss- Jacobi (GJ) iterative algorithm is used by all of the previous works in the solution phase. We apply more efficient Scaled Conjugate- Gradient (SCG) algorithm in the solution phase. Parallel algorithms were developed for GJ and SCG algorithms for hypercube-connected multicomputers. In addition, load balancing in the solution phase is investigated. An efficient data redistribution scheme is proposed. Thisscheme achieves perfect load balance in matrix- vector product operations in the solution phase. Object-space parallelism is investigated for parallel polygon rendering on hypercube- connected multicomputers. Briefly, in object-space parallelism, scene data is partitioned into disjoint sets among processors. Each processor performs the rendering of its local partition of primitives. After this local rendering phase, full screen partial images in each processor are merged to obtain the final image. This phase is called pixel merging phase. Pixel merging phase requires interprocessor communication to merge partial images. In this work, hypercube interconnection topology and message passing structure are exploited to merge partial images efficiently. Volume of communication in pixel merging phase is decreased by only exchanging local foremost pixels in each processor after local rendering phase. For this purpose, a modified scanline z-buffer algorithm is proposed for the local rendering phase. This algorithm avoids message fragmentation by storing local foremost pixels in consecutive memory locations. In addition, it eliminates initialization of z-buffer, which is a sequential overhead to parallel execution. For pixel merging phase, we propose two schemes referred to here as pairwise exchange scheme and all-to- all personalized communication scheme, which are suited to the hypercube topology. We investigate load balancing in pixel merging phase. Two heuristics, recursive subdivision and heuristic bin packing, were proposed to achieve better load balancing in pixel merging phase. These heuristics are adaptive such that they utilize the distribution of foremost pixels on the screen to subdivide the screen in the pixel merging phase. Image-space parallelism is investigated for parallel volume rendering of unstructured grids. In image-space parallelism, the screen is subdivided into regions. Each processor is assigned one or more subregions. The primitives (e.g., tetrahedrals) in the volume data are distributed among processors according to screen subdivision and processor-subregion assignments. Then, each processor renders its local subregions. The target topic in this work is the adaptive subdivision of the screen. Adaptive subdivision issue has not been investigated in parallel volume rendering of unstructured grids before. Only some re searchers utilized adaptive subdivision in parallel polygon rendering and ray tracing. In this work, several algorithms are proposed to subdivide the screen adaptively. The algorithms presented in this work can be grouped into two classes: 1-dimensional arraybased algorithms and 2-diraensional mesh based algorithms. Among the 2-dimensional mesh based algorithms, graph partitioning based subdivision and Hubert curve based subdivision algorithms are new approaches in parallel rendering field. An experimental comparison of the subdivision algorithms are performed on a common frame work. The subdivision algorithms were employed in the parallelization of a volume rendering algo rithm, which is a polygon rendering based algorithm. In the previous works on parallel polygon rendering, only the number of primitives in a subregion were used to approxi mate the work load of the subregion. We experimentally show that this approximation is not enough. Better speedup values can be obtained by utilizing other criteria such as number of pixels, number of spans in a region. By utilizing these additional criteria, the speedup values are almost doubled.