Ulusal Tez Merkezi

Tez No	İndirme	Tez Künye	Durumu
401351		Contribution to the interpretation of evolving communities in complex networks: Application to the study of social interactions / Yazar:KEZİBAN Danışman: PROF. JEAN-FRANCOIS BOULICAUT ; DR. VINCENT LABATUT Yer Bilgisi: INSTITUT NATIONAL DES SCIENCES APPLIQUEES DE LYON / YURTDIŞI ENSTİTÜ Konu:Bilgisayar Mühendisliği Bilimleri-Bilgisayar ve Kontrol = Computer Engineering and Computer Science and Control Anahtar Kelime:	Onaylandı Doktora İngilizce 2014 141 s.



Complex Networks constitute a convenient tool to model real-world complex systems. For this reason, they have become very popular in the last decade. It is possible to encode various types of information in a complex network, depending on the studied system and modeling objectives and needs. In its most basic form, a complex network contains only nodes and links (plain network). But it is also possible to enrich it by setting link directions (directed network), associating attributes to nodes (attributed network) or links, or introducing a temporal dimension (dynamic networks). Many tools exist to study complex networks. Among them, community detection is one of the most important. A community is roughly defined as a group of nodes more connected internally than to the rest of the network. In the literature, this intuitive definition has been formalized in many ways, leading to countless different methods and variants to detect communities. In the large majority of cases, the result of these methods is set of node groups in which each node group corresponds to a community. From the applicative point of view, the meaning of these groups is as important as their detection. However, although the task of detecting communities in itself took a lot of attraction, the problem of interpreting them has not been properly tackled until now. In this thesis, we see the interpretation of communities as a problem independent from the community detection process, consisting in identifying the most characteristic features of communities. We break it down into two sub-problems: 1) finding an appropriate way to represent a community and 2) objectively selecting the most characteristic parts of this representation. To solve them, we take advantage of the information encoded in dynamic attributed networks. We propose a new representation of communities under the form of temporal sequences of topological measures and attribute values associated to individual nodes. We then look for emergent sequential patterns in this dataset, in order to identify the most characteristic community features. Our solution to the interpretation problem takes the form of a five-stepped framework. First, we detect the communities and process the nodal topological measures. Second, we create a database of nodal sequences, thanks to these measures as well as the nodal attributes originating from the studied real-world system. Third, we identify the sequential patterns contained in this database. Fourth, we filter them using some objective criteria related to the notions of prevalence (support measure), emergence (growth rate) and closeness. Fifth, we finally select the most characteristic patterns covering each community. Our framework produces the characteristic sequential patterns and also allows identifying outlier nodes. We perform a validation of our framework on artificially generated dynamic attributed networks. At this occasion, we study its behavior relatively to changes in the temporal evolution of the communities, and to the distribution and evolution of nodal features. We also apply our framework to real-world systems: a DBLP network of scientific collaborations, and a LastFM network of social and musical interactions. Our results show that the detected communities are not completely homogeneous, in the sense several node topic or interests can be identified for a given community. Some communities are composed of smaller groups of nodes which tend to evolve together as time goes by, be it in terms of individual (attributes, topological measures) or relational (community migration) features. The detected anomalies generally fit some generic profiles: nodes misplaced by the community detection tool, nodes relatively similar to their communities, but also significantly different on certain features and/or not synchronized with their community evolution, and finally nodes with completely different interests.