Using homophily to analyze and develop link prediction models with deep learning framework
Khanam, Kazi Zainab
Master of Science
Twitter is a prominent social networking platform where users’ short messages or “tweets” are often used for analysis. However, little attention has been paid to mining information about medical professions, such as detecting users’ occupations from their biographical content. Mining such information can be useful for building recommender systems for cost-effective advertising. Conventional classifiers can be used to predict medical occupations, but they tend to perform poorly because there is a wide variety of occupations. As a result, the main focus of this research is to use deep learning techniques to examine the textual properties of biographical content, the network properties, and the impact of homophily among Twitter users employed in medical professional fields. Chapter 2 presents a survey of the concept of homophily and related social network topics, summarizing the state-of-the-art methods that have been proposed in recent years to identify and measure the effect of homophily in multiple types of social networks. This enables us to find open challenges and directions for future research. In Chapter 3, a model is developed to identify Twitter users working in medical professional fields from the textual properties of their bios. We conduct our analysis by annotating the content of Twitter users’ bios and propose a method that combines word embeddings with state-of-the-art neural network models. Finally, in Chapter 4, the research introduces a link prediction model based on the homophily concept, using the follower and following IDs of the Twitter users identified in Chapter 3. Recent research has centered on analyzing rapidly evolving networks. While predicting links in dynamic networks is difficult, deep learning techniques and network representation learning algorithms, such as Node2vec, have demonstrated significant improvements in prediction accuracy.
However, Node2vec’s Stochastic Gradient Descent (SGD) approach is prone to falling into a local optimum, and as a consequence, Node2vec fails to capture the network’s global structure. To address this problem, we propose NODDLE (integration of NOde2vec anD Deep Learning mEthod), a deep learning framework in which we combine Node2vec’s features and feed them into a neural network with four hidden layers. NODDLE takes advantage of adaptive learning optimizers to improve link prediction performance. Experimental findings on different social network datasets show that our approach outperforms conventional methods.
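The NODDLE pipeline described above can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not confirm: that "combining Node2vec's features" means concatenating the four edge operators commonly used with Node2vec (Hadamard, average, weighted-L1, weighted-L2), that the hidden layer widths are 64, 32, 16 and 8, and that ReLU activations are used. The function names are hypothetical, random vectors stand in for trained Node2vec embeddings, and training with an adaptive optimizer such as Adam is omitted.

```python
import numpy as np

def edge_features(emb, u, v):
    """Concatenate four edge-operator features for the node pair (u, v).

    Assumption: the combined feature vector is Hadamard, average,
    weighted-L1 and weighted-L2 of the two node embeddings.
    """
    a, b = emb[u], emb[v]
    return np.concatenate([a * b, (a + b) / 2.0, np.abs(a - b), (a - b) ** 2])

def init_mlp(in_dim, hidden=(64, 32, 16, 8), seed=0):
    """Create weights for four hidden layers plus one output unit.

    The hidden sizes are illustrative, not taken from the thesis.
    """
    rng = np.random.default_rng(seed)
    dims = (in_dim, *hidden, 1)
    return [
        (rng.normal(0.0, np.sqrt(2.0 / m), size=(m, n)), np.zeros(n))
        for m, n in zip(dims[:-1], dims[1:])
    ]

def predict_link_proba(params, x):
    """Forward pass: ReLU on the hidden layers, sigmoid on the output."""
    h = x
    for w, b in params[:-1]:
        h = np.maximum(h @ w + b, 0.0)
    w, b = params[-1]
    return 1.0 / (1.0 + np.exp(-(h @ w + b)))

# Random vectors stand in for trained Node2vec embeddings (10 nodes, 16 dims).
emb = np.random.default_rng(42).normal(size=(10, 16))
x = np.stack([edge_features(emb, 0, 1), edge_features(emb, 2, 3)])
params = init_mlp(x.shape[1])
proba = predict_link_proba(params, x)  # one link probability per candidate pair
```

Because the weights here are untrained, the probabilities are meaningless as predictions; the sketch only shows how data would flow from node embeddings to a per-pair link score.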