WA-GNN: accelerating graph neural networks with tensor core optimization
| dc.contributor.advisor | Atoofian, Ehsan | |
| dc.contributor.author | Liu, Yang | |
| dc.date.accessioned | 2026-02-10T15:14:51Z | |
| dc.date.created | 2025 | |
| dc.date.issued | 2025 | |
| dc.description | Thesis is embargoed until December 12, 2026 | |
| dc.description.abstract | Graph Neural Networks (GNNs) have been widely applied in domains such as social network classification, biological prediction, and financial fraud detection, offering effective solutions for non-Euclidean problems. A typical GNN consists of two major phases: combination and aggregation (illustrated in the sketch following this record). In the combination phase, the original feature vectors are processed by a deep neural network with learnable weights, typically a multi-layer perceptron (MLP), to generate new embeddings. This phase can efficiently utilize Tensor Cores, the specialized matrix computation units in modern Graphics Processing Units (GPUs) optimized for high-throughput computation. In contrast, the aggregation phase collects feature data from neighbouring nodes based on the sparse adjacency structure, leading to irregular data accesses that severely limit Tensor Core utilization. Consequently, the overall performance of GNNs on GPUs is primarily constrained by the inefficient aggregation phase, where sparse computation patterns hinder hardware utilization. To address these challenges, we propose WA-GNN (Warp-Specialization Accelerated GNNs), a Tensor Core-accelerated framework designed to fully exploit Tensor Core capabilities for GNN inference. Our approach introduces the K-Concat data format to reorganize the adjacency matrix into a Tensor Core-friendly layout. A warp specialization mechanism optimizes data loading and computation, while a C-allocation strategy assigns warp workloads. These techniques are integrated into three representative GNN models, Graph Convolutional Network (GCN), Graph Isomorphism Network (GIN), and Graph Attention Network (GAT), each implemented with a customized kernel. Experimental results on multiple benchmark datasets demonstrate that WA-GNN achieves an average end-to-end speedup of 2× over the baselines for the GCN model, with the performance gap widening as the dataset size increases. For the GIN model, WA-GNN delivers performance comparable to the Deep Graph Library (DGL) on the H100 GPU. For the GAT model, WA-GNN achieves an average speedup of 3× across datasets. These results demonstrate WA-GNN's effectiveness in leveraging Tensor Cores for GNN workloads and for similar sparse matrix operations. | |
| dc.identifier.uri | https://knowledgecommons.lakeheadu.ca/handle/2453/5553 | |
| dc.language.iso | en | |
| dc.title | WA-GNN: accelerating graph neural networks with tensor core optimization | |
| dc.type | Thesis | |
| etd.degree.discipline | Electrical and Computer Engineering | |
| etd.degree.grantor | Lakehead University | |
| etd.degree.level | Master | |
| etd.degree.name | Master of Science in Electrical and Computer Engineering | |
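
The two phases described in the abstract map onto the standard GCN propagation rule, H' = σ(Â H W): the dense product H W is the combination phase, and multiplying by the sparse normalized adjacency Â is the aggregation phase. The sketch below is a minimal NumPy/SciPy illustration of that split, not the WA-GNN kernels; the K-Concat layout, warp specialization, and C-allocation strategy from the thesis are not shown, and all names and data are illustrative.

```python
import numpy as np
import scipy.sparse as sp

def gcn_layer(adj_hat: sp.csr_matrix, features: np.ndarray, weight: np.ndarray) -> np.ndarray:
    """One GCN layer split into the two phases described in the abstract."""
    # Combination phase: dense GEMM (features @ weight), the part that maps
    # naturally onto Tensor Cores on a GPU.
    combined = features @ weight
    # Aggregation phase: sparse matrix - dense matrix product (SpMM) over the
    # adjacency structure; the irregular accesses here are what limit
    # Tensor Core utilization in a naive GPU implementation.
    aggregated = adj_hat @ combined
    return np.maximum(aggregated, 0.0)  # ReLU activation

# Toy 4-node ring graph (hypothetical example data).
edges = np.array([[0, 1], [1, 2], [2, 3], [3, 0]])
n_nodes, f_in, f_out = 4, 8, 4

A = sp.coo_matrix((np.ones(len(edges)), (edges[:, 0], edges[:, 1])),
                  shape=(n_nodes, n_nodes))
A = A + A.T + sp.eye(n_nodes)                  # symmetrize and add self-loops
deg = np.asarray(A.sum(axis=1)).ravel()
D_inv_sqrt = sp.diags(1.0 / np.sqrt(deg))
A_hat = (D_inv_sqrt @ A @ D_inv_sqrt).tocsr()  # normalized adjacency

rng = np.random.default_rng(0)
H = rng.standard_normal((n_nodes, f_in)).astype(np.float32)
W = rng.standard_normal((f_in, f_out)).astype(np.float32)

print(gcn_layer(A_hat, H, W).shape)            # -> (4, 4)
```

The combination step is a regular dense matrix multiplication, which is why it can be dispatched to Tensor Cores, while the aggregation step's performance depends on the sparsity pattern of Â, which is the bottleneck the thesis targets.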
