Understanding Oversmoothing in Graph Neural Networks (GNNs): Insights from Two Theoretical Studies

Google TechTalks
Published on 26 Jan 2024

A Google TechTalk, presented by Xinyi Wu, 2024-01-18, as part of the Google Algorithm Seminar.

ABSTRACT: Oversmoothing in Graph Neural Networks (GNNs) refers to the phenomenon where increasing network depth leads to homogeneous node representations. Over the last few years, it has remained one of the central challenges in building more powerful GNNs. In this talk, I will discuss two recent papers on this phenomenon and provide some new insights.

The first work studies why oversmoothing happens at a relatively shallow depth in GNNs. By carefully analyzing the oversmoothing mechanisms in a stylized formulation, we distinguish between adverse mixing, which homogenizes nodes across different classes, and beneficial denoising within the same class. We quantify these two effects on random graphs sampled from the Contextual Stochastic Block Model (CSBM) and show that oversmoothing occurs once the mixing effect starts to dominate the denoising effect. We establish that the number of layers required for this transition is O(log N / log log N) for sufficiently dense graphs with N nodes. We also extend our analysis to the effects of Personalized PageRank (PPR), or equivalently, of initial residual connections, on oversmoothing, and shed light on when and why they might not be an ideal solution to the problem.

In the second work, we study oversmoothing in attention-based GNNs, such as Graph Attention Networks (GATs) and transformers. Treating attention-based GNNs as dynamical systems, we show that the graph attention mechanism cannot prevent oversmoothing and loses expressive power exponentially. From a technical point of view, the proposed framework significantly extends existing results on oversmoothing and can account for asymmetric, state-dependent, and time-varying aggregation operators, as well as a wide range of common nonlinear activation functions, such as ReLU, LeakyReLU, GELU, and SiLU.

The talk is based on the following papers: https://arxiv.org/abs/2212.10701 and https://arxiv.org/abs/2305.16102. Joint work with Amir Ajorlou (MIT), Zhengdao Chen (NYU/Google), William Wang (MIT), Zihui Wu (Caltech), and Ali Jadbabaie (MIT).

ABOUT THE SPEAKER: Xinyi Wu is a fourth-year Ph.D. student in the Institute for Data, Systems, and Society (IDSS) at the Massachusetts Institute of Technology (MIT), advised by Professor Ali Jadbabaie. She is affiliated with the Laboratory for Information and Decision Systems (LIDS) and is a recipient of the MIT Michael Hammer Fellowship. She is interested in applied graph theory, dynamical systems, networks, and machine learning on graphs. Her work on oversmoothing in GNNs was selected as a Spotlight paper at NeurIPS 2023.
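As a rough companion to the abstract, the sketch below (not code from the papers; the graph size, edge probabilities, feature dimension, and depths are illustrative assumptions) samples a two-class CSBM-like graph in NumPy, repeatedly applies GCN-style normalized-adjacency propagation, and tracks within-class versus between-class feature distances, i.e. the "denoising" and "mixing" effects the first paper contrasts.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Sample a two-class, CSBM-like random graph (illustrative parameters) ---
N, d = 200, 16                     # number of nodes, feature dimension (assumed)
p_in, p_out = 0.10, 0.02           # intra-/inter-class edge probabilities (assumed)
labels = np.repeat([0, 1], N // 2)
signs = np.where(labels == 0, 1.0, -1.0)
means = np.outer(signs, np.ones(d)) / np.sqrt(d)       # class-dependent mean
X = means + rng.normal(size=(N, d))                    # mean + Gaussian noise

same_class = labels[:, None] == labels[None, :]
edge_probs = np.where(same_class, p_in, p_out)
A = np.triu(rng.random((N, N)) < edge_probs, 1).astype(float)
A = A + A.T                                            # undirected, simple graph

# GCN-style symmetric-normalized propagation matrix with self-loops
A_hat = A + np.eye(N)
d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
P = d_inv_sqrt[:, None] * A_hat * d_inv_sqrt[None, :]

def class_distances(H, labels):
    """Mean pairwise feature distance within a class vs. across classes."""
    dists = np.linalg.norm(H[:, None, :] - H[None, :, :], axis=-1)
    same = labels[:, None] == labels[None, :]
    off_diag = ~np.eye(len(labels), dtype=bool)
    return dists[same & off_diag].mean(), dists[~same].mean()

H = X.copy()
for layer in range(1, 21):
    H = P @ H                                          # one propagation step
    within, between = class_distances(H, labels)
    if layer in (1, 2, 4, 8, 16, 20):
        print(f"layer {layer:2d}: within={within:.3f} "
              f"between={between:.3f} ratio={between / within:.2f}")
# Typical behavior: within-class distances shrink quickly at first (denoising),
# then between-class separation also collapses and the ratio drifts back
# toward 1 as all representations homogenize (oversmoothing).
```

A second short sketch, under the same assumptions, swaps in Personalized-PageRank-style propagation with an initial residual connection, the mitigation the abstract mentions:

```python
# PPR-style propagation with an initial residual connection (in the spirit of
# APPNP-type schemes): each step mixes propagated features with the original
# input X. Reuses P, X, labels, and class_distances from the sketch above;
# alpha is an illustrative teleport probability, not a value from the papers.
alpha = 0.1
H = X.copy()
for layer in range(1, 21):
    H = (1 - alpha) * (P @ H) + alpha * X              # PPR / initial residual step
    within, between = class_distances(H, labels)
    if layer in (1, 4, 8, 16, 20):
        print(f"PPR layer {layer:2d}: within={within:.3f} "
              f"between={between:.3f} ratio={between / within:.2f}")
# The residual term keeps representations anchored to X, so they no longer
# collapse to a single point; the talk examines when and why this alone may
# still fall short as a fix for oversmoothing.
```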
