Foundation Models for Graph and Geometric Deep Learning

Foundation models (FMs) in language, vision, and audio have dominated machine learning research in 2024, while FMs for graph-structured data have been slower to develop. This article posits that the era of Graph FMs is upon us and offers examples of their current applications.

Written and edited by Michael Galkin and Michael Bronstein, with significant contributions from Jianan Zhao, Haitao Mao, and Zhaocheng Zhu.

Table of Contents

  1. What are Graph Foundation Models and how to build them?
  2. Node Classification: GraphAny
  3. Link Prediction: Not yet
  4. Knowledge Graph Reasoning: ULTRA and UltraQuery
  5. Algorithmic Reasoning: Generalist Algorithmic Learner
  6. Geometric and AI4Science Foundation Models
    1. ML Potentials: JMP-1, DPA-2 for molecules, MACE-MP-0 and MatterSim for inorganic crystals
    2. Protein LMs: ESM-2
    3. 2D Molecules: MiniMol and MolGPS
  7. Expressivity & Scaling Laws: Do Graph FMs scale?
  8. The Data Question: What should be scaled? Is there enough graph data to train Graph FMs?
  9. Key Takeaways

What are Graph Foundation Models and how to build them?

To clarify what constitutes a "foundational" model, we define it as follows:

“A Graph Foundation Model is a single (neural) model that learns transferable graph representations that can generalize to any new, previously unseen graph.”

Graphs vary widely in size, connectivity, and features, which makes it hard for standard Graph Neural Networks (GNNs) to qualify as foundation models: a GNN trained on one graph rarely transfers to another. Conversely, heuristics like Label Propagation run on any graph but involve no learning at all, so they fail the definition too. Whether Large Language Models (LLMs) can serve as Graph FMs by serializing graphs into sequences, and whether such serializations can preserve graph symmetries such as node permutation invariance, remains an open question.

A critical design question for Graph FMs is what makes a graph representation transferable. As highlighted in a recent ICML 2024 position paper, LLMs get much of their transferability for free: text in any language can be encoded into the same fixed token vocabulary. Graphs have no such universal vocabulary, so a transferable featurization must cope with settings as varied as:

  • A single large graph with specific node features and labels (common in node classification)
  • A large graph lacking node features and classes but containing meaningful edge types (typical for link prediction and KG reasoning)
  • Many smaller graphs with or without features and graph-level labels (common in graph classification and regression)

Research questions remain for the graph learning community regarding the design of Graph FMs:

  1. How can we generalize across graphs with diverse features?
  2. How can we generalize across different prediction tasks?
  3. What should the expressivity of foundational models be?

The following sections will demonstrate that Graph FMs are already in use for specific tasks and domains, highlighting their design choices regarding transferable features and the practical advantages they provide for inductive inference on new, unseen graphs.

Node Classification: GraphAny

Historically, GNN-based node classifiers have been limited to a specific graph dataset. For example, a GNN trained on the Cora graph (2.7K nodes, 1433-dimensional features) cannot easily adapt to another graph like Citeseer, which has 3703-dimensional features and a different number of classes.

GraphAny represents a significant step forward as the first Graph FM for node classification: a single pre-trained model runs on any graph, regardless of feature dimension or number of classes. A GraphAny model pre-trained on a single dataset (Wisconsin) generalizes to 30+ other graphs of varying sizes and features, on average outperforming GCN and GAT models trained from scratch on each of them.

Setup: Semi-supervised node classification: given a graph G, node features X, and a few labeled nodes from C classes, predict the labels of the remaining target nodes. Neither the feature dimension nor the number of classes C is fixed across graphs.

Transferability: Instead of trying to learn a universal latent space for all possible graphs, GraphAny models the interactions between the predictions of a set of spectral filters. Each filter defines a LinearGNN whose weights are obtained in closed form (by least squares on the labeled nodes, with no gradient descent), yielding one prediction per filter for every node. The final answer is an attention-weighted combination of these predictions, where the attention is parameterized by pairwise distances between them, a quantity invariant to the number and ordering of classes. That attention module is the only learnable component.
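
To make this concrete, here is a minimal NumPy sketch of the inference pattern (names and the filter set are illustrative, and a fixed distance heuristic stands in for GraphAny's learned attention MLP):

```python
# Minimal sketch of GraphAny-style inference. Assumes a row-normalized
# adjacency matrix A; the filter set and fusion rule are simplified.
import numpy as np

def linear_gnn_predictions(A, X, Y_train, train_idx):
    """Solve each LinearGNN in closed form and predict for all nodes.

    Filters: X, AX, A^2 X (an illustrative subset).
    Y_train: one-hot labels of shape (num_train, C).
    Returns a list of (num_nodes, C) prediction matrices, one per filter.
    """
    preds = []
    F = X
    for _ in range(3):
        # Least-squares fit on labeled nodes: F[train_idx] @ W ~ Y_train
        W, *_ = np.linalg.lstsq(F[train_idx], Y_train, rcond=None)
        preds.append(F @ W)
        F = A @ F  # next spectral filter
    return preds

def fuse(preds, temperature=1.0):
    """Combine filter predictions with distance-based attention.

    GraphAny learns an MLP over pairwise prediction distances; this sketch
    uses a fixed rule (lower average distance -> higher weight) instead.
    """
    P = np.stack(preds)                                   # (F, N, C)
    d = np.linalg.norm(P[:, None] - P[None, :], axis=-1)  # (F, F, N)
    scores = -d.mean(axis=1) / temperature                # (F, N)
    att = np.exp(scores - scores.max(axis=0))
    att /= att.sum(axis=0)                                # softmax over filters
    return (att[..., None] * P).sum(axis=0)               # (N, C)
```

Because both the closed-form fit and the distance features are independent of the feature dimension and class count, the same procedure applies unchanged to any new graph.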

Knowledge Graph Reasoning: ULTRA and UltraQuery

Each knowledge graph comes with its own set of entities and relations, so traditional embedding-based reasoning models are tied to the graph they were trained on and cannot transfer to new, unseen graphs. ULTRA is a pioneering foundation model for knowledge graph reasoning: a single pre-trained model transfers to any multi-relational graph, with no training on the target graph's entities or relations.

Setup: Given a multi-relational graph G with |E| nodes and |R| edge types, answer queries of the form (head, relation, ?) by returning a probability distribution over all nodes. Both vocabularies, entities and relations, differ from graph to graph.

Transferability: ULTRA's key observation is that while entities and relations change from graph to graph, the ways relations interact do not. It builds a "graph of relations", whose nodes are the original relation types and whose edges encode four fundamental interactions (head-to-head, head-to-tail, tail-to-head, tail-to-tail), and learns relative relation representations conditioned on the query. Because these representations depend only on interaction structure, never on relation identities, they transfer to any multi-relational graph.
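
The sketch below shows how such a relation graph can be derived from a triple list (an illustrative re-implementation, not the authors' code; details like self-loop handling are simplified):

```python
# Sketch of ULTRA-style relation-graph construction from (head, rel, tail)
# triples: two relations are connected if they share an entity, with the
# edge type recording in which roles (head/tail) they share it.
from itertools import product
from collections import defaultdict

H2H, H2T, T2H, T2T = "h2h", "h2t", "t2h", "t2t"

def build_relation_graph(triples):
    """Return edges (r1, interaction, r2) of the graph of relations."""
    rels_by_head = defaultdict(set)
    rels_by_tail = defaultdict(set)
    for h, r, t in triples:
        rels_by_head[h].add(r)
        rels_by_tail[t].add(r)

    edges = set()
    for e in set(rels_by_head) | set(rels_by_tail):
        heads, tails = rels_by_head[e], rels_by_tail[e]
        edges |= {(r1, H2H, r2) for r1, r2 in product(heads, heads)}
        edges |= {(r1, H2T, r2) for r1, r2 in product(heads, tails)}
        edges |= {(r1, T2H, r2) for r1, r2 in product(tails, heads)}
        edges |= {(r1, T2T, r2) for r1, r2 in product(tails, tails)}
    return edges

# Example: "authored" ends at p1 where "cites" starts -> a t2h edge
triples = [("u1", "authored", "p1"), ("p1", "cites", "p2")]
print(("authored", T2H, "cites") in build_relation_graph(triples))  # True
```

A GNN run over this relation graph produces relation embeddings for any vocabulary, which then condition an entity-level reasoner (NBFNet-style) to score answer candidates.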

Algorithmic Reasoning: Generalist Algorithmic Learner

The Generalist Algorithmic Learner (Ibarz et al.) is a single GNN processor trained to execute the 30 classical algorithms of the CLRS benchmark (sorting, searching, dynamic programming, graph algorithms, and more) within a shared latent space. Task-specific encoders map each algorithm's inputs into that space and task-specific decoders map back out, while the processor in the middle is shared, showing that one learned "algorithmic core" can serve many related algorithms and compete with single-task specialist models.
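
A minimal PyTorch sketch of this encode-process-decode pattern (dimensions and task names are illustrative; the real processor is a message-passing GNN over the task's graph, not an MLP):

```python
# Per-task encoders/decoders around one shared processor: the structural
# pattern behind the generalist algorithmic learner.
import torch
import torch.nn as nn

class SharedProcessorModel(nn.Module):
    def __init__(self, tasks: dict, hidden: int = 128):
        super().__init__()
        # tasks maps a task name to its (input_dim, output_dim)
        self.encoders = nn.ModuleDict(
            {name: nn.Linear(d_in, hidden) for name, (d_in, _) in tasks.items()})
        self.decoders = nn.ModuleDict(
            {name: nn.Linear(hidden, d_out) for name, (_, d_out) in tasks.items()})
        # One processor shared by every task
        self.processor = nn.Sequential(
            nn.Linear(hidden, hidden), nn.ReLU(), nn.Linear(hidden, hidden))

    def forward(self, task: str, x: torch.Tensor) -> torch.Tensor:
        h = self.encoders[task](x)     # task-specific inputs -> shared space
        h = self.processor(h)          # shared algorithmic "core"
        return self.decoders[task](h)  # shared space -> task-specific outputs

model = SharedProcessorModel({"sorting": (8, 8), "shortest_paths": (4, 1)})
out = model("sorting", torch.randn(16, 8))   # shape (16, 8)
```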

Geometric and AI4Science Foundation Models

In the realm of Geometric Deep Learning, foundation models are emerging as key tools for molecules, materials, and proteins. Unlike the 2D graphs discussed above, these domains deal with 3D physical structures, so models must capture geometry (atom positions, conformations) in addition to connectivity.

ML Potentials: JMP-1, DPA-2 for molecules, MACE-MP-0 and MatterSim for inorganic crystals

Setup: Given a 3D structure (atom types and coordinates), predict the total energy of the system and per-atom forces. Since forces are the negative gradient of energy with respect to atomic positions, a model that predicts energy can obtain forces by differentiation.

Transferability: Chemical elements form a small, fixed vocabulary, so a single model can be trained across very different atomistic structures. JMP-1 and DPA-2 target molecular systems, while MACE-MP-0 and MatterSim cover inorganic crystals across wide ranges of elements, temperatures, and pressures; crucially, these models remain stable when driving molecular dynamics simulations on unseen systems.
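
The setup can be illustrated with a toy energy model in PyTorch: a stand-in for JMP-1/MACE-style potentials (the architecture here is purely illustrative), with forces recovered from the energy via autograd:

```python
import torch
import torch.nn as nn

class ToyEnergyModel(nn.Module):
    """Toy potential: energy is a sum of pairwise contributions computed from
    interatomic distances (rotation- and translation-invariant by design)."""
    def __init__(self, num_elements: int = 100, hidden: int = 64):
        super().__init__()
        self.embed = nn.Embedding(num_elements, hidden)  # fixed element vocabulary
        self.mlp = nn.Sequential(nn.Linear(hidden + 1, hidden), nn.SiLU(),
                                 nn.Linear(hidden, 1))

    def forward(self, z: torch.Tensor, pos: torch.Tensor) -> torch.Tensor:
        i, j = torch.triu_indices(len(z), len(z), offset=1)  # all atom pairs
        dist = (pos[i] - pos[j]).norm(dim=-1, keepdim=True)  # (pairs, 1)
        pair = torch.cat([self.embed(z)[i] + self.embed(z)[j], dist], dim=-1)
        return self.mlp(pair).sum()                          # scalar total energy

model = ToyEnergyModel()
z = torch.tensor([8, 1, 1])                    # atomic numbers: O, H, H
pos = torch.randn(3, 3, requires_grad=True)    # 3 atoms in 3D space
energy = model(z, pos)
forces = -torch.autograd.grad(energy, pos)[0]  # per-atom forces, shape (3, 3)
```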

Protein LMs: ESM-2

Setup: Masked language modeling over protein sequences: given a sequence of amino acid tokens with some positions masked, predict the masked residues. Since all proteins are built from the same 20 standard amino acids, any sequence, including unseen combinations, uses the same vocabulary.

Transferability: ESM-2, trained on tens of millions of unique protein sequences, is the de facto workhorse protein LM: its learned representations transfer to structure prediction (as in ESMFold) and to a wide range of downstream protein tasks.
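
For illustration, masked-residue prediction with ESM-2 via HuggingFace transformers (assuming the facebook/esm2_t12_35M_UR50D checkpoint id; check the hub for the exact name and larger variants):

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

name = "facebook/esm2_t12_35M_UR50D"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForMaskedLM.from_pretrained(name)

seq = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"   # an arbitrary example sequence
inputs = tokenizer(seq, return_tensors="pt")
inputs["input_ids"][0, 5] = tokenizer.mask_token_id  # mask one residue
with torch.no_grad():
    logits = model(**inputs).logits
pred = logits[0, 5].argmax()
print(tokenizer.convert_ids_to_tokens(pred.item()))  # most likely residue
```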

2D Molecules: MiniMol and MolGPS

Setup: Given a 2D molecular graph (atoms as nodes, bonds as edges), predict graph-level molecular properties.

Transferability: The vocabulary of atom types (the periodic table) and bond types is small and fixed, so the same featurization applies to every molecule. Pre-trained on millions of molecular graphs, MiniMol and MolGPS provide embeddings that transfer to a wide range of downstream property prediction tasks.
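
This fixed-vocabulary featurization is easy to see with RDKit (a generic sketch, not the featurizer of either model):

```python
# Every molecule maps onto the same atom-type and bond-type vocabularies,
# which is what makes pre-trained 2D molecular models transferable.
from rdkit import Chem

def featurize(smiles: str):
    mol = Chem.MolFromSmiles(smiles)
    # Node features: atomic numbers index a fixed embedding table
    atoms = [a.GetAtomicNum() for a in mol.GetAtoms()]
    # Edge list with bond types drawn from a small fixed vocabulary
    bonds = [(b.GetBeginAtomIdx(), b.GetEndAtomIdx(), str(b.GetBondType()))
             for b in mol.GetBonds()]
    return atoms, bonds

print(featurize("CCO"))  # ethanol: ([6, 6, 8], [(0, 1, 'SINGLE'), (1, 2, 'SINGLE')])
```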

Expressivity & Scaling Laws: Do Graph FMs scale?

Understanding how transformers and GNNs scale on graph data is critical for building larger Graph FMs. Scaling laws for transformers on sequential data are well established; for GNNs, early studies (Liu et al.; Frey et al.) suggest that loss likewise improves predictably with model and dataset size. Message passing also scales linearly in the number of edges, rather than quadratically in sequence length, which is attractive for large graphs.
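
As a refresher on what such scaling-law studies actually fit, here is a sketch with synthetic (model size, loss) pairs, purely to illustrate the log-log linear fit:

```python
# Fit a neural scaling law L(N) = a * N^(-b); data points are synthetic.
import numpy as np

params = np.array([1e5, 1e6, 1e7, 1e8])   # model sizes (synthetic)
loss = np.array([2.1, 1.6, 1.25, 0.98])   # observed losses (synthetic)

# A power law is linear in log-log space: log L = log a - b * log N
slope, intercept = np.polyfit(np.log(params), np.log(loss), deg=1)
a, b = np.exp(intercept), -slope
print(f"L(N) ~ {a:.2f} * N^(-{b:.3f})")

# Extrapolate to a 10x larger model (extrapolation beyond the fitted
# range is, as always, speculative)
print(f"predicted loss at 1e9 params: {a * 1e9 ** (-b):.3f}")
```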

The Data Question: What should be scaled? Is there enough graph data to train Graph FMs?

Scaling efforts should focus on the diversity of graph structures and patterns rather than on raw quantity alone. Unlike text, there is no web-scale corpus of graphs, so it remains an open question whether enough data exists to train general-purpose Graph FMs, or whether sample-efficient architectures will have to compensate.

Key Takeaways

  1. Generalization across heterogeneous graphs remains challenging.
  2. No universal model currently exists for performing multiple prediction tasks in a zero-shot manner.
  3. Model expressivity needs to balance performance with computational efficiency.
  4. The data landscape for graph models is limited, necessitating advancements in sample-efficient architectures.

References

  1. Mao, Chen, et al. Graph Foundation Models Are Already Here. ICML 2024
  2. Morris et al. Future Directions in Foundations of Graph Machine Learning. ICML 2024
  3. Zhao et al. GraphAny: A Foundation Model for Node Classification on Any Graph. arXiv 2024. Code on GitHub
  4. Dong et al. Universal Link Predictor by In-Context Learning on Graphs. arXiv 2024
  5. Zhang et al. Labeling Trick: A Theory of Using Graph Neural Networks for Multi-Node Representation Learning. NeurIPS 2021
  6. Chamberlain, Shirobokov, et al. Graph Neural Networks for Link Prediction with Subgraph Sketching. ICLR 2023
  7. Zhu et al. Neural Bellman-Ford Networks: A General Graph Neural Network Framework for Link Prediction. NeurIPS 2021
  8. Galkin et al. Towards Foundation Models for Knowledge Graph Reasoning. ICLR 2024
  9. Galkin et al. Zero-shot Logical Query Reasoning on any Knowledge Graph. arXiv 2024. Code on GitHub
  10. Ibarz et al. A Generalist Neural Algorithmic Learner. LoG 2022
  11. Markeeva, McLeish, Ibarz, et al. The CLRS-Text Algorithmic Reasoning Language Benchmark. arXiv 2024
  12. Shoghi et al. From Molecules to Materials: Pre-training Large Generalizable Models for Atomic Property Prediction. ICLR 2024
  13. Zhang, Liu, et al. DPA-2: Towards a Universal Large Atomic Model for Molecular and Material Simulation. arXiv 2023
  14. Batatia et al. A Foundation Model for Atomistic Materials Chemistry. arXiv 2024
  15. Yang et al. MatterSim: A Deep Learning Atomistic Model Across Elements, Temperatures and Pressures. arXiv 2024
  16. Rives et al. Biological Structure and Function Emerge from Scaling Unsupervised Learning to 250 Million Protein Sequences. PNAS 2021
  17. Lin, Akin, Rao, Hie, et al. Language Models of Protein Sequences at the Scale of Evolution Enable Accurate Structure Prediction. Science 2023. Code
  18. Morgan, H. L. The Generation of a Unique Machine Description for Chemical Structures — A Technique Developed at Chemical Abstracts Service. J. Chem. Doc. 5:107–113, 1965
  19. Kläser, Banaszewski, et al. MiniMol: A Parameter-Efficient Foundation Model for Molecular Learning. arXiv 2024
  20. Sypetkowski, Wenkel, et al. On the Scalability of GNNs for Molecular Graphs. arXiv 2024
  21. Liu et al. Neural Scaling Laws on Graphs. arXiv 2024
  22. Frey et al. Neural Scaling of Deep Chemical Models. Nature Machine Intelligence 2023
