Module 13 - Graph Neural Networks

Why Graphs Now?

The world is relational. Molecules are atoms connected by bonds. Social platforms are users connected by follows. Fraud rings are accounts connected by shared devices and transactions. Citation networks are papers connected by references. Recommendation engines live on user-item bipartite graphs.

For years, machine learning ignored these relationships. We flattened everything into feature vectors and hoped the model would figure out the structure. It worked - until it did not. Drug discovery models trained on molecular fingerprints missed interaction effects. Fraud detectors trained on per-transaction features missed coordinated rings. Recommendation models trained on user histories missed the "friends who bought this also bought" signal.

Graph Neural Networks (GNNs) fix this. Instead of treating each node in isolation, GNNs propagate information through graph structure. Each node gathers signals from its neighbors, which gathered signals from their neighbors. The result is a representation that encodes not just what a node is, but where it sits in the graph and what surrounds it.
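
To make the "neighbors of neighbors" idea concrete, here is a minimal sketch in plain Python. It is not a real GNN layer (there are no learned weights); the path graph, the scalar features, and the `propagate` helper are all hypothetical choices made for illustration. Each round, a node averages its own value with its neighbors' values, so a signal placed on node 3 reaches node 2 after one round and node 1 only after two.

```python
# Toy undirected path graph 0 - 1 - 2 - 3, stored as an adjacency list.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

# One scalar feature per node; only node 3 carries a nonzero signal.
features = {0: 0.0, 1: 0.0, 2: 0.0, 3: 1.0}


def propagate(feats, nbrs):
    """One round: each node averages its own value with its neighbors' values."""
    out = {}
    for node, adj in nbrs.items():
        vals = [feats[node]] + [feats[n] for n in adj]
        out[node] = sum(vals) / len(vals)
    return out


one_hop = propagate(features, neighbors)   # node 1 has not yet seen node 3
two_hop = propagate(one_hop, neighbors)    # node 1 now carries part of node 3's signal

print(one_hop)
print(two_hop)
```

Node 0 sits three hops from node 3, so it still reads zero after two rounds. Stacking more rounds widens each node's receptive field by one hop per round.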

This module goes from first principles to production-scale GNNs. You will understand why spectral convolutions led to GCN, why attention helps in GAT, how GraphSAGE makes inductive learning on unseen nodes practical, and how message passing unifies the entire family. You will train GNNs with PyTorch Geometric and understand systems like PinSage that run on billions of nodes.


What Graph Data Looks Like in Production

Social network: nodes = users, edges = follows / friends
Molecular graph: nodes = atoms, edges = chemical bonds
Knowledge graph: nodes = entities, edges = named relations
Citation network: nodes = papers, edges = citations
Fraud graph: nodes = accounts / devices / IPs, edges = shared attributes
E-commerce: nodes = users + items, edges = purchases / clicks
Protein interaction: nodes = proteins, edges = physical interactions
Road network: nodes = intersections, edges = road segments

Each domain has different structural properties: sparse vs dense, homogeneous vs heterogeneous, static vs dynamic. GNN design choices differ accordingly.
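
As a rough illustration of these properties, the sketch below stores a tiny e-commerce graph as a typed edge list and checks two of them: density and heterogeneity. The node names, edge types, and the simple density formula (edges divided by all possible undirected node pairs) are assumptions for this example, not a prescribed schema.

```python
# Hypothetical toy e-commerce graph: typed nodes and typed edges.
node_type = {"u1": "user", "u2": "user", "i1": "item", "i2": "item", "i3": "item"}
edges = [
    ("u1", "i1", "purchase"),
    ("u1", "i2", "click"),
    ("u2", "i2", "purchase"),
]

num_nodes = len(node_type)
max_undirected_edges = num_nodes * (num_nodes - 1) // 2

# Density: fraction of possible node pairs that are actually connected.
density = len(edges) / max_undirected_edges

# Heterogeneous if there is more than one node type or more than one edge type.
edge_types = {rel for _, _, rel in edges}
is_heterogeneous = len(set(node_type.values())) > 1 or len(edge_types) > 1

print(f"density={density:.2f}, heterogeneous={is_heterogeneous}")
```

Real production graphs are usually far sparser than this toy one, which is why edge lists rather than dense adjacency matrices are the common storage format.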


Lessons

| # | Lesson | Core Concepts |
|---|--------|---------------|
| 01 | Why Graphs for ML | Relational data, graph ML tasks, why CNNs fail on graphs |
| 02 | Graph Representation for ML | Adjacency matrix, edge list, PyG Data object, message passing intuition |
| 03 | Graph Convolutional Networks | Spectral convolution, GCN layer, symmetric normalization, over-smoothing |
| 04 | Graph Attention Networks | Attention over neighbors, GATv2, multi-head attention |
| 05 | GraphSAGE and Inductive Learning | Inductive vs transductive, neighbor sampling, PinSage |
| 06 | Message Passing Neural Networks | MPNN framework, edge features, graph readout, 1-WL expressiveness |
| 07 | GNNs for Recommendation | User-item bipartite graph, LightGCN, NGCF |
| 08 | Knowledge Graph Embeddings | TransE, RotatE, DistMult, link prediction |

Key Algorithms at a Glance

GCN Layer

$$H^{(l+1)} = \sigma\!\left(\tilde{D}^{-1/2} \tilde{A}\, \tilde{D}^{-1/2} H^{(l)} W^{(l)}\right)$$

where $\tilde{A} = A + I$ (self-loops added) and $\tilde{D}$ is the degree matrix of $\tilde{A}$.
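
The update rule can be sketched in a few lines of NumPy. This is a dense, single-layer illustration rather than how PyTorch Geometric implements `GCNConv`; the function name `gcn_layer`, the choice of ReLU for $\sigma$, and the toy inputs are assumptions made for the example.

```python
import numpy as np


def gcn_layer(A, H, W):
    """One GCN layer: ReLU(D~^{-1/2} (A + I) D~^{-1/2} H W)."""
    A_tilde = A + np.eye(A.shape[0])       # add self-loops
    deg = A_tilde.sum(axis=1)              # degrees of A~ (always >= 1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    # Symmetric normalization: scale entry (i, j) by 1 / sqrt(d_i * d_j).
    A_hat = d_inv_sqrt[:, None] * A_tilde * d_inv_sqrt[None, :]
    return np.maximum(A_hat @ H @ W, 0.0)  # sigma = ReLU (assumed)


# Toy path graph 0 - 1 - 2 with one-hot node features.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
H = np.eye(3)           # 3 nodes, 3 input features
W = np.ones((3, 2))     # hypothetical weights mapping 3 -> 2 features

H_next = gcn_layer(A, H, W)
print(H_next.shape)     # (3, 2)
```

The normalization keeps high-degree nodes from dominating: each neighbor's contribution is divided by the square root of both endpoints' degrees, so summing over many neighbors does not blow up the feature scale.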

GAT Attention Coefficient

$$\alpha_{ij} = \frac{\exp\!\left(\text{LeakyReLU}\!\left(\mathbf{a}^\top [\mathbf{W}\mathbf{h}_i \| \mathbf{W}\mathbf{h}_j]\right)\right)}{\sum_{k \in \mathcal{N}(i)} \exp\!\left(\text{LeakyReLU}\!\left(\mathbf{a}^\top [\mathbf{W}\mathbf{h}_i \| \mathbf{W}\mathbf{h}_k]\right)\right)}$$
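
A minimal NumPy sketch of this coefficient for a single attention head follows. The helper names, the LeakyReLU slope of 0.2, the random toy parameters, and the choice to include node $i$ in its own neighborhood are assumptions for illustration; this is not the `GATConv` implementation.

```python
import numpy as np


def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)


def gat_coefficients(h, W, a, i, neighborhood):
    """Attention weights alpha_ij for every j in `neighborhood` of node i."""
    z = h @ W.T                                   # row n holds W h_n
    scores = np.array([
        leaky_relu(a @ np.concatenate([z[i], z[j]]))
        for j in neighborhood
    ])
    scores = scores - scores.max()                # stabilize the softmax
    exp = np.exp(scores)
    return exp / exp.sum()                        # softmax over the neighborhood


rng = np.random.default_rng(0)
h = rng.normal(size=(4, 3))   # 4 nodes, 3 input features (toy values)
W = rng.normal(size=(2, 3))   # shared projection, 3 -> 2 features
a = rng.normal(size=4)        # attention vector, length 2 * 2

alpha = gat_coefficients(h, W, a, i=0, neighborhood=[0, 1, 2])
print(alpha, alpha.sum())
```

Because the softmax runs only over $\mathcal{N}(i)$, the weights for each node sum to one regardless of its degree, which lets the layer handle neighborhoods of different sizes.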

MPNN Message Passing

$$\mathbf{h}_v^{(t+1)} = U_t\!\left(\mathbf{h}_v^{(t)},\; \sum_{u \in \mathcal{N}(v)} M_t\!\left(\mathbf{h}_v^{(t)}, \mathbf{h}_u^{(t)}, \mathbf{e}_{vu}\right)\right)$$
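
The sketch below implements one such step with pluggable message and update functions. The specific $M_t$ (scale the neighbor's state by a scalar edge feature) and $U_t$ (add the aggregate to the current state) are hypothetical toy choices; in a real MPNN both are learned networks. The sketch also assumes messages have the same dimension as node states.

```python
import numpy as np


def mpnn_step(h, edges, edge_feats, message_fn, update_fn):
    """One step: h_v <- U(h_v, sum over u in N(v) of M(h_v, h_u, e_vu))."""
    agg = np.zeros_like(h)
    for (v, u), e_vu in zip(edges, edge_feats):   # edge (v, u): u sends to v
        agg[v] += message_fn(h[v], h[u], e_vu)
    return np.stack([update_fn(h[v], agg[v]) for v in range(len(h))])


# Hypothetical toy functions standing in for learned M_t and U_t.
def message_fn(h_v, h_u, e_vu):
    return e_vu * h_u


def update_fn(h_v, m_v):
    return h_v + m_v


h = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
edges = [(0, 1), (1, 0), (1, 2), (2, 1)]      # path 0 - 1 - 2, both directions
edge_feats = [0.5, 0.5, 2.0, 2.0]             # one scalar feature per edge

h_next = mpnn_step(h, edges, edge_feats, message_fn, update_fn)
print(h_next)
```

Swapping in different `message_fn` and `update_fn` pairs recovers different architectures, which is the sense in which message passing unifies the family.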

TransE Scoring

$$f(h, r, t) = -\|\mathbf{h} + \mathbf{r} - \mathbf{t}\|$$
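
A short NumPy sketch of the score, using the L2 norm (TransE can also use L1). The two-dimensional embeddings below are hand-picked hypothetical values, not trained vectors. They are chosen so that the true triple satisfies $\mathbf{h} + \mathbf{r} = \mathbf{t}$ exactly.

```python
import numpy as np


def transe_score(h, r, t):
    """TransE plausibility: higher (closer to 0) means h + r lands nearer t."""
    return -np.linalg.norm(h + r - t)


# Hand-picked toy embeddings (hypothetical, for illustration only).
paris = np.array([1.0, 0.0])
capital_of = np.array([0.0, 1.0])
france = np.array([1.0, 1.0])
germany = np.array([4.0, 5.0])

good = transe_score(paris, capital_of, france)    # h + r equals t, distance 0
bad = transe_score(paris, capital_of, germany)    # h + r is far from t, distance 5

print(good, bad)
```

For link prediction, candidate tails are ranked by this score, so the correct entity should receive the highest (least negative) value.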


Tools and Libraries

| Tool | Purpose |
|------|---------|
| PyTorch Geometric (PyG) | Main GNN framework: GCNConv, GATConv, SAGEConv, DataLoader |
| DGL | Alternative framework, good for custom ops and heterogeneous graphs |
| PyKEEN | Knowledge graph embedding library with 40+ models |
| NetworkX | Graph analysis and preprocessing |
| OGB | Open Graph Benchmark: standard datasets and leaderboards |

What You Will Be Able to Do

After this module you can:

  • Explain why graph structure matters and when tabular models fail
  • Implement GCN, GAT, and GraphSAGE with PyTorch Geometric
  • Train node classification, link prediction, and graph classification models
  • Describe the MPNN framework and analyze GNN expressiveness limits
  • Design a recommendation system using LightGCN on a bipartite graph
  • Implement TransE for knowledge graph link prediction
  • Discuss scalability challenges and solutions (neighbor sampling, mini-batch training)

Interview Relevance

GNNs appear in ML engineer and AI engineer interviews at companies with graph-structured data: Meta (social graph), Pinterest (PinSage), Google (Knowledge Graph), pharma/biotech (molecular ML), fintech (fraud graphs). Expect questions on:

  • Explain the GCN update rule. What does the normalization term do?
  • Why does increasing GNN depth cause over-smoothing?
  • What makes GraphSAGE inductive while vanilla GCN is transductive?
  • How does LightGCN differ from standard GCN for recommendations?
  • What does TransE model? What relationship patterns can it not handle?
© 2026 EngineersOfAI. All rights reserved.