torch_geometric.nn.models.GPSE
- class GPSE(dim_in: int = 20, dim_out: int = 51, dim_inner: int = 512, layer_type: str = 'resgatedgcnconv', layers_pre_mp: int = 1, layers_mp: int = 20, layers_post_mp: int = 2, num_node_targets: int = 51, num_graph_targets: int = 11, stage_type: str = 'skipsum', has_bn: bool = True, head_bn: bool = False, final_l2norm: bool = True, has_l2norm: bool = True, dropout: float = 0.2, has_act: bool = True, final_act: bool = True, act: str = 'relu', virtual_node: bool = True, multi_head_dim_inner: int = 32, graph_pooling: str = 'add', use_repr: bool = True, repr_type: str = 'no_post_mp', bernoulli_threshold: float = 0.5)[source]
Bases: Module
The Graph Positional and Structural Encoder (GPSE) model from the “Graph Positional and Structural Encoder” paper.
The GPSE model consists of (1) a deep GNN composed of stacked message-passing layers, and (2) prediction heads that predict pre-computed positional and structural encodings (PSEs). When used on downstream datasets, these prediction heads are removed and the final fully-connected layer outputs are used as learned PSE embeddings.
GPSE also provides a static method from_pretrained() to load pre-trained GPSE models trained on a variety of molecular datasets:

```python
from torch_geometric.datasets import ZINC
from torch_geometric.nn import GPSE, GPSENodeEncoder
from torch_geometric.nn.models.gpse import precompute_GPSE
from torch_geometric.transforms import AddGPSE

gpse_model = GPSE.from_pretrained('molpcba')

# Option 1: Precompute GPSE encodings in-place for a given dataset:
dataset = ZINC(path, subset=True, split='train')
precompute_GPSE(gpse_model, dataset)

# Option 2: Use the GPSE model with AddGPSE as a pre_transform to save
# the encodings:
dataset = ZINC(path, subset=True, split='train',
               pre_transform=AddGPSE(gpse_model, vn=True,
                                     rand_type='NormalSE'))
```
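To see why Option 2 caches the encodings, recall that a pre_transform is applied once per example when the dataset is first processed, so the (potentially expensive) GPSE forward pass does not run on every epoch. A minimal plain-Python stand-in of that mechanism (the `ToyData`/`ToyDataset` classes and the 4-dimensional dummy encoding below are illustrative only, not PyG classes):

```python
# Toy stand-ins for Data / Dataset to illustrate the pre_transform
# mechanism: the transform runs once at construction time, and its
# result (here a fake 'pestat_GPSE' attribute) is stored with the data.
class ToyData:
    def __init__(self, x):
        self.x = x

def add_toy_gpse(data):
    # Pretend we computed a 4-dimensional GPSE encoding for this graph.
    data.pestat_GPSE = [0.0] * 4
    return data

class ToyDataset:
    def __init__(self, raw, pre_transform=None):
        # Applied once per example, mirroring how PyG processes datasets.
        self.items = [pre_transform(d) if pre_transform else d
                      for d in raw]

ds = ToyDataset([ToyData([1, 2]), ToyData([3])],
                pre_transform=add_toy_gpse)
print(all(hasattr(d, 'pestat_GPSE') for d in ds.items))  # True
```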
Both approaches append the generated encodings to the pestat_GPSE attribute of Data objects. To use the GPSE encodings for a downstream task, one may need to add these encodings to the x attribute of the Data objects. To do so, one can use the provided GPSENodeEncoder, which maps these encodings to a desired dimension before appending them to x.

Let's say we have a graph dataset with 64 original node features, and we have generated GPSE encodings of dimension 32, i.e. data.pestat_GPSE has dimension 32. Additionally, we want to use a GNN with an inner dimension of 128. To do so, we can map the 32-dimensional GPSE encodings to a higher dimension of 64, and then append them to the x attribute of the Data objects to obtain a 128-dimensional node feature representation. GPSENodeEncoder handles both this mapping and the concatenation to x, the outputs of which can be used as input to a GNN:

```python
encoder = GPSENodeEncoder(dim_emb=128, dim_pe_in=32, dim_pe_out=64,
                          expand_x=False)
gnn = GNN(...)

for batch in loader:
    x = encoder(batch.x, batch.pestat_GPSE)
    out = gnn(x, batch.edge_index)
```
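The dimension bookkeeping above can be sanity-checked with a small plain-Python sketch (the helper `gpse_concat_dim` is hypothetical, not part of PyG; it only encodes the contract described in the example):

```python
# Hypothetical helper illustrating the GPSENodeEncoder dimension
# contract: with expand_x=False the original node features are kept
# as-is, so dim_x + dim_pe_out must equal dim_emb; with expand_x=True,
# x is assumed to be projected to dim_emb - dim_pe_out first.
def gpse_concat_dim(dim_x: int, dim_pe_out: int, dim_emb: int,
                    expand_x: bool = False) -> int:
    dim_kept = dim_emb - dim_pe_out if expand_x else dim_x
    out = dim_kept + dim_pe_out
    if out != dim_emb:
        raise ValueError('dim_x + dim_pe_out must equal dim_emb')
    return out

# The example above: 64 original features + 64-dim mapped PSE = 128.
print(gpse_concat_dim(dim_x=64, dim_pe_out=64, dim_emb=128))  # 128
```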
- Parameters:
  - dim_in (int, optional) – Input dimension. (default: 20)
  - dim_out (int, optional) – Output dimension. (default: 51)
  - dim_inner (int, optional) – Width of the encoder layers. (default: 512)
  - layer_type (str, optional) – Type of graph convolutional layer for message-passing. (default: 'resgatedgcnconv')
  - layers_pre_mp (int, optional) – Number of MLP layers before message-passing. (default: 1)
  - layers_mp (int, optional) – Number of layers for message-passing. (default: 20)
  - layers_post_mp (int, optional) – Number of MLP layers after message-passing. (default: 2)
  - num_node_targets (int, optional) – Number of individual PSEs used as node-level targets in pretraining GPSE. (default: 51)
  - num_graph_targets (int, optional) – Number of graph-level targets used in pretraining GPSE. (default: 11)
  - stage_type (str, optional) – The type of staging to apply. Possible values are: 'skipsum', 'skipconcat'. Any other value will default to no skip connections. (default: 'skipsum')
  - has_bn (bool, optional) – Whether to apply batch normalization in the layer. (default: True)
  - final_l2norm (bool, optional) – Whether to apply L2 normalization to the outputs. (default: True)
  - has_l2norm (bool, optional) – Whether to apply L2 normalization after the layers. (default: True)
  - dropout (float, optional) – Dropout ratio at layer output. (default: 0.2)
  - has_act (bool, optional) – Whether to apply activation after the layer. (default: True)
  - final_act (bool, optional) – Whether to apply activation after the layer stack. (default: True)
  - act (str, optional) – Activation to apply to layer output if has_act is True. (default: 'relu')
  - virtual_node (bool, optional) – Whether a virtual node is added to graphs in GPSE computation. (default: True)
  - multi_head_dim_inner (int, optional) – Width of MLPs for PSE target prediction heads. (default: 32)
  - graph_pooling (str, optional) – Type of graph pooling applied before post_mp. Options are 'add', 'max', 'mean'. (default: 'add')
  - use_repr (bool, optional) – Whether to use the hidden representation of the final layer as GPSE encodings. (default: True)
  - repr_type (str, optional) – Type of representation to use. Options are 'no_post_mp', 'one_layer_before'. (default: 'no_post_mp')
  - bernoulli_threshold (float, optional) – Threshold for Bernoulli sampling of virtual nodes. (default: 0.5)
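How use_repr and repr_type interact can be pictured with a small stand-in (an illustration of the documented options only; `select_output` and its argument names are hypothetical, not the actual GPSE internals):

```python
# Hypothetical sketch of the documented output selection: with
# use_repr=False the raw PSE predictions from the heads are returned;
# otherwise repr_type picks which hidden representation serves as the
# GPSE encoding.
def select_output(head_pred, repr_before_post_mp, repr_one_layer_before,
                  use_repr: bool = True, repr_type: str = 'no_post_mp'):
    if not use_repr:
        return head_pred
    if repr_type == 'no_post_mp':
        return repr_before_post_mp  # representation before post_mp
    if repr_type == 'one_layer_before':
        return repr_one_layer_before  # one layer before the final output
    raise ValueError(f'unsupported repr_type: {repr_type}')

# With the defaults, the representation before post_mp is returned.
print(select_output('pred', 'pre_post_mp', 'penultimate'))  # 'pre_post_mp'
```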
- forward(batch)[source]
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.