torch_geometric.llm

`LargeGraphIndexer`	For a dataset that consists of multiple subgraphs that are assumed to be part of a much larger graph, collate the values into a large graph store to save resources.
`RAGQueryLoader`	Loader meant for making RAG queries from a remote backend.

class LargeGraphIndexer(nodes: Iterable[str], edges: Iterable[Tuple[str, str, str]], node_attr: Optional[Dict[str, List[Any]]] = None, edge_attr: Optional[Dict[str, List[Any]]] = None)[source]

For a dataset that consists of multiple subgraphs that are assumed to be part of a much larger graph, collate the values into a large graph store to save resources.

classmethod from_triplets(triplets: Iterable[Tuple[str, str, str]], pre_transform: Optional[Callable[[Tuple[str, str, str]], Tuple[str, str, str]]] = None) → LargeGraphIndexer[source]

Generate a new index from a series of triplets that represent edge relations between nodes. Each triplet is formatted as (source_node, edge, dest_node).

Parameters:

triplets (KnowledgeGraphLike) – Series of triplets representing knowledge graph relations. Example: [("cats", "eat", "dogs")]. Note: Please ensure triplets are unique.
pre_transform (Optional[Callable[[TripletLike], TripletLike]]) – Optional preprocessing function to apply to triplets. Defaults to None.

Returns:

Index of unique nodes and edges.

Return type:

LargeGraphIndexer
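The deduplication described above can be pictured with a small plain-Python sketch. This is not the library's implementation: the helper name `index_triplets` is hypothetical, and assigning indices in first-seen order is an assumption made for illustration.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple

Triplet = Tuple[str, str, str]


def index_triplets(
    triplets: Iterable[Triplet],
    pre_transform: Optional[Callable[[Triplet], Triplet]] = None,
) -> Tuple[Dict[str, int], Dict[Triplet, int]]:
    """Hypothetical helper: give each unique node and edge an integer index."""
    nodes: Dict[str, int] = {}
    edges: Dict[Triplet, int] = {}
    for triplet in triplets:
        if pre_transform is not None:
            triplet = pre_transform(triplet)
        src, rel, dst = triplet
        # setdefault keeps the index already assigned to a repeated key.
        nodes.setdefault(src, len(nodes))
        nodes.setdefault(dst, len(nodes))
        edges.setdefault((src, rel, dst), len(edges))
    return nodes, edges


triplets = [("cats", "eat", "mice"), ("Cats", "chase", "dogs")]
# Lower-casing as a pre_transform merges "cats" and "Cats" into one node.
nodes, edges = index_triplets(
    triplets, pre_transform=lambda t: tuple(s.lower() for s in t)
)
```

With the real class, the equivalent call would be `LargeGraphIndexer.from_triplets(triplets, pre_transform=...)`, using the signature documented above.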

classmethod collate(graphs: Iterable[LargeGraphIndexer]) → LargeGraphIndexer[source]

Combines a series of large graph indexes into a single large graph index.

Parameters:

graphs (Iterable[LargeGraphIndexer]) – Indices to be combined.

Returns:

Single unique index for all nodes and edges in the input indices.

Return type:

LargeGraphIndexer
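To illustrate why collating saves resources, the sketch below merges several subgraphs into one index so that shared nodes and edges are stored once. It is a plain-Python approximation under stated assumptions: each subgraph is represented as a list of triplets, and `merge_indices` is a hypothetical name, not part of the library.

```python
from typing import Dict, Iterable, List, Tuple

Triplet = Tuple[str, str, str]


def merge_indices(
    edge_lists: Iterable[List[Triplet]],
) -> Tuple[Dict[str, int], Dict[Triplet, int]]:
    """Hypothetical helper: build one index covering several subgraphs."""
    nodes: Dict[str, int] = {}
    edges: Dict[Triplet, int] = {}
    for edge_list in edge_lists:
        for src, rel, dst in edge_list:
            # Nodes and edges seen in an earlier subgraph are not added again.
            nodes.setdefault(src, len(nodes))
            nodes.setdefault(dst, len(nodes))
            edges.setdefault((src, rel, dst), len(edges))
    return nodes, edges


subgraph_a = [("a", "knows", "b")]
subgraph_b = [("b", "knows", "c"), ("a", "knows", "b")]
nodes, edges = merge_indices([subgraph_a, subgraph_b])
# The shared node "b" and the repeated edge are each stored only once.
```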

get_unique_node_features(feature_name: str = 'pid') → List[str][source]

Get all the unique values for a specific node attribute.

Parameters:

feature_name (str, optional) – Name of feature to get. Defaults to NODE_PID.

Returns:

List of unique values for the specified feature.

Return type:

List[str]

add_node_feature(new_feature_name: str, new_feature_vals: Union[Sequence[Any], Tensor], map_from_feature: str = 'pid') → None[source]

Adds a new feature that corresponds to each unique node in the graph.

Parameters:

new_feature_name (str) – Name to call the new feature.
new_feature_vals (FeatureValueType) – Values to map for that new feature.
map_from_feature (str, optional) – Key of feature to map from. Size must match the number of feature values. Defaults to NODE_PID.

Return type:

None
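One reading of `map_from_feature` is that the new values are aligned with the unique values of an existing feature, and each node then receives the value matching its own entry. The sketch below shows that reading in plain Python. The helper `map_feature` and the `kind` feature are hypothetical, and the library's actual behavior may differ in detail.

```python
from typing import Any, Dict, List, Sequence


def map_feature(
    node_attr: Dict[str, List[Any]],
    new_feature_name: str,
    new_feature_vals: Sequence[Any],
    map_from_feature: str = "pid",
) -> None:
    """Hypothetical helper: attach a new per-node feature keyed by an existing one."""
    # Unique values of the existing feature, in first-seen order.
    keys = list(dict.fromkeys(node_attr[map_from_feature]))
    if len(keys) != len(new_feature_vals):
        raise ValueError("expected one new value per unique key")
    lookup = dict(zip(keys, new_feature_vals))
    # Every node receives the value associated with its own key.
    node_attr[new_feature_name] = [lookup[k] for k in node_attr[map_from_feature]]


node_attr = {"pid": ["cats", "mice", "dogs"], "kind": ["pet", "pest", "pet"]}
# "kind" has two unique values, so exactly two new values are supplied.
map_feature(node_attr, "score", [1.0, 0.0], map_from_feature="kind")
```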

get_node_features(feature_name: str = 'pid', pids: Optional[Iterable[str]] = None) → List[Any][source]

Get node feature values for a given set of unique node ids. Returned values are not necessarily unique.

Parameters:

feature_name (str, optional) – Name of feature to fetch. Defaults to NODE_PID.
pids (Optional[Iterable[str]], optional) – Node ids to fetch for. Defaults to None, which fetches all nodes.

Returns:

Node features corresponding to the specified ids.

Return type:

List[Any]

get_node_features_iter(feature_name: str = 'pid', pids: Optional[Iterable[str]] = None, index_only: bool = False) → Iterator[Any][source]

Iterator version of get_node_features. If index_only is True, yields indices instead of values.

Return type:

Iterator[Any]
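The difference between fetching values and fetching indices can be sketched with a plain-Python generator. The helper `iter_node_features` and its `node_index`/`values` arguments are hypothetical stand-ins for the indexer's internal state, used here only to show the documented behavior of `pids` and `index_only`.

```python
from typing import Any, Dict, Iterable, Iterator, List, Optional


def iter_node_features(
    node_index: Dict[str, int],
    values: List[Any],
    pids: Optional[Iterable[str]] = None,
    index_only: bool = False,
) -> Iterator[Any]:
    """Hypothetical helper: lazily yield feature values (or positions) per node id."""
    if pids is None:
        pids = node_index.keys()  # no ids given: walk every node
    for pid in pids:
        idx = node_index[pid]
        yield idx if index_only else values[idx]


node_index = {"cats": 0, "mice": 1, "dogs": 2}
kinds = ["pet", "pest", "pet"]

# Repeated ids yield repeated values, so the result need not be unique.
picked = list(iter_node_features(node_index, kinds, pids=["dogs", "mice", "dogs"]))
# With index_only=True the positions are yielded instead of the values.
positions = list(
    iter_node_features(node_index, kinds, pids=["dogs", "cats"], index_only=True)
)
```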

get_unique_edge_features(feature_name: str = 'e_pid') → List[str][source]

Get all the unique values for a specific edge attribute.

Parameters:

feature_name (str, optional) – Name of feature to get. Defaults to EDGE_PID.

Returns:

List of unique values for the specified feature.

Return type:

List[str]

add_edge_feature(new_feature_name: str, new_feature_vals: Union[Sequence[Any], Tensor], map_from_feature: str = 'e_pid') → None[source]

Adds a new feature that corresponds to each unique edge in the graph.

Parameters:

new_feature_name (str) – Name to call the new feature.
new_feature_vals (FeatureValueType) – Values to map for that new feature.
map_from_feature (str, optional) – Key of feature to map from. Size must match the number of feature values. Defaults to EDGE_PID.

Return type:

None

get_edge_features(feature_name: str = 'e_pid', pids: Optional[Iterable[str]] = None) → List[Any][source]

Get edge feature values for a given set of unique edge ids. Returned values are not necessarily unique.

Parameters:

feature_name (str, optional) – Name of feature to fetch. Defaults to EDGE_PID.
pids (Optional[Iterable[str]], optional) – Edge ids to fetch for. Defaults to None, which fetches all edges.

Returns:

Edge features corresponding to the specified ids.

Return type:

List[Any]

get_edge_features_iter(feature_name: str = 'e_pid', pids: Optional[Iterable[Tuple[str, str, str]]] = None, index_only: bool = False) → Iterator[Any][source]

Iterator version of get_edge_features. If index_only is True, yields indices instead of values.

Return type:

Iterator[Any]

to_data(node_feature_name: str, edge_feature_name: Optional[str] = None) → Data[source]

Return a Data object containing all the specified node and edge features and the graph.

Parameters:

node_feature_name (str) – Feature to use for nodes.
edge_feature_name (Optional[str], optional) – Feature to use for edges. Defaults to None.

Returns:

Data object containing the specified node and edge features and the graph.

Return type:

Data
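PyG `Data` objects conventionally store connectivity as an `edge_index` of shape [2, num_edges]. As a rough illustration of how indexed triplets could be turned into that layout, the sketch below uses plain Python lists instead of tensors. The helper `build_edge_index` is hypothetical, and nothing here is taken from the library's own `to_data` code.

```python
from typing import Dict, List, Tuple

Triplet = Tuple[str, str, str]


def build_edge_index(nodes: Dict[str, int], edges: List[Triplet]) -> List[List[int]]:
    """Hypothetical helper: encode edges as [source_row, target_row] of node indices."""
    sources = [nodes[src] for src, _, _ in edges]
    targets = [nodes[dst] for _, _, dst in edges]
    return [sources, targets]


nodes = {"cats": 0, "mice": 1, "dogs": 2}
edges = [("cats", "eat", "mice"), ("dogs", "chase", "cats")]
edge_index = build_edge_index(nodes, edges)
# Column i of edge_index holds the (source, target) node indices of edge i.
```

With the real class, the documented call is `to_data(node_feature_name, edge_feature_name)`, where both arguments name features previously added to the index.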

class RAGQueryLoader(graph_data: Tuple[RAGFeatureStore, RAGGraphStore], subgraph_filter: Optional[Callable[[Data, Any], Data]] = None, augment_query: bool = False, vector_retriever: Optional[VectorRetriever] = None, config: Optional[Dict[str, Any]] = None)[source]

Loader meant for making RAG queries from a remote backend.

property config: Get the config for the RAGQueryLoader.

query(query: Any) → Data[source]

Retrieve a subgraph associated with the query with all its feature attributes.

Return type:

Data

Models

`SentenceTransformer`	A wrapper around a Sentence-Transformer from HuggingFace.
`VisionTransformer`	A wrapper around a Vision-Transformer from HuggingFace.
`LLM`	A wrapper around a Large Language Model (LLM) from HuggingFace.
`LLMJudge`	Uses NIMs to score a triple of (question, model_pred, correct_answer). This whole class is an adaptation of Gilberto's work for PyG.
`TXT2KG`	A class to convert text data into a Knowledge Graph (KG) format.
`GRetriever`	The G-Retriever model from the "G-Retriever: Retrieval-Augmented Generation for Textual Graph Understanding and Question Answering" paper.
`MoleculeGPT`	The MoleculeGPT model from the "MoleculeGPT: Instruction Following Large Language Models for Molecular Property Prediction" paper.
`GLEM`	This GNN+LM co-training model is based on GLEM from the "Learning on Large-scale Text-attributed Graphs via Variational Inference" paper.
`ProteinMPNN`	The ProteinMPNN model from the "Robust deep learning–based protein sequence design using ProteinMPNN" paper.
`GITMol`	The GITMol model from the "GIT-Mol: A Multi-modal Large Language Model for Molecular Science with Graph, Image, and Text" paper.

Utils

`KNNRAGFeatureStore`	A feature store that uses a KNN-based retrieval.
`NeighborSamplingRAGGraphStore`	Neighbor sampling based graph-store to store & retrieve graph data.
`DocumentRetriever`	Retrieve documents from a vector database.