torch_geometric.llm.models.LLM

class LLM(model_name: str, num_params: Optional[float] = None, n_gpus: Optional[int] = None, dtype: Optional[dtype] = torch.bfloat16, sys_prompt: Optional[str] = None)[source]

Bases: Module

A wrapper around a Large Language Model (LLM) from HuggingFace.

Parameters:
  • model_name (str) – The HuggingFace model name

  • num_params (float, optional) – The number of parameters the HuggingFace model has, in billions. This is used to automatically allocate the number of GPUs needed, given the available GPU memory, using a rough heuristic. If not specified, the number of parameters is determined using the huggingface_hub module.

  • n_gpus (int, optional) – The number of GPUs to use. Intended for advanced users who want to set the GPU count manually and override the automatic allocation mechanism.

  • dtype (torch.dtype, optional) – The data type to use for the LLM. (default: torch.bfloat16)

  • sys_prompt (str, optional) – A system prompt to use for the LLM. (default: None)
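As a sketch of how these constructor parameters fit together (the model name and prompt below are illustrative choices, not part of the API, and actually building the wrapper requires torch_geometric, a GPU, and network access to download the checkpoint):

```python
def build_llm():
    """Instantiate the LLM wrapper (sketch only; downloads HF weights)."""
    # Imports are deferred so this sketch can be defined without
    # torch/torch_geometric installed.
    import torch
    from torch_geometric.llm.models import LLM

    return LLM(
        model_name="TinyLlama/TinyLlama-1.1B-Chat-v1.0",  # any HF causal LM id
        num_params=1.1,        # in billions; skips the huggingface_hub lookup
        dtype=torch.bfloat16,  # the default
        sys_prompt="Answer concisely using the provided context.",
    )
```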

forward(question: List[str], answer: List[str], context: Optional[List[str]] = None, embedding: Optional[List[Tensor]] = None) Tensor[source]

The forward pass.

Parameters:
  • question (list[str]) – The questions/prompts.

  • answer (list[str]) – The answers/labels.

  • context (list[str], optional) – Additional context to give to the LLM, such as textified knowledge graphs. (default: None)

  • embedding (list[torch.Tensor], optional) – RAG embedding tensors, i.e. the embedded form of context. Either context or embedding should be used, not both. (default: None)

Return type:

Tensor
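Since forward() returns the language-modeling loss as a tensor, a training step follows the usual PyTorch pattern. A minimal sketch, assuming a GPU-backed LLM instance and a batch dict with "question"/"answer" (and optionally "context") keys:

```python
def training_step(llm, optimizer, batch):
    """One gradient step on the LLM wrapper (sketch only)."""
    loss = llm(
        question=batch["question"],    # list[str] prompts
        answer=batch["answer"],        # list[str] target labels
        context=batch.get("context"),  # optional list[str], e.g. textified KGs
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return float(loss)
```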

inference(question: List[str], context: Optional[List[str]] = None, embedding: Optional[List[Tensor]] = None, max_tokens: Optional[int] = 128) List[str][source]

The inference pass.

Parameters:
  • question (list[str]) – The questions/prompts.

  • context (list[str], optional) – Additional context to give to the LLM, such as textified knowledge graphs. (default: None)

  • embedding (list[torch.Tensor], optional) – RAG embedding tensors, i.e. the embedded form of context. Either context or embedding should be used, not both. (default: None)

  • max_tokens (int, optional) – How many tokens for the LLM to generate. (default: 128)

Return type:

List[str]
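Unlike forward(), inference() takes no answers and returns one generated string per question. A minimal sketch, assuming an already-constructed LLM instance (wrapping the call in torch.inference_mode() is a precaution here, not something the API documents as required):

```python
def ask(llm, questions, contexts=None, max_tokens=128):
    """Generate one answer string per question (sketch only)."""
    import torch  # deferred so the sketch is definable without torch installed

    with torch.inference_mode():  # no gradients needed for generation
        return llm.inference(
            question=questions,    # list[str]
            context=contexts,      # optional list[str]
            max_tokens=max_tokens,
        )
```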