torch_geometric.llm.models.LLMJudge

class LLMJudge(NVIDIA_NIM_MODEL: Optional[str] = 'nvidia/llama-3.1-nemotron-70b-instruct', NVIDIA_API_KEY: Optional[str] = '', ENDPOINT_URL: Optional[str] = 'https://integrate.api.nvidia.com/v1')[source]

Bases: object

Uses NVIDIA NIMs to score a (question, model_pred, correct_answer) triple. This class is an adaptation of Gilberto’s work for PyG.

Parameters:
  • NVIDIA_NIM_MODEL (Optional[str], default: 'nvidia/llama-3.1-nemotron-70b-instruct') – The name of the NVIDIA NIM model to use.

  • NVIDIA_API_KEY (Optional[str], default: '') – The API key for accessing NVIDIA’s NIM models.

  • ENDPOINT_URL (Optional[str], default: 'https://integrate.api.nvidia.com/v1') – The URL hosting your model, in case you are not using the public NIM endpoint.
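A minimal sketch of instantiating the judge against the default public NIM endpoint; reading the API key from the NVIDIA_API_KEY environment variable is an assumption for illustration, not a requirement of the class.

```python
import os

from torch_geometric.llm.models import LLMJudge

# Instantiate with the default model and public endpoint; the API key is
# pulled from an environment variable here purely for illustration.
judge = LLMJudge(
    NVIDIA_NIM_MODEL='nvidia/llama-3.1-nemotron-70b-instruct',
    NVIDIA_API_KEY=os.environ.get('NVIDIA_API_KEY', ''),
    ENDPOINT_URL='https://integrate.api.nvidia.com/v1',
)
```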

score(question: str, model_pred: str, correct_answer: str) → float[source]
Parameters:
  • question (str) – The original question asked to the model.

  • model_pred (str) – The prediction made by the model.

  • correct_answer (str) – The actual correct answer to the question.

Returns:

A score between 0 and 1, which may be NaN if the LLM judge fails.

Evaluations should skip NaNs when aggregating scores.

Return type:

float
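A short usage sketch for score(), assuming a judge instance as constructed above; the question, prediction, and answer strings are illustrative placeholders, and NaN results from judge failures are skipped before averaging.

```python
import math

# Score a single (question, prediction, answer) triple.
s = judge.score(
    question='What is the capital of France?',
    model_pred='Paris',
    correct_answer='Paris',
)

# Aggregate over an evaluation set, skipping NaNs caused by judge failures.
scores = [s]  # collect scores for the full eval set here
valid = [x for x in scores if not math.isnan(x)]
mean_score = sum(valid) / len(valid) if valid else float('nan')
```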