torch_geometric.llm.models.LLMJudge

class LLMJudge(NVIDIA_NIM_MODEL: Optional[str] = 'nvidia/llama-3.1-nemotron-70b-instruct', NVIDIA_API_KEY: Optional[str] = '', ENDPOINT_URL: Optional[str] = 'https://integrate.api.nvidia.com/v1')[source]

Bases: object

Uses NVIDIA NIMs to score a (question, model_pred, correct_answer) triple. This class is an adaptation of Gilberto’s work for PyG.

Parameters:
  • NVIDIA_NIM_MODEL (Optional[str], default: 'nvidia/llama-3.1-nemotron-70b-instruct') – The name of the NVIDIA NIM model to use.

  • NVIDIA_API_KEY (Optional[str], default: '') – The API key for accessing NVIDIA’s NIM models.

  • ENDPOINT_URL (Optional[str], default: 'https://integrate.api.nvidia.com/v1') – The URL hosting your model, in case you are not using the public NIM endpoint.
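A minimal sketch of instantiating the judge against the default public NIM endpoint; reading the API key from the NVIDIA_API_KEY environment variable is an assumption for illustration, not a requirement of the class.

```python
import os

from torch_geometric.llm.models import LLMJudge

# Instantiate with the default model and public endpoint; the API key is
# pulled from an environment variable here purely for illustration.
judge = LLMJudge(
    NVIDIA_NIM_MODEL='nvidia/llama-3.1-nemotron-70b-instruct',
    NVIDIA_API_KEY=os.environ.get('NVIDIA_API_KEY', ''),
    ENDPOINT_URL='https://integrate.api.nvidia.com/v1',
)
```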

score(question: str, model_pred: str, correct_answer: str) → float[source]
Parameters:
  • question (str) – The original question asked to the model.

  • model_pred (str) – The prediction made by the model.

  • correct_answer (str) – The actual correct answer to the question.

Returns:

A score between 0 and 1, which may be NaN if the LLM judge fails.

Evaluations should skip NaNs when aggregating scores.

Return type:

float
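A short usage sketch for score(), assuming a judge instance as constructed above; the question, prediction, and answer strings are illustrative placeholders, and NaN results from judge failures are skipped before averaging.

```python
import math

# Score a single (question, prediction, answer) triple.
s = judge.score(
    question='What is the capital of France?',
    model_pred='Paris',
    correct_answer='Paris',
)

# Aggregate over an evaluation set, skipping NaNs caused by judge failures.
scores = [s]  # collect scores for the full eval set here
valid = [x for x in scores if not math.isnan(x)]
mean_score = sum(valid) / len(valid) if valid else float('nan')
```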