torch_geometric.llm.models.TXT2KG

class TXT2KG(NVIDIA_NIM_MODEL: Optional[str] = 'nvidia/llama-3.1-nemotron-70b-instruct', NVIDIA_API_KEY: Optional[str] = '', ENDPOINT_URL: Optional[str] = 'https://integrate.api.nvidia.com/v1', local_LM: bool = False, chunk_size: int = 512)[source]

Bases: object

A class to convert text data into a Knowledge Graph (KG) format. Uses NVIDIA NIMs combined with prompt engineering by default. The default model, nvidia/llama-3.1-nemotron-70b-instruct, is on par with or better than GPT-4o on benchmarks. A high-quality model is needed to ensure a high-quality KG; otherwise it is garbage in, garbage out for the rest of the GNN+LLM RAG pipeline.

Use the local_LM flag for local debugging/development. You still need to be able to run inference with a 14B-parameter LLM, 'VAGOsolutions/SauerkrautLM-v2-14b-DPO'; smaller LLMs did not work at all in testing. Note that this 14B model requires a considerable amount of GPU memory. See examples/llm/txt2kg_rag.py for an example.
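Under the hood, TXT2KG prompts the LLM to emit relationship triples and then parses them out of the raw completion. The exact prompt and parsing logic are internal details of the library; the sketch below is an illustrative stand-in (the helper name parse_triples and the regex are assumptions, not the library's API) showing the kind of (head, relation, tail) extraction involved:

```python
import re
from typing import List, Tuple

def parse_triples(llm_output: str) -> List[Tuple[str, str, str]]:
    """Extract (head, relation, tail) triples from an LLM completion.

    Assumes the model was prompted to answer with lines such as
    ("Marie Curie", "won", "Nobel Prize"). Illustrative only; the
    actual TXT2KG parsing may differ.
    """
    pattern = re.compile(r'\("?([^",]+)"?,\s*"?([^",]+)"?,\s*"?([^",)]+)"?\)')
    triples = []
    for head, rel, tail in pattern.findall(llm_output):
        triples.append((head.strip(), rel.strip(), tail.strip()))
    return triples

completion = '''
("Marie Curie", "won", "Nobel Prize")
("Marie Curie", "born in", "Warsaw")
'''
print(parse_triples(completion))
# → [('Marie Curie', 'won', 'Nobel Prize'), ('Marie Curie', 'born in', 'Warsaw')]
```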

Parameters:
  • NVIDIA_NIM_MODEL (Optional[str], default: 'nvidia/llama-3.1-nemotron-70b-instruct') – The name of the NVIDIA NIM model to use.

  • NVIDIA_API_KEY (Optional[str], default: '') – The API key for accessing NVIDIA's NIM models.

  • ENDPOINT_URL (Optional[str], default: 'https://integrate.api.nvidia.com/v1') – The URL hosting your model, in case you are not using the public NIM.

  • local_LM (bool, default: False) – Whether to use a local Language Model (LM). This uses HuggingFace and will be slower than deploying your own private NIM endpoint; it is mainly recommended for dev/debug.

  • chunk_size (int, default: 512) – The size of the chunks in which the text data is processed.

save_kg(path: str) None[source]

Saves the relevant triples in the knowledge graph (KG) to a file.

Parameters:

path (str) – The file path where the KG will be saved.

Return type:

None

add_doc_2_KG(txt: str, QA_pair: Optional[Tuple[str, str]] = None) None[source]

Add a document to the Knowledge Graph (KG).

Parameters:
  • txt (str) – The text to extract triples from.

  • QA_pair (Optional[Tuple[str, str]], default: None) – A QA pair to associate with the extracted triples. Useful for downstream evaluation.

Return type:

None