torch_geometric.llm.models.TXT2KG
- class TXT2KG(NVIDIA_NIM_MODEL: Optional[str] = 'nvidia/llama-3.1-nemotron-70b-instruct', NVIDIA_API_KEY: Optional[str] = '', ENDPOINT_URL: Optional[str] = 'https://integrate.api.nvidia.com/v1', local_LM: bool = False, chunk_size: int = 512)
Bases: object
A class to convert text data into a Knowledge Graph (KG) format. Uses NVIDIA NIMs and prompt engineering by default. The default model, nvidia/llama-3.1-nemotron-70b-instruct, performs on par with or better than GPT-4o in benchmarks. A high-quality model is needed to ensure a high-quality KG; otherwise, it is garbage in, garbage out for the rest of the GNN+LLM RAG pipeline.
Use the local_LM flag for local debugging/development. You still need enough resources to run inference on a 14B-parameter LLM, ‘VAGOsolutions/SauerkrautLM-v2-14b-DPO’; smaller LLMs did not work at all in testing. Note that this 14B model requires a considerable amount of GPU memory. See examples/llm/txt2kg_rag.py for a full example, and the usage sketch after the parameter list below for a minimal one.
- Parameters:
  - NVIDIA_NIM_MODEL (Optional[str], default: 'nvidia/llama-3.1-nemotron-70b-instruct') – The name of the NVIDIA NIM model to use.
  - NVIDIA_API_KEY (Optional[str], default: '') – The API key for accessing NVIDIA’s NIM models.
  - ENDPOINT_URL (Optional[str], default: 'https://integrate.api.nvidia.com/v1') – The URL hosting your model, in case you are not using the public NIM.
  - local_LM (bool, default: False) – A flag indicating whether a local Language Model (LM) should be used. This uses HuggingFace and will be slower than deploying your own private NIM endpoint. This flag is mainly recommended for dev/debug.
  - chunk_size (int, default: 512) – The size of the chunks in which the text data is processed.
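A minimal usage sketch follows. It assumes the add_doc_2_KG and save_kg methods as used in examples/llm/txt2kg_rag.py; the example document and QA pair are illustrative placeholders, and exact method signatures may differ across torch_geometric versions, so verify against your installed release.

    # Minimal sketch, assuming the add_doc_2_KG/save_kg API from
    # examples/llm/txt2kg_rag.py; document text and QA pair are hypothetical.
    from torch_geometric.llm.models import TXT2KG

    # Remote NIM endpoint (recommended for KG quality); requires an API key.
    kg_maker = TXT2KG(
        NVIDIA_API_KEY="<your-nvidia-api-key>",  # placeholder, not a real key
        chunk_size=512,
    )

    # Alternative for local debugging/dev (needs GPU memory for a 14B model):
    # kg_maker = TXT2KG(local_LM=True)

    doc = (
        "Albert Einstein developed the theory of relativity. "
        "He was awarded the Nobel Prize in Physics in 1921."
    )

    # Extract triples from the document; optionally associate them with a
    # question-answer pair for the downstream RAG pipeline.
    kg_maker.add_doc_2_KG(
        txt=doc,
        QA_pair=("Who developed the theory of relativity?", "Albert Einstein"),
    )

    # Persist the accumulated triples to disk.
    kg_maker.save_kg("kg.pt")

Documents are chunked internally according to chunk_size before being sent to the LLM, so longer texts can be passed to add_doc_2_KG as-is.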