torch_geometric.datasets.ProteinMPNNDataset
- class ProteinMPNNDataset(root: str, size: str = 'small', split: str = 'train', datacut: str = '2030-01-01', rescut: float = 3.5, homo: float = 0.7, max_length: int = 10000, num_units: int = 150, transform: Optional[Callable] = None, pre_transform: Optional[Callable] = None, pre_filter: Optional[Callable] = None, force_reload: bool = False)
Bases: InMemoryDataset
The ProteinMPNN dataset from the “Robust deep learning based protein sequence design using ProteinMPNN” paper.
- Parameters:
  - root (str) – Root directory where the dataset should be saved.
  - size (str) – Size of the PDB information to train the model. If "small", loads the small dataset (229.4 MB). If "large", loads the large dataset (64.1 GB). (default: "small")
  - split (str, optional) – If "train", loads the training dataset. If "valid", loads the validation dataset. If "test", loads the test dataset. (default: "train")
  - datacut (str, optional) – Date cutoff to filter the dataset. (default: "2030-01-01")
  - rescut (float, optional) – PDB resolution cutoff. (default: 3.5)
  - homo (float, optional) – Homology cutoff. (default: 0.70)
  - max_length (int, optional) – Maximum length of the protein complex. (default: 10000)
  - num_units (int, optional) – Number of units of the protein complex. (default: 150)
  - transform (callable, optional) – A function/transform that takes in an torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access. (default: None)
  - pre_transform (callable, optional) – A function/transform that takes in an torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk. (default: None)
  - pre_filter (callable, optional) – A function that takes in an torch_geometric.data.Data object and returns a boolean value, indicating whether the data object should be included in the final dataset. (default: None)
  - force_reload (bool, optional) – Whether to re-process the dataset. (default: False)
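A minimal usage sketch, assuming PyTorch Geometric is installed and that the dataset is fetched into root on first use, as is usual for InMemoryDataset subclasses. The helper names proteinmpnn_kwargs and load_split and the directory "data/ProteinMPNN" are hypothetical and are not part of the library; the keyword values are simply the defaults from the signature above.

```python
from typing import Any, Dict


def proteinmpnn_kwargs(root: str, split: str = "train") -> Dict[str, Any]:
    """Collect the documented constructor arguments (hypothetical helper)."""
    # The documentation lists exactly these three split names.
    if split not in {"train", "valid", "test"}:
        raise ValueError(f"unknown split: {split!r}")
    return dict(
        root=root,             # directory where the dataset is saved
        size="small",          # "small" (229.4 MB) or "large" (64.1 GB)
        split=split,
        datacut="2030-01-01",  # date cutoff
        rescut=3.5,            # PDB resolution cutoff
        homo=0.7,              # homology cutoff
        max_length=10000,      # maximum length of the protein complex
        num_units=150,         # number of units of the protein complex
    )


def load_split(root: str, split: str = "train"):
    """Construct the dataset for one split (hypothetical helper)."""
    # Imported lazily so proteinmpnn_kwargs works without PyG installed.
    from torch_geometric.datasets import ProteinMPNNDataset

    return ProteinMPNNDataset(**proteinmpnn_kwargs(root, split))
```

Calling load_split("data/ProteinMPNN", "valid") would then build the validation split with the default cutoffs. Passing force_reload=True to the constructor re-processes the dataset, for example after changing one of the cutoffs.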