AI-accelerated hypothesis generation

Generative Toolkit for Scientific Discovery

Accelerate hypothesis generation in scientific discovery, with support for generative models across materials science and drug discovery. Train generative models, and create and run inference pipelines.

An open-source library to train and use state-of-the-art generative models for hypothesis generation.

Fostering an open community to ease the adoption of state-of-the-art generative models

The GT4SD library provides an environment both for generating new hypotheses (inference) and for fine-tuning generative models on custom data sets for specific domains (retraining). It is compatible with many popular deep learning frameworks and toolkits, including PyTorch, PyTorch Lightning, HuggingFace Transformers, GuacaMol, and Moses. GT4SD serves applications ranging from materials science to drug discovery.

The common framework makes generative models accessible to a broad community. AI/ML practitioners developing new generative models can deploy them with just a few lines of code. Scientists and students interested in using generative models in their research get a centralized environment in which to access and explore a variety of pretrained models. Across the different generative models, GT4SD provides consistent commands and interfaces for inference and retraining, with customizable parameters.

Automatic workflows allow retraining with a user’s own data covering molecular structures and properties, which makes it possible to develop problem-specific intelligence. Replacing manual processes and human bias in the discovery process has important effects on applications that rely on generative models, accelerating the acquisition of expert knowledge.
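To make the idea of one consistent interface for inference and retraining concrete, here is a minimal, self-contained sketch in plain Python. It does not use GT4SD and does not reproduce its API: the names `REGISTRY`, `register`, `ToyGenerator`, `generate`, and the two toy models are hypothetical, and "retraining" is reduced to narrowing a vocabulary to the symbols found in user data.

```python
import random
from typing import Dict, Iterable, Iterator, List

# Hypothetical registry mapping a model name to its class.
# Illustration only; this is not GT4SD's actual registry.
REGISTRY: Dict[str, type] = {}


def register(name: str):
    """Class decorator that stores a generator class under a name."""
    def decorator(cls: type) -> type:
        REGISTRY[name] = cls
        return cls
    return decorator


class ToyGenerator:
    """Shared interface: every model exposes sample(n) and retrain(data)."""

    alphabet = ""

    def __init__(self, length: int = 5, seed: int = 0) -> None:
        self.length = length              # customizable parameter
        self.rng = random.Random(seed)    # seeded for reproducibility
        self.vocabulary = list(self.alphabet)

    def sample(self, n: int) -> Iterator[str]:
        """Inference: yield n random strings over the current vocabulary."""
        for _ in range(n):
            yield "".join(
                self.rng.choice(self.vocabulary) for _ in range(self.length)
            )

    def retrain(self, data: Iterable[str]) -> None:
        """Toy retraining: keep only the symbols seen in the user's data."""
        symbols = sorted(set("".join(data)))
        if symbols:
            self.vocabulary = symbols


@register("toy-molecule")
class ToyMoleculeGenerator(ToyGenerator):
    alphabet = "CNOF"


@register("toy-material")
class ToyMaterialGenerator(ToyGenerator):
    alphabet = "ABX"


def generate(name: str, n: int, **params) -> List[str]:
    """Same call shape for every registered model."""
    return list(REGISTRY[name](**params).sample(n))
```

With this sketch, `generate("toy-molecule", 3, length=4)` and `generate("toy-material", 3, length=4)` differ only in the model name, which is the property the paragraph above attributes to a common framework.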

Accelerating material design with the generative toolkit for scientific discovery. Manica, M., Born, J., Cadow, J. et al. npj Comput Mater 9, 69 (2023).