Generative neural networks have been used since 2013 for tasks such as generating audio or image data. However, no publication has yet applied them to de novo ligand and/or protein design. This work introduces a generative neural network intended to fill this gap: the PG-VUGAN (progressively growing variational U-NET generative adversarial network).

The PG-VUGAN takes a rich molecular image (RMI) of either the ligand or the pocket as input and generates its complementary counterpart. This paper demonstrates the approach in practice for de novo ligand design. The RMI is a new image-based format for molecular structures, designed specifically for efficient processing by convolutional neural networks. Its suitability is demonstrated by developing a state-of-the-art binding-affinity regressor. In summary, this work takes a first step towards artificially generated ligands and proteins via generative neural networks.

Protein-ligand interactions control cellular processes and are therefore essential for all living beings. Hence, generating complementary ligands for a protein structure or, vice versa, predicting complementary protein structures for ligands is a desirable scientific goal. Possible use cases for de novo ligand and protein design can be found in all fields of biotechnology, ranging from drug discovery and individualized medicine to the creation of artificial enzymes.

Designing these molecules from scratch is challenging, and the technology for de novo design is still in its early stages. The reason is that existing tools rely on expert assumptions and on mathematical approximations that can only partly simulate the real physical nature of these molecules. Artificial neural networks promise to overcome these limitations.

Excerpt

Table of Contents

Introduction
Overview
Basics
- Biological background and terms
  - Proteins
  - The key lock principle
    - Drugs and receptors
    - Intermolecular interactions
  - Enzyme, theozyme and theosite
- Data formats for molecular structures
  - 1D – Arrays
    - Atom list / rich atom list
    - SMILES
    - Descriptors
    - Fingerprints
  - 2D-matrix
    - Adjacency matrix
      - Coulomb matrix
      - Contact map and coevolutionary analysis
    - Images of a visualization tool
  - 3D-Matrix
    - Voxel representations
      - Rich voxel
      - Wavelet
    - GRID maps (3D - pharmacophore)
- Drug and protein design
  - Drug design
    - Structure based drug design
      - Docking and virtual high throughput screening
      - Scoring functions
        - Assisted model building with energy refinement
      - Incremental construction docking tools and FlexX
      - Evolutionary algorithms and Autodock 4.2
      - Shape-based docking
    - Ligand based drug design
      - Library search
      - Quantitative-structure-activity relationships models
    - De novo drug design via molecular modeling
      - Incremental construction algorithms
        - LUDI
        - FlexNovo
      - Evolutionary algorithms
  - Protein design
    - Directed evolution
    - Rational design
    - De novo protein design
      - Rosetta Commons
        - Rosetta (ab initio) structure prediction
        - Rosetta Match
        - RosettaDesign
      - ScaffoldSelection
  - Deep learning
    - Recent architectural enhancements of deep models and new architectures
      - Deep residual learning
      - Inception modules & InceptResNet v2
      - Attention modules
        - Filter-generating network
        - Squeeze-and-Excitation block
        - Spatial transformer
        - Residual attention module
      - 3D convolutional neural networks
      - Multi-view networks
      - Graph convolutional networks
      - Tree-LSTM
        - LSTM cell
        - N-ary Tree-LSTM cell
    - Generative neural networks
      - Generative adversarial network
      - Autoencoders
        - Variational autoencoder
        - Adversarial autoencoder
      - VAEGAN
    - Recent proceedings in generative neural networks
      - Common issues of training GANs and how to deal with them
        - Mini batch discrimination
        - Feature matching
        - Historical averaging
        - Noisy labels
        - Semi-Supervised GAN
        - Least squares GAN
        - Wasserstein GAN
        - WGAN with gradient penalty
      - Recently as useful proven architectures
        - U-NET
        - Variational U-NET
        - Patch networks
        - Discovery GAN
        - BicycleGAN
        - StackGAN
        - Progressively growing GAN
    - Deep learning for drug discovery
      - De novo drug design via deep learning
        - SMILES variational autoencoder
        - Wavelet autoencoder
        - druGAN
      - Feature regressors for molecular properties
        - KDEEP: a 3D convolutional network
        - SchNET: a graph convolutional network
    - Dealing with dataset limitations
      - Data augmentation
      - Transfer learning
      - Multitask learning
  - De novo 3D ligand and protein design via deep learning
    - Overview: applied de novo design
    - Data preparation
      - Datasets
        - PDBbind
        - QM9
        - ZINC
        - CelebA
      - Test complexes: Ibuprofen, HIV-Integrase and 3-dehydroquinate dehydratase
      - Pose normalization
    - Explorative phase
      - RAL based approaches
        - SchNET variations
        - VAE and double VAE(GAN) for protein-complexes with SchNET
        - Ligand autoencoding with RAL based VAEs
        - Strategies to tackle the sparsity problem
        - Conclusion RAL based approach
      - Rich molecular image based approaches
        - Rich molecular image
        - VAEGANs for ligands represented as rich molecular image
        - PDBbind analysis and dataset reduction
        - RMI based VAEGAN on the filtered dataset
        - Conclusions for RMI based VAE and VAEGAN approaches
      - VUNET and the VUGAN for de novo design
        - VUNET
        - VUGAN
        - VUGAN trained on the RV format
        - How to use the VUNET and VUGAN for de novo protein design
        - Additional use-cases
      - Summary and conclusion of the explorative phase
    - Refinement phase
      - PG-VUGAN for de novo design
        - Loss functions, penalties, and output variations
      - Improving the rich molecular image format
        - Rich molecular image with atomic radius
        - Min-max scaling
        - RMI for complexes
        - Ligand vs. complex PCA based pose normalization
        - Comparing representations for complexes
      - Designing a binding affinity regressor
        - Convolutional architectures in comparison
        - Designing a binding affinity regressor
        - Multi-view networks
      - Compensating the limitations of the PDBbind dataset
        - Multi-task learning
        - Network-based transfer learning
        - Data augmentation
      - Abridgement of the engagements towards increased binding affinity regression performance
        - MV-DilSEption: a model with beneficial contributions
      - Rethinking the PG-VUGAN method
        - Architecture
        - Reducing the output channels of the rich molecular image
        - Transfer learning
        - Image resizing
        - Loss contributions
        - Growing procedure
        - Initiation criterion
        - Layer fade-in
        - Stabilizing the adversarial training
        - Least-squares GAN
        - Semi-supervised learning
        - Mini batch discrimination
        - Feature matching
        - Activity penalty for the discriminator's feature matching layer
        - Using a latent feature regressor
        - Training balancing
        - Data balancing
        - Loss normalization
        - Generator / discriminator training ratio balancing
        - The approach as pseudo code
        - Result
      - Summary and conclusion
    - List of abbreviations
    - List of figures
    - List of tables
    - Bibliography
    - Supplementary material

Excerpt out of 167 pages

Details

Title: Steps towards de Novo 3D Ligand and Protein Design via Deep Learning
College: University of Tübingen (Faculty of Science / Department of Bioinformatics)
Grade: 1.3
Author: Matthias Rieger (Author)
Publication Year: 2019
Pages: 167
Catalog Number: V926236
ISBN (eBook): 9783346294548
Language: English
Tags: Drug design, Protein design, Enzyme design, de novo drug design, generative adversarial networks, GAN, Progressively growing GAN, New datastructures for molecules, Protein database, U-NET, Rich molecular image, Rich smiles, Binding affinity prediction, Drug-Target interaction, KDEEP, Survey, StackGAN, Wasserstein GAN, Binding affinity regression, Multi-view networks
Product Safety: GRIN Publishing GmbH

Quote paper: Matthias Rieger (Author), 2019, Steps towards de Novo 3D Ligand and Protein Design via Deep Learning, Munich, GRIN Verlag, https://www.grin.com/document/926236
