Home

The Consortium of Molecular Design at BYU provides cutting-edge, interdisciplinary research opportunities for students to push the envelope in protein engineering and drug discovery.

Laboratories at BYU in Physics, Chemistry, Computer Science, Life Sciences, and Engineering collaborate closely to tackle these challenging topics from all angles.

We actively seek industrial collaboration and support for our efforts, and we are excited to explore mutually beneficial applications of state-of-the-art technologies that can revolutionize molecular design.


Selected Publications

By T. J. Hart, Chloe Engler Hart, Aaryn S. Frewing, Paul M. Urie, and Dennis Della Corte
Abstract:

Objectives: Evaluate the gland-level annotations in the PANDA Dataset and provide specific recommendations for the development of an improved prostate adenocarcinoma dataset. Provide insight into why currently developed artificial intelligence (AI) algorithms designed for automatic prostate adenocarcinoma detection have failed to be clinically applicable.

Methods: A neural network model was trained on 5009 Whole Slide Images (WSIs) from PANDA. One expert pathologist repeatedly performed gland-level annotations on 50 PANDA WSIs to create a test set and estimate an intra-pathologist variability value. Dataset labels, expert annotations, and model predictions were compared and analyzed.

Results: We found an intra-pathologist accuracy of 0.83 and a Prevalence-Adjusted Bias-Adjusted Kappa (kappa) value of 0.65. Comparing the model predictions with the dataset labels yielded an accuracy of 0.82 and a kappa of 0.64. The model predictions and dataset labels showed low concordance with the expert pathologist.

Conclusions: Simple AI models trained on PANDA achieve accuracies comparable to intra-pathologist accuracies. Due to variability within or between pathologists, these models are unlikely to find clinical application. A shift in dataset curation must take place. We urge the creation of a dataset with multiple annotations from a group of experts. This will enable AI models trained on this dataset to produce panel opinions that augment pathological decision making.
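
The accuracy and Prevalence-Adjusted Bias-Adjusted Kappa values reported in the Results above can be related with a short calculation. The sketch below is illustrative only. It assumes the commonly used definition PABAK = (k * p_o - 1) / (k - 1), where p_o is the observed agreement (accuracy) and k is the number of label categories; for k = 2 this reduces to 2 * p_o - 1. The abstract does not state k or how the metric was computed, so the function names, the two-category setup, and the example labels are assumptions and not the authors' code or data.

```python
def observed_agreement(labels_a, labels_b):
    """Fraction of items on which two annotation passes agree (accuracy)."""
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("label sequences must be non-empty and equally long")
    matches = sum(1 for a, b in zip(labels_a, labels_b) if a == b)
    return matches / len(labels_a)


def pabak(labels_a, labels_b, num_categories):
    """Prevalence-Adjusted Bias-Adjusted Kappa for k categories.

    Assumed definition: (k * p_o - 1) / (k - 1).
    """
    if num_categories < 2:
        raise ValueError("num_categories must be at least 2")
    p_o = observed_agreement(labels_a, labels_b)
    return (num_categories * p_o - 1) / (num_categories - 1)


if __name__ == "__main__":
    # Hypothetical gland-level labels from two annotation passes by the
    # same pathologist. These are made-up values, not data from the paper.
    first_pass = ["benign", "benign", "cancer", "cancer", "benign",
                  "cancer", "benign", "cancer", "benign", "benign"]
    second_pass = ["benign", "cancer", "cancer", "cancer", "benign",
                   "cancer", "benign", "benign", "benign", "benign"]

    p_o = observed_agreement(first_pass, second_pass)
    print(f"accuracy = {p_o:.2f}")
    print(f"PABAK    = {pabak(first_pass, second_pass, num_categories=2):.2f}")
```

Because PABAK under this definition depends only on the observed agreement and the number of categories, it is insensitive to how common each label is, which is one reason it is sometimes preferred over Cohen's kappa for imbalanced annotations.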

By Aaryn Frewing, Alexander B. Gibson, Richard Robertson, Paul M. Urie, and Dennis Della Corte
Abstract:

Context.—

Automated prostate cancer detection using machine learning technology has led to speculation that pathologists will soon be replaced by algorithms. This review covers the development of machine learning algorithms and their reported effectiveness specific to prostate cancer detection and Gleason grading.

Objective.—

To examine current algorithms regarding their accuracy and classification abilities. We provide a general explanation of the technology and how it is being used in clinical practice. The challenges to the application of machine learning algorithms in clinical practice are also discussed.

Data Sources.—

The literature for this review was identified and collected using a systematic search. Criteria were established prior to the sorting process to effectively direct the selection of studies. A 4-point system was implemented to rank the papers according to their relevancy. For papers accepted as relevant to our metrics, all cited and citing studies were also reviewed. Studies were then categorized based on whether they implemented binary or multi-class classification methods. Data were extracted from papers that contained accuracy, area under the curve (AUC), or κ values in the context of prostate cancer detection. The results were visually summarized to present accuracy trends between classification abilities.

Conclusions.—

It is more difficult to achieve high accuracy metrics for multiclassification tasks than for binary tasks. The clinical implementation of an algorithm that can assign a Gleason grade to clinical whole slide images (WSIs) remains elusive. Machine learning technology is currently not able to replace pathologists but can serve as an important safeguard against misdiagnosis.

By Connor J. Morris, Jacob A. Stern, Brenden Stark, Max Christopherson, and Dennis Della Corte
Abstract:

Molecular docking tools are regularly used to computationally identify new molecules in virtual screening for drug discovery. However, docking tools suffer from inaccurate scoring functions with widely varying performance on different proteins. To enable more accurate ranking of active over inactive ligands in virtual screening, we created a machine learning consensus docking tool, MILCDock, that uses predictions from five traditional molecular docking tools to predict the probability a ligand binds to a protein. MILCDock was trained and tested on data from both the DUD-E and LIT-PCBA docking datasets and shows improved performance over traditional molecular docking tools and other consensus docking methods on the DUD-E dataset. LIT-PCBA targets proved to be difficult for all methods tested. We also find that DUD-E data, although biased, can be effective in training machine learning tools if care is taken to avoid DUD-E’s biases during training.
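
The general idea of a machine learning consensus over several docking tools can be sketched in a few lines. The example below is not MILCDock: the abstract does not describe its model architecture, input features, or training procedure. As a stand-in, the sketch fits a plain logistic regression to synthetic scores from five hypothetical docking tools (more negative meaning a better predicted fit) and outputs a binding probability per ligand. All function names, the score distributions, and the hyperparameters are assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_synthetic_scores(n_ligands, n_tools=5, seed=0):
    """Made-up docking scores: actives tend to score lower than inactives."""
    rng = random.Random(seed)
    scores, labels = [], []
    for _ in range(n_ligands):
        is_active = 1 if rng.random() < 0.25 else 0
        center = -9.0 if is_active else -6.0
        scores.append([rng.gauss(center, 1.5) for _ in range(n_tools)])
        labels.append(is_active)
    return scores, labels


def standardize(rows):
    """Z-score each column so tools on different scales are comparable."""
    n_cols = len(rows[0])
    means = [sum(r[j] for r in rows) / len(rows) for j in range(n_cols)]
    stds = []
    for j in range(n_cols):
        var = sum((r[j] - means[j]) ** 2 for r in rows) / len(rows)
        stds.append(math.sqrt(var) or 1.0)  # avoid division by zero
    scaled = [[(r[j] - means[j]) / stds[j] for j in range(n_cols)]
              for r in rows]
    return scaled, means, stds


def train_consensus(scores, labels, epochs=300, learning_rate=0.1):
    """Fit logistic regression by batch gradient descent on log loss."""
    x, means, stds = standardize(scores)
    n, n_cols = len(x), len(x[0])
    weights, bias = [0.0] * n_cols, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_cols, 0.0
        for row, y in zip(x, labels):
            p = sigmoid(sum(w * v for w, v in zip(weights, row)) + bias)
            err = p - y
            for j in range(n_cols):
                grad_w[j] += err * row[j]
            grad_b += err
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return {"weights": weights, "bias": bias, "means": means, "stds": stds}


def predict_binding_probability(model, ligand_scores):
    """Combine one ligand's per-tool scores into a single probability."""
    z = model["bias"]
    for w, s, m, sd in zip(model["weights"], ligand_scores,
                           model["means"], model["stds"]):
        z += w * (s - m) / sd
    return sigmoid(z)


if __name__ == "__main__":
    train_scores, train_labels = make_synthetic_scores(200, seed=0)
    model = train_consensus(train_scores, train_labels)
    for name, ligand in [("strong scores", [-10.0] * 5),
                         ("weak scores", [-5.0] * 5)]:
        prob = predict_binding_probability(model, ligand)
        print(f"{name}: P(binds) = {prob:.3f}")
```

Standardizing each tool's scores before fitting is a design choice made here because real docking programs report scores on different scales; ranking ligands by the predicted probability then gives a single consensus ordering for virtual screening.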

Research Opportunities

Dennis Della Corte (Materials Physics)
  • ProSPr - Protein Structure Prediction

    A cross-divisional team of physicists, computer scientists, biologists, and chemists is implementing a novel protein structure prediction pipeline to solve one of the oldest challenges in computational biophysics: the protein folding problem.

    We will apply our pipeline in CASP14, a global, community-wide blind test taking place in 2020.

    The work entails:

    - training of convolutional neural networks

    - design of simulation algorithms

    - high-performance supercomputer usage

    - chemical and biological evaluation of results

  • Radical SAM Engineering

    Together with the Chemistry Department at BYU, we are developing algorithms that aid the systematic design of novel enzymes.

    These enzymes can be applied to a variety of use cases, such as fertilizer production, detergent production, or drug production.