The Consortium of Molecular Design at BYU provides cutting-edge interdisciplinary research opportunities for students to push the envelope in protein engineering and drug discovery.
Laboratories at BYU in Physics, Chemistry, Computer Science, Life Sciences, and Engineering collaborate closely to tackle these challenging topics from all angles.
We actively seek industrial collaboration and support for our efforts, and we are excited to explore mutually beneficial applications of state-of-the-art technologies to revolutionize molecular design.
News and Events
Selected Publications
CASP (Critical Assessment of Structure Prediction) is an organization aimed at advancing the state of the art in computing protein structure from sequence. In the spring of 2020, CASP launched a community project to compute the structures of the most structurally challenging proteins coded for in the SARS-CoV-2 genome. Forty-seven research groups submitted over 3000 three-dimensional models and 700 sets of accuracy estimates on ten proteins. The resulting models were released to the public. CASP community members also worked together to provide estimates of local and global accuracy and to identify structure-based domain boundaries for some proteins. Subsequently, two of these structures (ORF3a and ORF8) have been solved experimentally, allowing assessment of both model quality and the accuracy estimates. Models from the AlphaFold2 group were found to have good agreement with the experimental structures, with main chain GDT_TS accuracy scores ranging from 63 (a correct topology) to 87 (competitive with experiment).
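For readers unfamiliar with the GDT_TS score quoted above, the following is a minimal sketch of how such a score can be computed. It assumes the model has already been superimposed on the experimental structure and that per-residue main chain (C-alpha) distances in angstroms are available. The full CASP procedure searches for the best superposition at each cutoff, which this simplified version does not do. The function name `gdt_ts` and the example distances are illustrative only.

```python
from typing import Sequence


def gdt_ts(distances: Sequence[float],
           cutoffs: Sequence[float] = (1.0, 2.0, 4.0, 8.0)) -> float:
    """Simplified GDT_TS on a 0-100 scale.

    distances: per-residue distances (in angstroms) between corresponding
    atoms of the model and the reference, under one fixed superposition
    (an assumption of this sketch).
    """
    if not distances:
        raise ValueError("need at least one residue distance")
    n = len(distances)
    # Fraction of residues within each distance cutoff.
    fractions = [sum(d <= c for d in distances) / n for c in cutoffs]
    # GDT_TS is the mean of those fractions, expressed as a percentage.
    return 100.0 * sum(fractions) / len(cutoffs)


# Hypothetical distances for a four-residue toy example.
score = gdt_ts([0.5, 1.5, 3.0, 9.0])
print(score)
```

Higher values mean that more residues lie close to their experimental positions across all four cutoffs.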
The pharmaceutical industry is on the brink of entering the digital age, yet it still suffers from fundamental misconceptions and outdated IT systems that inhibit its progress. Four key criteria are identified that have enabled labs to reach the post-modern stage: insights generation through advanced analytics, automatic communication through machine-to-machine interfaces, removal of boundaries for an open lab, and novel means of ensuring trust through automatic submissions. Further progress in these four areas will enable the pharmaceutical laboratory to enter the digital age. Unfortunately, historical roadblocks in the form of an application-centric mindset have so far stifled progress. However, initiatives that supported other industries on their path into the digital age are introduced, and evidence for the benefits of the digital age is provided. These range from advanced analytics, data-centric architecture, metadata-supported communication, and knowledge-assisted submissions to digital maturity models. It is concluded that executives and lab staff within Pharma need to transition to a data-centric world view to reap all the benefits of the digital age for faster, better, and cheaper drug development.
The prediction of amino acid contacts from protein sequence is an important problem, as protein contacts are a vital step towards the prediction of folded protein structures. We propose that a powerful concept from deep learning, called ensembling, can increase the accuracy of protein contact predictions by combining the outputs of different neural network models. We show that ensembling the predictions made by different groups at the recent Critical Assessment of Protein Structure Prediction (CASP13) outperforms all individual groups. Further, we show that contacts derived from the distance predictions of three additional deep neural networks—AlphaFold, trRosetta, and ProSPr—can be substantially improved by ensembling all three networks. We also show that ensembling these recent deep neural networks with the best CASP13 group creates a superior contact prediction tool. Finally, we demonstrate that two ensembled networks can successfully differentiate between the folds of two highly homologous sequences. In order to build further on these findings, we propose the creation of a better protein contact benchmark set and additional open-source contact prediction methods.
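As an illustration of the ensembling idea described above, here is a minimal sketch that combines contact probability maps from several predictors by unweighted averaging. Averaging is an assumption made for this example; the abstract does not state which combination rule was used. The names `ensemble_contacts` and `top_contacts`, the minimum sequence separation, and the toy matrices are all hypothetical.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def ensemble_contacts(predictions: List[Matrix]) -> Matrix:
    """Average contact probabilities for each residue pair across models."""
    if not predictions:
        raise ValueError("need at least one prediction")
    n = len(predictions[0])
    for p in predictions:
        if len(p) != n or any(len(row) != n for row in p):
            raise ValueError("all predictions must be n x n matrices")
    k = len(predictions)
    return [[sum(p[i][j] for p in predictions) / k for j in range(n)]
            for i in range(n)]


def top_contacts(probs: Matrix, min_separation: int = 6,
                 top_n: int = 5) -> List[Tuple[int, int]]:
    """Return the top_n residue pairs (i < j) ranked by probability,
    keeping only pairs at least min_separation apart in sequence."""
    n = len(probs)
    pairs = [(probs[i][j], i, j)
             for i in range(n)
             for j in range(i + min_separation, n)]
    pairs.sort(reverse=True)
    return [(i, j) for _, i, j in pairs[:top_n]]


# Toy example: two hypothetical predictors for a three-residue chain.
model_a = [[0.0, 0.5, 0.25], [0.5, 0.0, 0.5], [0.25, 0.5, 0.0]]
model_b = [[0.0, 1.0, 0.75], [1.0, 0.0, 0.0], [0.75, 0.0, 0.0]]
combined = ensemble_contacts([model_a, model_b])
print(top_contacts(combined, min_separation=1, top_n=2))
```

For predictors that output distance distributions rather than contact probabilities, one common approach is to first sum the probability mass in the bins below a contact threshold (8 angstroms is a typical choice) and then average as above.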