The Consortium of Molecular Design at BYU provides cutting-edge interdisciplinary research opportunities for students to push the envelope in protein engineering and drug discovery.
Laboratories at BYU in Physics, Chemistry, Computer Science, Life Sciences, and Engineering collaborate closely to tackle these challenging topics from all angles.
We actively seek industrial collaboration and support for our efforts, and we are excited to explore mutually beneficial applications of state-of-the-art technologies to revolutionize molecular design.
News and Events
Selected Publications
Background
Response curves are widely used in biomedical literature to summarize time-dependent outcomes, yet raw data are not always available in published reports. Meta-analysts must frequently extract means and standard errors from figures and estimate outcome measures like the area under the curve (AUC) without access to participant-level data. No standardized method exists for calculating AUC or propagating error under these constraints.
Methods
We evaluate two methods for estimating AUC from figure-derived data: (1) a trapezoidal integration approach with extrema variance propagation, and (2) a Monte Carlo method that samples plausible response curves and integrates over their posterior distribution. We generated 3,920 synthetic datasets from seven functional response types commonly found in glycemic response and pharmacokinetic research, varying the number of timepoints (4–10) and participants (5–40). All response curves were normalized to a true AUC of 1.0.
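The two estimation strategies can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the independent-error propagation for the trapezoidal rule, and the choice to draw each timepoint from a normal distribution centered on its figure-derived mean are all assumptions made for illustration. In particular, the "extrema variance propagation" and the curve-sampling scheme used in the paper are not specified here and may differ.

```python
import random
import statistics


def trapezoid_auc(times, means):
    """Trapezoidal-rule AUC over (time, mean) points."""
    return sum(
        (t1 - t0) * (m0 + m1) / 2.0
        for t0, t1, m0, m1 in zip(times, times[1:], means, means[1:])
    )


def trapezoid_weights(times):
    """Weight that each timepoint carries in the trapezoidal sum."""
    n = len(times)
    weights = []
    for i in range(n):
        left = times[i] - times[i - 1] if i > 0 else 0.0
        right = times[i + 1] - times[i] if i < n - 1 else 0.0
        weights.append((left + right) / 2.0)
    return weights


def trapezoid_auc_se(times, ses):
    """Standard error of the AUC, assuming independent errors per timepoint
    (an illustrative assumption, not necessarily the paper's propagation)."""
    return sum((w * s) ** 2 for w, s in zip(trapezoid_weights(times), ses)) ** 0.5


def monte_carlo_auc(times, means, ses, n_draws=5000, seed=0):
    """Sample plausible curves and integrate each one.

    Assumption: each timepoint is drawn independently from Normal(mean, SE).
    Returns the mean and standard deviation of the sampled AUCs.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        curve = [rng.gauss(m, s) for m, s in zip(means, ses)]
        draws.append(trapezoid_auc(times, curve))
    return statistics.mean(draws), statistics.stdev(draws)


# Hypothetical figure-derived data: minutes, mean response, standard error.
times = [0, 30, 60, 120]
means = [0.0, 2.0, 1.0, 0.0]
ses = [0.1, 0.2, 0.2, 0.1]

auc = trapezoid_auc(times, means)
auc_se = trapezoid_auc_se(times, ses)
mc_mean, mc_sd = monte_carlo_auc(times, means, ses)
```

Because this sketch integrates each sampled curve with the same trapezoidal rule, its Monte Carlo mean agrees with the trapezoidal estimate on average. It therefore shows only the mechanics of sampling and integration, not the bias reduction reported in the study.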
Results
The standard trapezoidal method consistently underestimated the true AUC, especially in curves with skewed or long-tailed structures. The Monte Carlo method produced near-unbiased estimates with tighter alignment to the known AUC across all settings. Increasing the number of timepoints and participants improved performance for both methods, but the Monte Carlo approach remained robust even under sparse conditions.
Conclusion
This is the first large-scale benchmarking of AUC estimation accuracy from graphically extracted data. The Monte Carlo method outperforms standard approaches in both accuracy and uncertainty quantification. We recommend its adoption in meta-analytic contexts where only figure-derived data are available and advocate for improved data sharing practices in primary publications.
Background
The integration of artificial intelligence (AI) into healthcare is rapidly advancing, with profound implications for medical practice. However, a gap exists in formal AI education for pre-medical students. This study evaluates the effectiveness of the AI in Medicine Association (AIM), an extracurricular program designed to equip pre-medical students with foundational AI knowledge.
Methods
A quasi-experimental pretest-posttest control group design was employed, comparing knowledge acquisition between students participating in the AIM program (cohort group) and a control group of students not participating. The intervention spanned four weeks and included hands-on AI training, ethical considerations, data preprocessing, and model evaluation. Pretest and posttest assessments measured AI knowledge and pathology-related skills.
Results
Participants in the AIM program demonstrated significant improvements in both AI knowledge and pathology-related scores. The cohort group showed a large effect size across all measured domains, particularly in pathology, with Cohen’s d values ranging from 1.83 to 4.74. Statistical analysis confirmed robust, significant improvements in test scores (t-test and Mann-Whitney U test, p < 0.001). There was no significant correlation between previous AI experience or attitudes toward AI and overall score improvement.
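For readers unfamiliar with the effect-size measure reported above, the following is a minimal sketch of Cohen's d using a pooled standard deviation. The abstract does not state which variant of the statistic was used, so the pooled-SD form is an assumption, and the scores below are hypothetical values chosen only to make the arithmetic easy to follow.

```python
import statistics


def cohens_d(group_a, group_b):
    """Cohen's d with a pooled standard deviation (assumed variant)."""
    na, nb = len(group_a), len(group_b)
    va = statistics.variance(group_a)  # sample variance (n - 1 denominator)
    vb = statistics.variance(group_b)
    pooled_sd = (((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)) ** 0.5
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd


# Hypothetical posttest scores, not data from the study.
cohort_scores = [8, 9, 10, 11, 12]
control_scores = [5, 6, 7, 8, 9]

d = cohens_d(cohort_scores, control_scores)
```

By the common rule of thumb, values of d above 0.8 are considered large, which puts the reported range of 1.83 to 4.74 well beyond that threshold.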
Conclusions
The AIM program effectively improved pre-medical students’ understanding of AI and its application in medicine, particularly in pathology. This study highlights the potential of extracurricular programs to address the need for AI education in medical curricula, especially in the pre-medical phase, and suggests that such initiatives could serve as a model for other institutions seeking to integrate AI education into healthcare training.
TrIP2 is an advanced version of the transformer interatomic potential (TrIP) trained on the expanded ANI-2x data set, which includes more diverse molecular configurations containing sulfur, fluorine, and chlorine. It leverages the equivariant SE(3)-transformer architecture, incorporating physical biases and continuous atomic representations. TrIP was introduced as a highly promising transferable interatomic potential, which we show here to generalize to new atom types with no alterations to the underlying model design. Benchmarking on COMP6 energy and force calculations, structure minimization tasks, torsion drives, and applications to molecules with unexpected conformational energy minima demonstrates TrIP2’s high accuracy and transferability. Direct architectural comparisons demonstrate superior performance over ANI-2x, while holistic model evaluations (including training data and level-of-theory considerations) show performance comparable to state-of-the-art models like AIMNet2 and MACE-OFF23. Notably, TrIP2 achieves state-of-the-art force prediction performance on the COMP6 benchmarks and closely approaches DFT-optimized structures in torsion drives and geometry optimization tasks. Without requiring any architectural modifications, TrIP2 successfully capitalizes on additional training data to deliver enhanced generalizability and precision, establishing itself as a robust and scalable framework capable of accommodating future expansions or applications to new domains with minimal reengineering.
Research Opportunities
- Protein Engineering
We develop and apply AI methods to the design of proteins.
- Data Science in Nutrition
We develop data science tools to understand the link between dietary intakes and health outcomes.
- AI in Medicine
We train AI models for applications in the medical field, with particular emphasis on automatic prostate cancer diagnosis.