MaskMol: knowledge-guided molecular image pre-training framework for activity cliffs with pixel masking

