This file contains the MATLAB code and data to reproduce the QSAR models proposed in the following manuscript: F. Grisoni, V. Consonni, D. Ballabio (2019) Machine Learning Consensus to Predict the Binding to the Androgen Receptor within the CoMPARA project, Journal of Chemical Information and Modeling, 59, 1839-1848 [link].
The QSAR models were developed to identify binders to the Androgen Receptor within the Collaborative Modeling Project for Androgen Receptor Activity (CoMPARA), coordinated by the National Center for Computational Toxicology at the U.S. Environmental Protection Agency. The collaborative project involved 35 international research groups and aimed to prioritize the experimental testing of approximately 40k compounds by merging the predictions provided by each participant. The machine learning methods applied are (i) multivariate Bernoulli Naïve Bayes, (ii) Random Forest and (iii) N-Nearest Neighbor classification. The approach was developed on 1,687 molecules and further validated on a set of 3,882 external compounds, in compliance with OECD principles.
Conditions and warranty
The dataset is freeware and may be used if proper reference is given to the authors. Please refer to the following paper:
F. Grisoni, V. Consonni, D. Ballabio (2019) Machine Learning Consensus to Predict the Binding to the Androgen Receptor within the CoMPARA project, Journal of Chemical Information and Modeling, 59, 1839-1848 [link]
Download
Fill in the following form. Your personal data will be used only to notify you by email of new releases of the dataset and will not be shared with third parties. Once the form has been submitted, open the rar file and extract the files. See the readme.txt file for further details. If you experience any problems downloading the files, write to davide.ballabio@unimib.it.