PhD projects

Carlos Martin
PhD in Strategic Innovation for Sustainable and Smart Ecosystems (Università di Milano-Bicocca)
Period: November 2024 – October 2027
Title: Innovative solutions for the monitoring of ecosystems based on hyperspectral imaging and machine learning
The aim of this PhD project is to predict grassland management practices using chemometrics and hyperspectral images, which matters because of the ecological value of these ecosystems. Multi- and hyperspectral images from the European Space Agency (ESA) and from drones will be used to monitor grasslands. These data will be integrated with field data and other parameters such as air quality and meteorological data. All this information will then be used to develop robust chemometric models, using both unsupervised and supervised techniques, to predict whether grasslands have been mowed, used for livestock feed, or left unused, thus helping to determine the state of use of these ecosystems.

Emmanuel Cruz
PhD in Chemical Sciences (Università di Milano-Bicocca)
Period: November 2021 – October 2024
Title: Geochemical study of pyrite persistence in the sedimentary records
This research aims to identify the chemical factors that allow unstable minerals such as pyrite to be found in sediments from Taiwan, in order to better understand the mechanisms that cause their alteration or preservation in the geological record. Several analytical methods will be used for the mineralogical characterization of the samples; Raman spectroscopy will be the main one, used to generate hyperspectral images for evaluating the degree of weathering of the samples. To validate this hyperspectral method, an HPLC-MS method for determining weathering compounds on the surface of the mineral samples will be developed. In addition, laboratory-scale weathering simulation experiments, supported by chemometrics, will be performed to understand which factors are the most significant. Finally, metals and environmental organic markers in the water from which the solid samples were collected will be determined by ICP-OES, ion chromatography, and TOC analysis, among other techniques. The large amount of data generated will be processed using data fusion approaches and several multivariate methods.

Cecile Valsecchi
PhD in Chemical Sciences (Università di Milano-Bicocca)
Period: November 2018 – October 2021
Title: Advancing the prediction of Nuclear Receptor modulators through machine learning methods
Nuclear receptors are transcription factors involved in processes critical to human health and are a relevant target for toxicological risk assessment and the drug discovery process. Computational models can be a useful tool (i) to prioritize chemicals that can mimic natural hormones and thus act as endocrine disruptors and (ii) to identify possible new leads for drug discovery. Therefore, the main goal of this project is to study potential interactions between chemicals and nuclear receptors, with the dual purpose of developing in silico tools to search for new modulators and identifying possible endocrine-disrupting chemicals. After creating an exhaustive collection of nuclear receptor modulators, we applied machine learning methods to fill the data gap and prioritize modulators by building predictive models. In particular, modeling strategies included multi-task machine learning algorithms to investigate the complex relationships between chemicals and multiple nuclear receptors.
The full PhD thesis can be downloaded here.

Giacomo Baccolo
PhD in Chemical Sciences (Università di Milano-Bicocca – University of Copenhagen)
Period: November 2018 – January 2022
Title: Chemometrics approaches for the automatic analysis of metabolomics GC-MS data
This thesis presents AutoDise, a new approach for extracting meaningful chemical signals from GC-MS data. AutoDise is an expert system based on PARAFAC2 modelling, statistical diagnostics and Artificial Intelligence, which handles all the modelling aspects and generates a peak table where each compound is uniquely identified. Another important part of the thesis was devoted to testing and developing new artificial neural networks, to be implemented in the AutoDise software, for detecting which PARAFAC2 components provide chemically useful information.
The full PhD thesis can be downloaded here.

Francesca Grisoni
PhD in Environmental Sciences (Università di Milano-Bicocca)
Period: January 2013 – January 2016
Title: In silico assessment of aquatic bioaccumulation: advances from chemometrics and QSAR modelling
The doctoral dissertation addressed some of the current open problems in the prediction of aquatic bioaccumulation of organic chemicals by means of Quantitative Structure-Activity Relationship (QSAR) modelling and chemometrics. It aimed to advance the mechanistic knowledge of bioaccumulation processes and to overcome some of the existing modelling gaps. Bioconcentration and dietary bioaccumulation were addressed separately, using fish as the target organism. Salient features of the developed QSAR approaches are simplicity and interpretability, which allow for widespread and transparent application, especially for regulatory purposes. Moreover, this work offers a theoretical basis for the hazard assessment of emerging contaminants.
The full PhD thesis can be downloaded here.

Matteo Cassotti
PhD in Chemical Sciences (Università di Milano-Bicocca)
Period: January 2012 – December 2014
Title: QSAR study of aquatic toxicity by chemometrics methods in the framework of REACH regulation
This study focuses on the toxicity of chemicals to aquatic species. Chemical substances released into the environment can eventually partition into water, exerting adverse effects on aquatic systems. The REACH regulation requires information on aquatic toxicity for all substances subject to registration. In this study, chemometric methods will be applied to develop QSAR models that can predict the toxicity to different aquatic species of substances for which this information is lacking. Several different classes of molecular descriptors (e.g. topological descriptors, fingerprints, structural keys, etc.), variable selection methods (Genetic Algorithms, Particle Swarm Optimization, LASSO, etc.) and modelling techniques (such as OLS and PLS regression, k-NN, SVM, etc.) will be used to establish structure-activity relationships.
The full PhD thesis can be downloaded here.

Faizan Sahigara
PhD in Environmental Science (Università di Milano-Bicocca)
Period: September 2010 – September 2013
Title: Tools for prediction of environmental properties of chemicals by QSAR / QSPR within REACH
Faizan’s major responsibilities within the Marie Curie Initial Training Network – Environmental Chemoinformatics:
a) Development of new strategies to evaluate both the Congenericity principle and the Applicability Domain of QSAR models.
b) Assessment, by means of the Applicability Domain, of whether the proposed QSAR models are suitable for reliable predictions of the modeled property for a new chemical.
The full PhD thesis can be downloaded here.

Kamel Mansouri
PhD in Environmental Science (Università di Milano-Bicocca)
Period: September 2010 – September 2013
Title: New molecular descriptors for estimating degradation and environmental fate of organic pollutants by QSAR/QSPR models within REACH
The degradation and environmental fate of organic pollutants have been investigated experimentally over the last decades using various methods of trace analysis. The aim of this project is to use data-mining tools to gather quality data and build highly reliable QSAR models for optimal estimation of environmental endpoints for REACH. New molecular descriptors and feature selection techniques will be tested, with special attention to model validation and applicability domain definition.
The full PhD thesis can be downloaded here.

Andrea Mauri
PhD in Chemical Science (Università di Milano-Bicocca)
Period: January 2004 – January 2007
Title: Protein and peptide multivariate characterisation using a molecular descriptor based approach
The full PhD thesis can be downloaded here.

Davide Ballabio
PhD in Food Biotechnology (Università di Milano)
Period: January 2003 – January 2006
Title: Chemometric characterisation of physical-chemical fingerprints of food products
This PhD project focused on the analysis of complex data derived from analytical techniques applied to food products, by means of multivariate statistical approaches. During the PhD thesis, classical and new chemometric methods were applied to several multivariate datasets.
The full PhD thesis can be downloaded here.

Manuela Pavan
PhD in Environmental Science (Università di Milano-Bicocca)
Title: Total and partial ranking methods in chemical sciences
The full PhD thesis can be downloaded here.

Viviana Consonni
PhD in Chemical Sciences (Università di Siena)
Title: Chemometric Methods for Environmental Data Analysis