Desjardins General Insurance, Modelling and Research Department, Levis, Quebec, Canada
2022 - present
Ajusto is a telematics program that collects data on driving habits and behaviours using smartphone sensors. The data is used to personalize car insurance premiums. I develop new aggregated variables from raw telematics data. I build machine learning models that predict claims from ratemaking predictions and aggregated telematics features. The resulting gain in predictive performance and client experience is then estimated. We evaluate the pros and cons and make recommendations to the ratemaking team. I also support other teams working on Ajusto by providing reports based on data analysis.
Post-Doctorate, Department of Decision Sciences, HEC and UQAM, Montreal, Quebec, Canada
2021 - 2022
Methylation is a process that modifies DNA CpG sites by adding a methyl group. This phenomenon is necessary for the body to function. Methylation is measured at all sites, but the measurements are subject to missing values. The aim is to impute the methylation level at the missing sites; this is a high-dimensional imputation problem with covariates. We propose a method for predicting missing methylation levels from observed levels and covariates. The method captures the correlation structure of methylation levels between sites and between samples. The regression function linking methylation level to covariates is modeled as a linear combination of observed and latent factors (LMC). We assume that the effects of the factors are Gaussian processes. Predictions for missing data are obtained from equations conditioned on the observed data.
Post-Doctorate, BioSP, INRAE Avignon
2019 - 2021
This post-doctorate is part of the ANR SMITID project. The aim of the project is to develop statistical methods for inferring infectious disease transmission from high-throughput sequencing data. The aim of the post-doc is to develop statistical methods for detecting the impact of environmental factors once the transmission tree has been inferred.
The aim is to predict the number of deaths per country over the medium to long term, using a mixture model based on other countries at a more advanced stage of the epidemic. The methodology developed is summarized on the BioSP blog and described in detail in Soubeyrand S, et al. (2020). As part of this collaboration, I wrote R code to improve the interactive plotly graphs in the Shiny application dedicated to this research.
The aim is to predict when intensive care unit (ICU) beds will be fully occupied in the Vaucluse department. The method is based on the temporal evolution of ICU bed occupancy in other French departments. A summary of the results is available on the BioSP blog dedicated to the Covid-19 pandemic.
PhD, ICJ, Lyon1, École Centrale de Lyon, defended in October 2018
This thesis is part of the ANR PEPITO project for the transport industry. The project is a collaboration with industrial partners (Valéo, Intes, InModelia) and other academics. The aim is to design efficient turbomachinery. The numerical code used by Valéo to simulate turbomachinery operation is too computationally expensive to be used directly to address the problem.
The algorithms developed use only the available data to construct kernels that are isotropic by group. The methods are based on combinatorics and clustering. The corresponding published article is available here.
Robustness is taken into account through two mean/variance criteria based on a Taylor expansion. The metamodel used is co-kriging with derivatives. The seven strategies developed follow a classical sequential scheme for enriching the design of experiments. The choice of enrichment points is based on expected improvement criteria, clustering methods and a multi-objective genetic optimization algorithm (NSGA-II). The published article is available here and the proceedings article is available here.
Freelance assignment, Fondasol
2019
Preliminary study of the prediction quality of a kriging model applied to foundation measurements.
Collaboration with Pallandre, J-P. (Museum national d'histoire naturelle)
2018
Study of the links between the shape of the auricular surface of the sacroiliac joint in felines and the selection of their prey, the type of bites inflicted and their body mass. Creation of an R-Shiny application. The published articles are available here and here.
Second-year master's internship, ICJ, Lyon1, École Centrale de Lyon
2015
This internship took place in the same context as the thesis. Several preliminary studies were carried out: a comparison of metamodels (kriging, linear regression and generalized additive model), dimension reduction using sensitivity analysis, and the co-kriging method.
First-year master's internship, United Water, Paramus, New Jersey
2014
Analysis of a city's water consumption data to detect fraud and leaks. The data is transmitted automatically by water meters every minute (Big Data) and needs to be updated, validated and analyzed. The methods developed are based on statistical tests, linear regression and analysis of variance.