Graph-based learning from integrated multi-omics and multi-species data (data science/bioinformatics) IFP Energies nouvelles/CentraleSupélec

31/12/2018 – 01/01/2019 all-day

Announcement related to the Action/Network: none

Laboratory/Company: IFP Energies nouvelles/CentraleSupélec
Duration: 3 years
Contact:
Publication deadline: 2018-12-31

Context:
Micro-organisms are studied here for their application to bio-based chemistry from renewable sources. Such organisms are driven by the expression of their genome, through very diverse mechanisms that act at various biological scales and are sensitive to external conditions (nutrients, environment). The emergence of novel high-throughput experimental technologies provides complementary omics data and, therefore, a better capability for understanding the biological systems under study. Innovative analysis methods are required for such highly integrated data. Handling them increasingly requires advanced bioinformatics, data science and optimization tools to provide insights into the multi-level regulation mechanisms (Editorial: Multi-omic data integration).

Subject:
The main objective of this subject is to offer an improved understanding of the different regulation levels in the cell (from model organisms to Trichoderma reesei strains). The underlying prediction task requires the normalization and integration of heterogeneous biological data (genomic, transcriptomic and epigenetic) from different microorganisms. The chosen approach relies on graph modelling and network optimization techniques, which allow data of different natures to be combined and biological a priori knowledge to be incorporated (in line with the BRANE Cut and BRANE Clust algorithms). To achieve this aim, learning models relating genomic and transcriptomic data to epigenomic traits could be combined with network inference, source separation and clustering techniques. The methodology would draw on the wealth of techniques developed on graphs for scattered data and social networks. Attention will also be paid to novel evaluation metrics, as their standardization remains a crucial issue in bioinformatics. A preliminary internship (summer/fall 2018) is suggested before starting the PhD program. Information at:

A. Pirayre, C. Couprie, L. Duval, F. Bidard, J.-C. Pesquet, BRANE Cut: biologically-related a priori network enhancement with graph cuts for gene regulatory network inference, 2015, BMC Bioinformatics
A. Pirayre, C. Couprie, L. Duval, J.-C. Pesquet, BRANE Clust: Cluster-Assisted Gene Regulatory Network Inference Refinement, 2018, IEEE/ACM Transactions on Computational Biology and Bioinformatics
D. Seux, F. D. Malliaros, A. Papadopoulos, M. Vazirgiannis, Core Decomposition of Uncertain Graphs Using Representative Instances, 2017, International Conference on Complex Networks and Their Applications

Candidate profile:
Engineering school, Master of Science in data science/bioinformatics or related disciplines

Required education and skills:
Bioinformatics, Data Science, Optimization, Statistics, Applied Mathematics, Graph data processing, Gene network inference, Transcriptomics

Work address:
1 avenue de Bois-Préau, F-92852 Rueil-Malmaison, France

Attached document: IFPEN-Centrale-Supelec-PhD-graph-learning-omics-bioinformatics-data-science.pdf