Machine learning and graph-based techniques to predict long-term bacterial community structure

30/06/2023 all-day

Offer related to the Action/Network: – — –/– — –

Laboratory/Company: Lorraine Research Laboratory in Computer Science a
Duration: 36 months
Contact:
Publication deadline: 2023-06-30

Context:
We propose a fully funded three-year PhD position in computer science with application to biomolecule analysis. The position is funded by Lorraine Université d’Excellence (LUE) through a multidisciplinary project involving two researchers in computer science and four researchers in microbiology. A PhD thesis in microbiology will be conducted in parallel.

Context and motivations

Bacteriocins are antimicrobial peptides of bacterial origin with very high economic potential in the agri-food sector. They are used in biopreservation and biocontrol applications to combat undesirable microorganisms in agronomy and the food industry. LIBio has recently developed a technology based on the selection of two strains of the lactic acid bacterium Carnobacterium maltaromaticum that produce bacteriocins active against Listeria monocytogenes [9]. When added to the manufacturing milk, these strains produce the antimicrobial agents in the cheese matrix and inhibit the growth of this pathogen in cheeses. These remarkable properties have led to a patent [10] that has very recently been licensed to a ferment producer. However, as with the vast majority of biopreservation technologies, the effect is at best bacteriostatic: there is little or no decay of the pathogenic bacteria, which can therefore persist at low concentrations in the food. The biopreservation technologies described in the literature are based on engineering approaches that do not take advantage of the properties of the microbial communities forming the microbiomes of food products. Yet microbiome engineering is among the 12 promising technologies that could transform food systems over the next decade [11]. In the case of biopreservation, assemblies of microorganisms could yield communities that produce multiple antimicrobial agents and that also occupy the ecological niche of the undesirable microorganism, excluding it more efficiently. However, knowledge in the field of microbial community engineering is insufficient to fully exploit this potential. Because of the complexity of microbial communities, no method is available to predict microbial community structure from the ecological properties of microorganisms. Moreover, assembling microorganisms that produce antimicrobial agents is a major difficulty, because these agents can lead to the mutual exclusion of the microorganisms producing them.


In a microbial ecosystem in which members produce antimicrobial substances such as bacteriocins, three actors can be considered: the bacteriocin-producing microorganism (P), and the microorganisms sensitive (S) and resistant (R) to this bacteriocin. It was shown experimentally that in simple ecosystems mixing these three actors, all three can be maintained at equilibrium [12]. In these systems, S is more competitive than R because it does not pay the cost of resistance, and R is more competitive than P because it does not pay the cost of bacteriocin production, while P inhibits S through its bacteriocin. This cyclical relationship between P, S, and R is similar to the popular game "rock paper scissors", in which no player has an advantage over the other two: each player can defeat one player and be defeated by another. These simple experimental systems suggest that it is possible to build engineering tools that predict the structure of complex communities from the interaction properties of microorganisms. Thanks to the emergence of high-throughput investigation methods, it is now possible to produce interaction data between large sets of microorganisms and thus reconstruct models of microorganism interaction networks [6] [7]. More recently, Ramia et al. [4] [5] built the interaction network of 73 Carnobacterium maltaromaticum strains. As in earlier work, the graph is sender-determined and also shows a highly nested structure [7], which means that it differs from a randomly built network with the same number of nodes and edges. The results also show that the competitive interaction network is very dense, making C. maltaromaticum a very interesting model for developing community engineering approaches that produce high-performance cocktails of antimicrobial substances to fight undesirable microorganisms. This project will use the data published in Ramia et al. [5] and will aim to bring a computer science perspective to the study of the properties of these interaction graphs.
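
As a purely illustrative aside, the cyclic P/S/R relationship described above can be written as a small directed graph. The sketch below is a toy example, not data or code from the project: the edge set, the function names `is_balanced` and `density`, and the convention that an edge (a, b) means "a outcompetes b" are all assumptions made here for illustration.

```python
# Toy sketch: a hand-written dominance graph, not data from Ramia et al. [5].
# Convention (an assumption of this example): an edge (a, b) means "a outcompetes b".
DOMINANCE = {
    ("P", "S"),  # S is sensitive to the bacteriocin produced by P
    ("S", "R"),  # S does not pay the cost of resistance
    ("R", "P"),  # R does not pay the cost of bacteriocin production
}


def nodes_of(edges):
    """Return the set of strains that appear in at least one edge."""
    return {node for edge in edges for node in edge}


def is_balanced(edges):
    """True if every strain defeats exactly one strain and is defeated by exactly one."""
    for node in nodes_of(edges):
        wins = sum(1 for winner, _ in edges if winner == node)
        losses = sum(1 for _, loser in edges if loser == node)
        if wins != 1 or losses != 1:
            return False
    return True


def density(edges):
    """Fraction of possible directed edges (self-loops excluded) that are present."""
    n = len(nodes_of(edges))
    if n < 2:
        return 0.0
    return len(edges) / (n * (n - 1))


print(is_balanced(DOMINANCE))  # True: no strain has an advantage over both others
print(density(DOMINANCE))      # 0.5: 3 of the 6 possible directed edges
```

The same two functions could in principle be applied to a larger interaction network, where density is one of the static graph properties mentioned above.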

The originality of this project is that it will make it possible to integrate experimental variables describing the interaction properties of microorganisms into the prediction of community structure, which is not possible with existing methods.

Subject:
Objectives of the thesis

The main goal of the thesis is to use advanced machine learning and graph-based approaches to predict the long-term community structure in microbiological ecosystems [3] [1]. In particular, it aims to provide approaches that deduce diversity directly from the static, inner properties of the interaction graph in which the entities are involved. The practical objectives of this interdisciplinary PhD project, which will be carried out in collaboration with researchers from the Laboratoire d’Ingénierie des Biomolécules (LIBio), are as follows:

to study existing research works on the analysis of interaction networks and long-term diversity prediction in bacteria.
to propose machine learning and graph-based approaches in order to learn models that are able to predict diversity based on the interaction graphs. In this context, regression methods could be used to learn the relation between the interaction graph properties and the diversity.
to study how graph embedding could help in predicting the level of development for each strain. In this context, we aim to study the impact of graph embedding methods on the prediction results. A specific embedding method could be proposed in the context of this project.
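
To give a rough idea of the regression objective, the sketch below pairs one graph property with one diversity measure and fits a straight line between them. It is a minimal illustration under stated assumptions, not the method of the project: the choice of graph density as the feature, of the Shannon index as the diversity measure, and of one-variable ordinary least squares as the model are all assumptions of this example, as are the function names.

```python
import math


def shannon_diversity(abundances):
    """Shannon index H = -sum(p_i * ln(p_i)) over strains with non-zero abundance."""
    total = sum(abundances)
    proportions = [a / total for a in abundances if a > 0]
    return -sum(p * math.log(p) for p in proportions)


def graph_density(edges, n_nodes):
    """Fraction of possible directed edges (self-loops excluded) that are present."""
    return len(edges) / (n_nodes * (n_nodes - 1))


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply a fitted (slope, intercept) pair to a new feature value."""
    slope, intercept = model
    return slope * x + intercept
```

In such a setup, each community would contribute one pair (density of its interaction graph, Shannon index of its long-term abundances), and `fit_line` would be trained on those pairs. A real study would likely use several graph features, or learned graph embeddings, and a richer regression model.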

Candidate profile:
Required qualification: Candidates must have a master's degree in computer science. Good programming skills in a procedural language are essential. Experience in machine learning and graph mining is desirable but not essential. A strong interest in bioinformatics would also be highly desirable.

Work address:
Lorraine Research Laboratory in Computer Science and its Applications (LORIA), Nancy, France

Attached document: 202305240952_PROJET-DE-THESE-LORIA-LUE.pdf