MaDICS

Eighth edition of the MaDICS Symposium (registration is open!)

This annual event brings the MaDICS community together to highlight recent advances in data science through a rich scientific programme of invited talks (keynotes), thematic workshops, round tables and poster sessions.
These highlights foster scientific exchanges that are both stimulating and friendly.

A Poster Session will be specifically dedicated to early-career researchers wishing to present their work in data analysis and management and in interdisciplinary fields around Data Science. This session will also be an opportunity to exchange with academic colleagues and industry stakeholders on the research topics presented.

Important dates:

Poster submission: no later than ~~23 March 2026~~ 2 April 2026
Notification: 9 April 2026
Registration deadline: 30 April 2026
Symposium: 2 and 3 June 2026 in Avignon

We invite you to save these dates in your calendar now and to register!
Register here

Learn more…

MaDICS is a CNRS Research Network (Groupement de Recherche, GDR) created in 2015. It provides an ecosystem for promoting and running interdisciplinary research activities in Data Science. It is a forum for exchange and support for scientific and non-scientific stakeholders (industry, media, culture, …) facing Big Data and Data Science problems.
Learn more…

MaDICS activities are structured through Actions and Workshops (Ateliers). Actions bring together the stakeholders of a specific topic for a limited period (between two and four years). The creation of an Action is preceded by one or more Workshops, which help consolidate the topics and objectives of the upcoming Action.

The MaDICS website offers several support and communication tools open to the community concerned with Data Science:

MaDICS Events: The GDR MaDICS awards its label to Events such as conferences, workshops or summer schools. Every labelling request is evaluated by the GDR Steering Committee. A label makes financial support possible for early-career researchers. A labelling request may also come with a request for financial support for the travel of speakers or participants at the event.
Learn more…
MaDICS Networks: to better target research coordination activities related to training and innovation, the GDR MaDICS has set up a Training Network aimed at various audiences (early-career researchers, continuing education, …), an Innovation Network to facilitate and intensify the dissemination of Big Data and Data Science research to industry stakeholders, and a Partners Club whose members support and take part in the GDR’s activities.
Learn more…
PhD Students’ Area: PhD students and early-career researchers are an essential driving force of research, and the GDR offers grants for mobility and for participation in MaDICS events.
Learn more…
Communication tools: The MaDICS website can be used to disseminate various kinds of information (events, job offers, thesis proposals, …) related to the GDR’s research topics. This information is sent to all subscribers of the MaDICS mailing list and published in a public Calendar (events) and on a job offers page.

Joining the GDR MaDICS: Membership of the GDR MaDICS is free for members of public research laboratories or institutions. Other people can join on behalf of their company or as individuals by paying an annual fee.
Learn more…

Upcoming events

Workshop Days, Schools, Conferences and Seminars

Actions, Workshops and Working Groups:

CODA DAE DatAstro DSChem EXMIA GINO GRASP RECAST SaD-2HN SIMDAC SimpleText TIDS

Apr

Sat

2026

Robust-to-noise information extraction, unifying challenges of optical character recognition (OCR) and automatic speech recognition (ASR)


Apr 4 – Apr 5 all-day

Offer related to Action/Network: none/none

Laboratory/Company: La Rochelle Université – Laboratoire l3i
Duration: 36 months
Contact: mickael.coustaty@univ-lr.fr
Publication deadline: 2026-04-04

Context:
The growing digitization of written and oral content has made Optical Character Recognition (OCR) and Automatic Speech Recognition (ASR) essential in cultural heritage preservation, media accessibility, legal documentation, knowledge management and information retrieval. However, the outputs generated by these systems are inherently noisy: OCR is affected by document degradation, layout complexity or poor scanning quality, while ASR suffers from background noise, overlapping speech or non-standard oral expressions. Despite significant progress, this noise remains pervasive, and the resulting imperfections directly affect downstream natural language processing tasks, where data quality is a key prerequisite. Although OCR and ASR face many similar error phenomena, their correction has mostly been studied in isolation, resulting in a lack of unified methodologies.

Topic:
Objectives:
• Compare and analyse existing post-correction methods in OCR and ASR and assess their potential for cross-domain adaptation.
• Develop unified approaches for post-correction that leverage the shared error patterns between OCR and ASR.
• Enable robust information extraction from noisy OCR and ASR outputs by designing strategies that mitigate the propagation of recognition errors into downstream NLP tasks.
Scientific challenges:
• Heterogeneity of noise sources: OCR errors stem from visual artifacts while ASR errors are acoustic, so a unified framework must generalize across modalities.
• Domain adaptation: OCR/ASR models often struggle on domain-specific datasets (e.g., historical texts, administrative documents, technical reports, scientific papers…), requiring correction methods that adapt to varying contexts.
• Complex error structures: beyond character and subword substitution, OCR/ASR introduce higher-level disruptions (mis-segmentation, overlapping text blocks or speech, layout misinterpretation) that complicate correction.
• Evaluation difficulties: classical metrics such as Character Error Rate (CER) or Word Error Rate (WER) fail to fully capture the impact of errors on downstream information extraction, which calls for new evaluation methods.
• Scalability: correction methods must be applicable to large-scale corpora and adaptable to new data without full retraining.
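As a rough illustration of the evaluation challenge listed above, the following Python sketch computes CER and WER from a plain Levenshtein distance. It is not taken from the offer: the function names and example sentences are hypothetical, and whitespace tokenization is an assumption.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: insertions, deletions, substitutions all cost 1."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate, normalized by the reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate over whitespace tokens (tokenization is an assumption)."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


if __name__ == "__main__":
    ref = "Paris is the capital of France"
    for hyp in ("Paris is the capitol of France", "Pans is the capital of France"):
        print(hyp, "CER=%.3f" % cer(ref, hyp), "WER=%.3f" % wer(ref, hyp))
```

Both hypotheses in the demo have one wrong word out of six, so they receive the same WER, yet only the second one damages the entity "Paris". This is the kind of downstream effect that CER and WER alone do not reflect.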
To tackle these challenges, the thesis will explore a combination of:
• Comparative state-of-the-art analysis: systematic benchmarking of existing OCR and ASR post-correction methods on heterogeneous corpora.
• Unified modeling approaches: leveraging neural architectures (e.g., sequence-to-sequence models, transformers, multilingual pre-trained LLMs) that can learn correction patterns across both modalities.
• Hybrid methods: integrating symbolic rules, edit distance algorithms, and domain-specific lexicons with machine learning models to improve robustness.
• Error modeling and simulation: designing artificial noise injection techniques to train models on synthetic but realistic OCR/ASR-like errors, thus improving generalization.
• Evaluation frameworks: extending standard CER/WER with task-oriented metrics reflecting the quality of downstream information extraction and retrieval.
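The noise injection idea in the list above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the method of the thesis: the confusion table `OCR_CONFUSIONS` is a hypothetical toy example (a realistic one would be estimated from aligned OCR output and ground truth), and `inject_ocr_noise` is an illustrative name.

```python
import random

# Hypothetical confusion table, for illustration only.
OCR_CONFUSIONS = {"rn": "m", "m": "rn", "l": "1", "O": "0", "e": "c"}


def inject_ocr_noise(text: str, rate: float, rng: random.Random) -> str:
    """Replace confusable patterns with probability `rate` each.

    Two-character patterns are tried before single characters so that
    merges such as "rn" -> "m" can occur.
    """
    out = []
    i = 0
    while i < len(text):
        for width in (2, 1):
            chunk = text[i:i + width]
            if (len(chunk) == width and chunk in OCR_CONFUSIONS
                    and rng.random() < rate):
                out.append(OCR_CONFUSIONS[chunk])
                i += width
                break
        else:
            # No substitution applied at this position: copy the character.
            out.append(text[i])
            i += 1
    return "".join(out)


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so a synthetic corpus is reproducible
    print(inject_ocr_noise("modern illumination", 0.3, rng))
```

Passing an explicit `random.Random` instance keeps the corruption reproducible, which matters when the same synthetic pairs are reused to compare correction models.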
This thesis aims to overcome the current limitations of automatic correction of texts produced by OCR and ASR systems by proposing a unified approach, which would represent a significant scientific advance. An in-depth analysis of the similarities and differences between OCR and ASR errors will provide a better understanding of how these two fields can intersect. This project will enable the development of more robust methods based on multidisciplinary knowledge from natural language processing, signal processing, and image processing. The expected results will thus offer new perspectives in the development and use of multimodal language models, contributing to the evolution of generative AI in both language processing and signal processing. With the rise of multimodal databases (text, image, audio, video), this thesis could inspire the creation of tools capable of simultaneously exploiting data from various sources to extract more relevant information. The thesis is expected to contribute to bridging the OCR and ASR research communities and to open new research avenues in multimodal NLP.

Candidate profile:
The highly motivated candidate should hold a master’s degree in computer science or a related field. She/he should have a strong background in NLP with an interest in text processing and multimodal data (text, speech, document images). Familiarity with generative AI methods (e.g., large language models, text-to-text generation, deep learning, fine-tuning strategies) will be a strong asset.

Required education and skills:
Master of Science in computer science, AI or applied mathematics, or any equivalent diploma

Work address:
mickael.coustaty@univ-lr.fr

Attached document: 202603191652_Alloc_Doc_AI_DH_Coustaty_Suire_Public.pdf

Categories: theses

Apr

Fri

2026

Nested Modeling and Sensitivity Analysis for Robust Control of Large-Scale Membrane Bioreactors under Uncertainty


Apr 10 – Apr 11 all-day

Offer related to Action/Network: EXMIA/Doctorants

Laboratory/Company: Laboratoire de Génie Chimique (LGC)
Duration: 3 years
Contact: rachid.ouaret@toulouse-inp.fr
Publication deadline: 2026-04-10

Context:
In France, more than 22,000 wastewater treatment plants collect and treat approximately 8.4 billion
m³ of wastewater annually, but less than 1% of this treated water is currently reused. In response
to recent drought episodes and the increasing scarcity of water resources, the “Water plan (Plan
Eau)” was introduced in 2023 to promote a more resilient and concerted management of water
resources, including the valorization of treated wastewater reuse (REUT) [1]. Membrane bioreactors (MBRs) are recognized wastewater treatment processes known for their excellent purification
performance and are now deployed in large-scale installations. However, these systems consume
at least twice the energy of conventional activated sludge systems (CAS), primarily due to their
high aeration requirements [3, 4]. This energy demand is further influenced by the concentration
of Total Suspended Solids (TSS), which directly affects reactor volume and design. Addressing
these issues requires advanced mathematical models capable of optimizing MBR performance while
accounting for uncertainties in influent characteristics, operational conditions, effluent quality, and
energy efficiency.
The scientific complexity of this issue lies in the intrinsically nested nature of MBR models. Unlike
traditional approaches that consider systems as “single-block” entities, large-scale MBRs involve
a deeply nested structure where the output of one sub-model (biological, physical, or energetic)
becomes the input of another [2]. This specific architecture generates complex multi-scale and
multi-frequency dynamics. Biological phenomena directly influence the production of suspended
solids and EPS (Extracellular Polymeric Substances), which condition membrane fouling, creating
a strongly coupled physico-biological system with multiple feedback loops. This nested structure
is essential for faithfully capturing the real behavior of industrial installations, but it considerably
complicates the analysis and management of uncertainties.
The uncertainties present in these systems are multiple and dependent: they concern the influent
(flow rate, COD, NH₄⁺, TSS), operational conditions (control parameters), and membrane fouling.
These uncertainties evolve over time, are described by heterogeneous data often asynchronous or
incomplete, and exhibit complex structural dependencies that cannot be captured by classical statistical methods. In particular, dependencies between variables (linear or nonlinear, symmetric or
asymmetric) require specific tools such as copulas for precise modeling [5]. This complexity of uncertainties is exacerbated by the nature of the collected data: wastewater treatment plants generate
time series at different temporal scales (from seconds to weeks), with distinct time steps between
online sensors, laboratory analyses, and operational histories.
Sensitivity analysis represents an essential lever for understanding and optimizing these systems, but
it faces major challenges in the context of nested models. Traditional sensitivity analysis methods,
developed for single-block models, do not allow for fine analysis of sub-model contributions nor
correct quantification of sensitivities in a nested structure [6]. A specific approach is therefore
necessary to evaluate how variations in input parameters affect final outputs through the entire
modeling chain. This sensitivity analysis must also integrate symbolic data representation (intervals,
histograms, empirical distributions) to capture uncertainty without resorting to temporal smoothing
that would lose critical information [7].

Topic:
The heart of this PhD lies in the development of an innovative methodology articulating nested
modeling, sensitivity analysis, and artificial intelligence for large-scale membrane bioreactors. This
approach builds upon preliminary work conducted at LGC on integrated MBR modeling [8, 9] and
leverages promising results recently obtained in the field of sensitivity analysis of hybrid models
[10, 11].
Nested modeling constitutes the foundation of this PhD. Unlike traditional single-block approaches, we will develop a graphical representation of interdependencies between biological (ASM-SMP),
physical (RIS), and energetic sub-models, following an approach similar to that proposed by Touboul
[12]. This nested structure will enable faithful capture of multiple feedback loops between different MBR system components, particularly between biological pollutant degradation processes and
physical membrane filtration phenomena. The use of probabilistic tools such as copulas [13] will
allow rigorous modeling of stochastic dependencies between influential variables, while a variant of
the Total Interaction Index (TII) [14] will be implemented to assess stochastic dependencies within
the nested framework.
Sensitivity analysis occupies a central position in this PhD, with the objective of developing sensitivity measures specifically adapted to nested and multi-scale structures. We will deploy classical
tools well-documented in specialized literature (derivative-based, distribution-based, or variogram-based approaches) [15], while conducting an in-depth reflection on the specificity of nested model
structures to tailor these tools to our specific context. The Shapley effect for sensitivity analysis
with dependent inputs [16, 13] will also be considered within the framework of this project, offering an innovative perspective for quantifying individual and joint contributions of parameters in a
dependency context.
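To make the notion of a sensitivity index on a nested structure concrete, here is a minimal Python sketch of variance-based first-order Sobol indices estimated with a pick-freeze (Saltelli-type) scheme. Variance-based indices are swapped in here as one classical family and are not necessarily the measures the project will develop. The two sub-models are hypothetical linear stand-ins, and the estimator assumes independent inputs uniform on [0, 1), which is precisely the assumption that dependent inputs break and that motivates the Shapley effects mentioned above.

```python
import random
from statistics import fmean
from typing import Callable, List, Sequence


def biological(x1: float, x2: float) -> float:
    """Hypothetical inner sub-model (stand-in for a biological model)."""
    return x1 + 2.0 * x2


def filtration(sludge: float, x3: float) -> float:
    """Hypothetical outer sub-model, fed by the inner model's output."""
    return 3.0 * sludge + x3


def nested_model(x: Sequence[float]) -> float:
    return filtration(biological(x[0], x[1]), x[2])


def first_order_sobol(model: Callable[[Sequence[float]], float],
                      dim: int, n: int, rng: random.Random) -> List[float]:
    """Pick-freeze estimate of first-order Sobol indices.

    Assumes independent inputs, each uniform on [0, 1).
    """
    a = [[rng.random() for _ in range(dim)] for _ in range(n)]
    b = [[rng.random() for _ in range(dim)] for _ in range(n)]
    ya = [model(row) for row in a]
    yb = [model(row) for row in b]
    mean = fmean(ya + yb)
    var = fmean([(y - mean) ** 2 for y in ya + yb])
    indices = []
    for i in range(dim):
        # Re-evaluate sample A with only input i taken from sample B.
        yab = [model(ra[:i] + [rb[i]] + ra[i + 1:]) for ra, rb in zip(a, b)]
        v_i = fmean([(y_b - mean) * (y_ab - y_a)
                     for y_a, y_b, y_ab in zip(ya, yb, yab)])
        indices.append(v_i / var)
    return indices


if __name__ == "__main__":
    print(first_order_sobol(nested_model, dim=3, n=20000, rng=random.Random(0)))
```

For this toy model the output is 3·x1 + 6·x2 + x3, so the exact indices are 9/46, 36/46 and 1/46, which gives a quick sanity check on the estimator. The sketch treats the nested model as a black box; attributing variance to individual sub-models is the harder question raised in the text.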
Artificial intelligence will play a complementary and essential role in this PhD, particularly through
the development of Physics-Informed Neural Networks (PINN) and machine learning model interpretability techniques. This hybrid approach, which combines the advantages of mechanistic models
and data-driven methods, has already proven its worth in the field of process engineering [18] and
specifically for wastewater treatment [19]. Recent work by Danesh et al. [17, 11] has demonstrated
the effectiveness of model-agnostic methods (Accumulated Local Effects, Partial Dependence Plots)
for making neural network predictions interpretable, a fundamental requirement for the adoption
of these tools by industrial operators.
Reinforcement learning (RL) will constitute the third pillar of this PhD, with the objective of developing a real-time control framework for optimizing MBR operations. Building upon the nested
models developed and sensitivity analyses performed, this RL framework will dynamically adjust
operational parameters based on process variability. State estimation techniques, such as the Extended Kalman Filter (EKF), will be implemented to enhance RL decision-making by mitigating
measurement noise and handling system uncertainties [20, 21]. The integration of RL with mechanistic models will ensure that the control strategy remains explainable and applicable to real-world
operations.
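The role of the Extended Kalman Filter in filtering noisy measurements before they reach a controller can be illustrated with a scalar sketch. This is a minimal example under stated assumptions: the state is a single hypothetical quantity, the transition and measurement functions `f` and `h` and their derivatives are supplied by the caller, and the class name `ScalarEKF` is illustrative rather than taken from any library or from the project.

```python
from dataclasses import dataclass
from typing import Callable

Fn = Callable[[float], float]


@dataclass
class ScalarEKF:
    x: float  # current state estimate
    p: float  # variance of the estimate
    q: float  # process noise variance
    r: float  # measurement noise variance

    def step(self, z: float, f: Fn, df: Fn, h: Fn, dh: Fn) -> float:
        """One predict/update cycle for measurement z.

        f, h are the (possibly nonlinear) transition and measurement
        functions; df, dh are their derivatives, used for linearization.
        """
        # Predict: propagate the estimate and its variance through f.
        x_pred = f(self.x)
        f_jac = df(self.x)
        p_pred = f_jac * self.p * f_jac + self.q
        # Update: linearize h around the prediction and apply the Kalman gain.
        h_jac = dh(x_pred)
        gain = p_pred * h_jac / (h_jac * p_pred * h_jac + self.r)
        self.x = x_pred + gain * (z - h(x_pred))
        self.p = (1.0 - gain * h_jac) * p_pred
        return self.x


if __name__ == "__main__":
    # Toy setting: constant state observed through a quadratic sensor.
    ekf = ScalarEKF(x=1.5, p=1.0, q=0.0, r=0.1)
    for _ in range(20):
        ekf.step(4.0, f=lambda x: x, df=lambda x: 1.0,
                 h=lambda x: x * x, dh=lambda x: 2.0 * x)
    print(ekf.x, ekf.p)
```

In a reinforcement learning loop, the filtered estimate `ekf.x` rather than the raw measurement would typically be passed to the agent as part of its observation.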

Candidate profile:
Education: Master’s degree or equivalent (5 years) in applied mathematics, statistics, process engineering, control engineering, data science, or related field.

Required education and skills:
• Technical skills:
– Probability and statistics (sensitivity analysis, stochastic processes)
– Dynamical systems modeling and differential equations
– Optimization and optimal control
– Machine learning and reinforcement learning
– Scientific programming (Python, R, MATLAB, Julia)
• Assets: Knowledge of process engineering or water treatment would be a plus
• Personal qualities: Autonomy, scientific rigor, taste for interdisciplinary research
• Languages: Scientific English (reading, writing, oral communication)

Work address:
4 allée Emile Monso
CS 84234
31 432 Toulouse cedex 4

Attached document: 202603101043_ANR_FlexMIEE_Offre_these_fr_en.pdf

Categories: theses

Apr

Wed

2026

Call for PhD Applications 14 Prestigious Marie Skłodowska-Curie Actions Double Degree Doctorate Fellowships GreenFieldData : IoRT Data Management and Analysis for Sustainable Agriculture Project 3-year contract starting September/October 2026


Apr 15 – Apr 16 all-day

Offer related to Action/Network: none/none

Laboratory/Company: INRAE and 11 other EU universities
Duration: 36 months
Contact: sandro.bimonte@inrae.fr
Publication deadline: 2026-04-15

Context:
Call for PhD Applications
14 Prestigious Marie Skłodowska-Curie Actions
Double Degree Doctorate Fellowships

GreenFieldData : IoRT Data Management and Analysis for Sustainable Agriculture Project
3-year contract starting September/October 2026

*****
https://www.eu4greenfielddata.eu/
*****

***Are you an aspiring researcher ready to drive the digital and green transition in agriculture?
The GreenFieldData project offers an outstanding opportunity to pursue a PhD within a high-calibre international and interdisciplinary network, funded under the prestigious Marie Skłodowska-Curie Actions (MSCA) Doctoral Networks (Grant agreement ID: 101226371).

***Why join us?
Pillar of Excellence: A High-Level MSCA Joint Doctorate
The “IoRT Data management and analysis for Sustainable Agriculture” (GreenFieldData) project is an initiative of Pillar 1 (Excellence) of Horizon Europe. This ambitious network unites 12 leading academic beneficiaries across 7 EU countries, supported by 24 associated non-academic and academic partners.
By joining this network, you will become part of a highly integrated, inter-sectoral, and international (triple ‘i’) training environment. Our common goal is to train a new generation of researchers who can provide robust and human-centric solutions to the challenges posed by climate change and socio-economic constraints.

***Exceptional Financial Support for Your PhD
The MSCA Joint Doctorate provides a highly competitive and financially attractive employment package for the entire 36-month duration of the PhD contract:
Generous Living Allowance: A monthly gross salary contribution
Mobility Allowance: An additional monthly contribution to cover private mobility-related costs (e.g., relocation, travel)
Family Allowance: A monthly allowance is also provided, if applicable (researchers with family obligations)

***Double Degree, High-Level Training, and Employability
All 14 Doctoral Candidates will be enrolled in Double Degree Doctorate programmes, guaranteeing joint supervision from at least two prominent international universities, with secondments at industrial partners.
The project offers a high-level doctoral training programme, providing a unique toolbox of cutting-edge knowledge and transferable skills essential for maximizing your future employability within research, digital technologies, and agricultural sectors.

Topic:
***14 Cutting-Edge Research Topics on IoRT and Sustainable Agriculture
Your research will focus on the convergence of advanced data science and IoRT (Internet of Robotic Things) to foster Sustainable Agriculture and define efficient low input practices. We are seeking bright minds to tackle advanced topics such as Advanced Database Management Systems, AI, Edge-Fog-Cloud Architectures, and Data Analysis applied to real-world agricultural challenges.

We are recruiting for the following 14 PhD Positions (3-year contract starting September/October 2026):

Position A
Optimized IoRT network for enhanced data quality of IoRT cereals production practices.
Aarhus University (DK) & Clermont Auvergne University (FR)

Position B
Data collection and analysis empowered with AI for robotized Olive Oil Precision Farming.
University College Dublin (IE) & Instituto Superior Técnico (PT)

Position C
Powering data-driven sustainability assessment tasks in agri-food systems with IoT-data Datalakes and Large Language Models.
Aarhus University (DK) & Université Libre Bruxelles (BE)

Position D
Human-centric Digital twins for monitoring robotized biostimulants application practices.
University Milan (IT) & Université Libre Bruxelles (BE)

Position E
Optimizing Images Quality and Deep Learning Methods for Vineyard Disease Detection.
University Padova (IT) & Poznan University of Technology (PL)

Position F
Optimized Olive crop irrigation based on high quality soil data using IoRT networks.
Instituto Superior Técnico (PT) & University Toulouse (FR)

Position G
Characterization of abiotic stress of trees using AI methods on acoustic signals.
University College Dublin (IE) & INRAE (FR)

Position H
Monitoring of grazing animals using sensors and data science.
University Liege (BE) & University College Dublin (IE)

Position I
Assessing soil and crop health across sugar-beet producing farms.
Poznan University of Technology (PL) & University Liege (BE)

Position L
Natural language based interaction for robotized biostimulant practices.
CNRS (FR) & University Milan (IT)

Position M
Assessing drought effects on grassland using IoT-enabled visual sensors.
Poznan University of Technology (PL) & INRAE (FR)

Position N
Optimization-simulation coupling for the GHG emission based supervision and planning of a fleet of autonomous agricultural robots.
Aarhus University (DK) & INRAE (FR)

Position O
Adaptive navigation for agricultural robots using database-driven insights.
Université Libre Bruxelles (BE) & INRAE (FR)

Position P
Agricultural AI data integration and management based on LLM.
University Toulouse (FR) & University Padova (IT)

Timeline
Application Open : January 5th 2026
Application Deadline : April 15th 2026
Selection Process : May 2026
PhD Start Date : September-October 2026

Candidate profile:

Required education and skills:

Work address:
7 EU countries
All information here
https://www.eu4greenfielddata.eu/

Categories: theses

Apr

Fri

2026

Detection and Repair of Data Inconsistencies Using Machine Learning Techniques in an Uncertain Environment


Apr 17 – Apr 18 all-day

Offer related to Action/Network: none/Doctorants

Laboratory/Company: LIAS/ENSMA
Duration: 3 years
Contact: allel.hadjali@ensma.fr
Publication deadline: 2026-04-17

Context:

Topic:
See the attached description of the topic.

Candidate profile:
1. Hold a Master’s-level degree (Bac+5) in computer science (or applied mathematics), with an interest in research.

2. Have expertise in Machine Learning (experience with or knowledge of uncertain data management and/or operations research is a plus).

3. Have advanced analytical skills and the ability to solve complex problems.

4. Be able to communicate orally and in writing in French and English.

Required education and skills:

Work address:
Laboratoire d’Informatique et d’Automatique pour les Systèmes
Ecole Nationale Supérieure de Mécanique et d’Aérotechnique (Poitiers)
Téléport 2 – 1 Avenue Clément Ader – BP 40109
86961 FUTUROSCOPE CHASSENEUIL Cedex – FRANCE

Attached document: 202601261037_Sujet_These_Loic-Allel.pdf

Categories: theses

Apr

Thu

2026

3 PhD positions available in AI and remote sensing (Vannes, France and Ispra, Italy)


Apr 30 – May 1 all-day

Offer related to Action/Network: none/none

Laboratory/Company: IRISA Vannes (OBELIX team) and European Commissio
Duration: 36 months
Contact: sebastien.lefevre@irisa.fr
Publication deadline: 2026-04-30

Context:

Topic:
We offer three PhD positions in computer vision applied to Earth observation, with applications in support of European policies. They will be carried out within the OBELIX team at IRISA in Vannes (Brittany), in partnership with the Joint Research Centre of the European Commission (Ispra, Italy), and with the support of the SequoIA AI cluster. The PhDs will take place in Italy in 2026 and 2027, and in France in 2028 and 2029.

1) Global multi-task learning for mapping and characterizing human settlements from EO data (link for more information and to apply: https://amethis.doctorat.org/amethis-client/prd/consulter/offre/2588)

2) Backcasting anthropogenic infrastructures over a century of historical EO data and maps (link for more information and to apply: https://amethis.doctorat.org/amethis-client/prd/consulter/offre/2591)

3) Explainable multimodal AI using geospatial data for rapid estimation of displacement and people in need in crises (link for more information and to apply: https://amethis.doctorat.org/amethis-client/prd/consulter/offre/2592)

Please note that nationality constraints apply to all three topics (more details in the topic descriptions).

Application deadline: 15 January 2026, for a start from April 2026.

Candidate profile:

Required education and skills:

Work address:
Ispra, Italy in 2026 and 2027
Vannes, France in 2028 and 2029

Categories: theses

PhD offer in artificial intelligence for fisheries resource management


Apr 30 – May 1 all-day

Offer related to Action/Network: none/none

Laboratory/Company: Laboratoire de Génie Informatique et d’Automatique
Duration: 36 months
Contact: sebastien.ramel@univ-artois.fr
Publication deadline: 2026-04-30

Context:
* TITLE

Predictive uncertainty quantification, based on evidence theory, applied to the estimation of fish life-history traits from 3D otolith images

* THEME

Artificial Intelligence, Machine Learning, Data Science

* KEYWORDS

Dempster-Shafer theory, Uncertainty quantification, Life-history traits, Marine ecosystems, Otolith.

* START DATE AND DURATION

September/October 2026, 36 months

* FUNDING

50% IFSEA / 50% Université d’Artois (requested)

* LOCATION

The work will be carried out jointly by the Laboratoire de Génie Informatique et d’Automatique de l’Artois (LGI2A) in Béthune and the Laboratoire d’Informatique Signal et Image de la Côte d’Opale (LISIC) in Calais.

* SUPERVISION

Supervisor: Prof. Frédéric Pichon (frederic.pichon@univ-artois.fr), Université d’Artois, LGI2A
Co-supervisor: Prof. Emilie Poisson Caillault (emilie.caillault@univ-littoral.fr), Université du Littoral Côte d’Opale, LISIC
Co-advisor: Dr. Sébastien Ramel (sebastien.ramel@univ-artois.fr), Université d’Artois, LGI2A

Topic:
Knowledge of fish life-history traits (habitat, age, growth, reproduction, longevity, position in the water column…) is essential for the effective and sustainable management of marine fish stocks. Calcified structures, and specifically otoliths, which are the only metabolically inert ones, are a valuable source of information for this purpose. In particular, their external shape, historically characterized from 2D images and more recently studied in 3D, makes it possible to predict these life-history traits very accurately. 3D images, while more informative, are nevertheless more costly and more recent, and therefore less numerous. This rich but limited source of information should therefore be used as well as possible in order to obtain the most reliable and accurate predictions. Evidence theory, also known as Dempster-Shafer theory or the theory of belief functions, is a generalization of the probabilistic framework for reasoning under uncertainty. Its use for quantifying uncertainty in predictions is particularly appropriate when little data is available. This PhD project thus aims to develop predictive methods based on this theory and suited to current approaches for predicting fish life-history traits from 3D otolith images. Given the nature of this type of application, the prediction of ordinal variables will be at the core of the project at the methodological level.

More details available here: https://www.lgi2a.univ-artois.fr/spip/fr/postes_ouverts/poste-ouvert-32
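For readers unfamiliar with evidence theory, the following Python sketch shows its basic objects: mass functions over sets of classes, Dempster's rule of combination, and the belief and plausibility bounds. It is a generic illustration, not the method of the thesis. The three ordinal classes (for example age groups 1, 2, 3) and the two sources of evidence are hypothetical.

```python
from itertools import product
from typing import Dict, FrozenSet

Mass = Dict[FrozenSet[int], float]


def combine(m1: Mass, m2: Mass) -> Mass:
    """Dempster's rule: conjunctive combination with conflict renormalization."""
    joint: Mass = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            joint[inter] = joint.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb  # mass that would fall on the empty set
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    return {s: w / (1.0 - conflict) for s, w in joint.items()}


def belief(m: Mass, hypothesis: FrozenSet[int]) -> float:
    """Total mass of focal sets included in the hypothesis (lower bound)."""
    return sum(w for s, w in m.items() if s <= hypothesis)


def plausibility(m: Mass, hypothesis: FrozenSet[int]) -> float:
    """Total mass of focal sets compatible with the hypothesis (upper bound)."""
    return sum(w for s, w in m.items() if s & hypothesis)


if __name__ == "__main__":
    omega = frozenset({1, 2, 3})                  # hypothetical ordinal classes
    m1 = {frozenset({2}): 0.6, omega: 0.4}        # source 1: "probably class 2"
    m2 = {frozenset({2, 3}): 0.7, omega: 0.3}     # source 2: "class 2 or 3"
    m = combine(m1, m2)
    target = frozenset({2})
    print(belief(m, target), plausibility(m, target))
```

The mass left on the whole frame `omega` expresses ignorance explicitly, which is what distinguishes this framework from a single probability distribution when data are scarce.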

Candidate profile:
The candidate must hold a master’s degree or an engineering degree in computer science, applied mathematics or a related field. Knowledge of artificial intelligence (machine learning) and/or image processing will be an asset, as will an awareness of uncertainty management methods. The qualities needed to complete a doctoral programme, such as curiosity, creativity, autonomy, critical thinking and enthusiasm, will be required.

Required education and skills:
Master’s degree or engineering degree in computer science, applied mathematics or a related field.

Work address:
LGI2A – Laboratoire de Génie Informatique et d’Automatique de l’Artois – UR 3926
Faculté des Sciences Appliquées
Technoparc Futura
62400 – BÉTHUNE Cedex
France

Categories: theses

May

Sat

2026

Fusion of real and simulated SAR images for ultra-robust AI-based target recognition


May 2 – May 3 all-day

Offer related to Action/Network: none/none

Laboratory/Company: ONERA-DEMR, UTT-LIST3N
Duration: 3 years
Contact: alexandre.baussard@utt.fr
Publication deadline: 2026-05-02

Context:

Topic:
https://w3.onera.fr/formationparlarecherche/sites/w3.onera.fr.formationparlarecherche/files/phy-demr-2026-05.pdf

Candidate profile:

Required education and skills:

Work address:
ONERA, Palaiseau site

Categories: theses

Physics-based neural networks for electrical impedance tomography imaging


May 2 – May 3 all-day

Offer related to Action/Network: none/none

Laboratory/Company: CEA Cadarache / UTT-LIST3N
Duration: 3 years
Contact: alexandre.baussard@utt.fr
Publication deadline: 2026-05-02

Context:
As part of the sustainable use of nuclear energy for a decarbonized energy mix in combination with renewable energies, fourth-generation fast-neutron reactors are crucial for closing the fuel cycle and managing uranium resources. The safety of such a sodium-cooled reactor relies in particular on the early detection of gas voids in the circuits. In these opaque, metallic media, optical imaging methods do not work, hence the need to develop innovative techniques.
This PhD is part of the development of electrical impedance tomography (EIT) applied to liquid metals, a non-intrusive approach for imaging the conductivity distribution in a flow.

Physics-informed neural networks (PINNs) have recently emerged as a promising machine learning technique for solving partial differential equations (PDEs) by embedding physical laws directly in the loss function. They have already demonstrated their potential for solving inverse problems in many applications. The loss function can be built from the physical equations alone, or it can combine the physics with data (simulated, experimental, or real), so that training is not purely data-driven as it is with conventional convolutional neural networks.
Although PINNs have already been used for inversion, very few publications address the inverse problem in electrical impedance tomography. Those that do are very recent, are generally limited to relatively simple reconstruction geometries, and may rely on assumptions that are quite restrictive for real scenarios.
Several contributions may therefore emerge from this work, both methodological, on PINNs, and applied, through the use of experimental data.
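
As an illustration of the loss construction described above, here is a minimal sketch of a composite PINN loss for EIT. It assumes the standard conductivity equation $\nabla\cdot(\sigma\nabla u)=0$ with a simple current-injection (Neumann) boundary condition as the forward model, which the offer does not state. The networks $u_\theta$ (potential) and $\sigma_\phi$ (conductivity), the collocation points $x_i$, the boundary points $b_k$ with injected current density $g_k$, the electrode positions $e_j$ with measured voltages $V_j$, and the weight $\lambda$ are all illustrative notation, not taken from the source.

```latex
\begin{aligned}
\mathcal{L}(\theta,\phi) &= \mathcal{L}_{\text{pde}}(\theta,\phi) + \mathcal{L}_{\text{bc}}(\theta,\phi) + \lambda\,\mathcal{L}_{\text{data}}(\theta), \\
\mathcal{L}_{\text{pde}}(\theta,\phi) &= \frac{1}{N}\sum_{i=1}^{N}\Bigl|\nabla\cdot\bigl(\sigma_\phi(x_i)\,\nabla u_\theta(x_i)\bigr)\Bigr|^{2}, \\
\mathcal{L}_{\text{bc}}(\theta,\phi) &= \frac{1}{B}\sum_{k=1}^{B}\Bigl|\sigma_\phi(b_k)\,\frac{\partial u_\theta}{\partial n}(b_k) - g_k\Bigr|^{2}, \\
\mathcal{L}_{\text{data}}(\theta) &= \frac{1}{M}\sum_{j=1}^{M}\bigl|u_\theta(e_j) - V_j\bigr|^{2}.
\end{aligned}
```

In this notation, $\lambda = 0$ corresponds to a loss built from the physical equations alone, while $\lambda > 0$ adds simulated or measured voltages. A realistic setup would likely replace the simplified boundary term with a fuller electrode model.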

Subject:
The objective of this thesis is to develop a complete electrical resistivity tomography system for the real-time detection and mapping of two-phase liquid metal/argon flows, with a view to applying it to flows in Generation IV circuits.

Artificial intelligence approaches, in particular physics-informed neural networks, will be explored to combine machine learning with physical constraints. They will be compared with the use of numerical simulations. The goal is to establish physical models suited to the context and to design inversion methods that are robust to measurement noise.

The topic is organized around four axes:
1. Physical laws and modeling (electromagnetic and hydrodynamic) of tomography signals in sodium.
2. Image reconstruction from conductivity measurements, in 2D and 3D, with sinusoidal voltages. Machine learning methods will be used for this.
3. Experimental development: measurements with galinstan (whose conductivity is close to that of liquid sodium).
4. Improvement of the tomographic reconstruction in the presence of noise from sensor defects and background disturbances.

Candidate profile:
Master's-level or engineering student with training in applied mathematics, machine learning (deep learning), or physics (electromagnetism). Proficiency in Python is required, and knowledge of PyTorch is desirable.
The work calls for rigor, autonomy, and an interest in topics at the interface of several disciplines.

Required education and skills:

Work address:
CEA Cadarache

Contacts:
– CEA supervisor: michel.frederic@cea.fr
– Thesis director: alexandre.baussard@utt.fr

Categories: theses

May

Mon

2026

FUSION-KG: Framework for Unified multimodal Semantic extractION for Knowledge Graphs construction

May 11 – May 12 all-day

Offer related to Action/Network: – — –/– — –

Laboratory/Company: ICube Strasbourg
Duration: 3 years
Contact: franco.giustozzi@insa-strasbourg.fr
Publication deadline: 2026-05-11

Context:
Environmental restoration projects generate large volumes of heterogeneous documentation, including technical reports, project plans, cartographic materials, engineering drawings, and photographic records. These materials contain valuable but fragmented knowledge describing intervention strategies, environmental contexts, technical constraints, and outcomes.
Within the TETRA project (ANR-22-FAI2-0006), previous research efforts primarily concentrated on text-based knowledge extraction using Large Language Models (LLMs), enabling the structuring of restoration knowledge from technical and narrative reports. While this approach demonstrated the potential of large language models for semantic modeling and ontology enrichment, it remained largely confined to textual sources. However, restoration documentation increasingly includes rich visual materials, such as maps, technical drawings, aerial imagery, and photographic records that contain complementary and sometimes critical information not explicitly described in text. This PhD builds upon the foundations established in TETRA by extending the extraction paradigm toward a unified multimodal framework. The central hypothesis is that integrating textual and visual understanding through advanced Vision-Language Models (VLMs) can substantially improve the completeness, semantic consistency, and interpretability of structured environmental knowledge graphs.

Subject:
The FUSION-KG PhD aims to design a unified multimodal semantic extraction framework capable of transforming heterogeneous environmental documentation into structured, interpretable, and queryable knowledge graphs. The ambition is not only to extract information from text and images, but to develop a coherent framework in which multimodal understanding and structured external knowledge jointly contribute to reliable and semantically consistent knowledge graph construction.
The work involves the systematic modeling and characterization of heterogeneous documentary sources, including technical reports, maps, engineering drawings, aerial and satellite imagery, and photographic records of restoration interventions. These materials provide complementary yet often fragmented accounts of intervention types, spatial configurations, temporal phases, environmental parameters, constraints, and outcomes. A major challenge lies in ensuring that information extracted from visual and textual modalities is semantically aligned and represented within a shared conceptual framework.

Candidate profile:
The doctoral contract is awarded by the doctoral school’s selection committee through a competitive process in which the candidates’ merit is a key factor.

Required education and skills:
Education: Student about to graduate with a Master's or Engineering degree (Bac + 5) with a specialization in Computer Science.

Specific knowledge: Knowledge of data science methods, knowledge representation and reasoning, knowledge graphs.
Languages: Python, Java, OWL/SPARQL.
Ability to work with experts who are not computer scientists. Interest in the application domain would be appreciated.

Work address:
ICube laboratory (CNRS UMR 7357),
300 boulevard Sebastien Brant
BP 10413
67412 ILLKIRCH cedex

Attached document: 202603151916_Sujet_These_ED_VLM.pdf

Categories: theses

May

Sun

2026

Label-scarce VHR Disaster Mapping in the Era of Geospatial Foundation Models

May 31 – Jun 1 all-day

Offer related to Action/Network: – — –/Doctorants

Laboratory/Company: IRISA-UBS
Duration: 3 years
Contact: minh-tan.pham@irisa.fr
Publication deadline: 2026-05-31

Context:

Subject:
For more information, please visit: https://www-obelix.irisa.fr/files/2026/02/2026_PhD_Dreams.pdf

Candidate profile:
MSc or Engineering degree with an excellent academic record and proven research experience in one of the following fields: computer science, applied mathematics, signal and image processing.

Required education and skills:

Work address:
IRISA-UBS, Vannes, 56000

Attached document: 202602050215_2026_PhD_Dreams.pdf

Categories: theses

Jun

Tue

2026

PhD thesis in Artificial Intelligence within the ANR IARISQ project (2026-2030)

Jun 2 – Jun 3 all-day

Offer related to Action/Network: – — –/Innovation

Laboratory/Company: CRISTAL UMR CNRS 9189
Duration: 36 months
Contact: hayfa.zgaya-biau@univ-lille.fr
Publication deadline: 2026-06-02

Context:
As part of the ANR IARISQ project (https://anr.fr/Project-ANR-25-CE56-3679), “Design and development of an artificial-intelligence-based decision support system for air quality prediction and determination of the health risks of particles”, we are looking for a PhD student to work on the temporal modeling and forecasting of the chemical composition of atmospheric particles, and on the prediction of the associated toxicity thresholds by integrating these physico-chemical variables.

Subject:
Temporal prediction of the physico-chemical composition of atmospheric particles and dynamic estimation of their toxicity thresholds using Artificial Intelligence

Candidate profile:
Holder of a Master's degree in Artificial Intelligence, with a good command of English and strong scientific writing skills. Publication experience (a submitted and/or published article) is an asset.

Required education and skills:
– Training in computer science with a specialization in Artificial Intelligence (Master's or equivalent)
– Excellent software development skills (Python and associated libraries)
– Good command of symbolic and sub-symbolic AI approaches
– Experience in time series modeling and forecasting

Work address:
UMR CRIStAL
Université de Lille – Campus scientifique
Bâtiment ESPRIT
Avenue Henri Poincaré
59655 Villeneuve d’Ascq

Attached document: 202604020557_Projet ANR IARISQ Sujet de thèse.pdf

Categories: theses

Sep

Tue

2026

PhD position: Meta-Learning and Artificial General Intelligence for a Computational Theory of Assistance to Human Learning

Sep 1 – Sep 2 all-day

Offer related to Action/Network: – — –/Doctorants

Laboratory/Company: LITIS-INSA Rouen
Duration: 3 years
Contact: aomar.osmani@insa-rouen.fr
Publication deadline: 2026-09-01

Context:
Thesis funded through the state/region research grants.

Subject:
Meta-Learning and Artificial General Intelligence for a Computational Theory of Assistance to Human Learning

Candidate profile:
We are looking for a candidate with an M2 or engineering degree in computer science, data science, AI, computational cognitive science, or mathematics, with a strong appetite for research.

Desired skills:
– solid foundations in ML/DL;
– interest in cognitive science, education science, or optimization;
– a taste for mathematical modeling and programming;
– knowledge of meta-learning, RL, or sequential models (RNN/Transformers) is a plus.

Environment:
– multidisciplinary project (AI, cognitive science, instructional engineering) with strong societal impact;
– computing resources and data for large-scale experiments;
– results expected to be published at international conferences (NeurIPS, ICLR, AIED, etc.).

Required education and skills:
ML/DL; programming (Python), with PyTorch/TensorFlow experience appreciated; interest in education/cognition; meta-learning/RL/sequential models.

Work address:
INSA de Rouen
685 Avenue de l’Université 76800 Saint-Etienne-du-Rouvray

Attached document: 202602171414_sujetAnglais(1).pdf

Categories: theses

April – September 2026

Masses de Données, Informations et Connaissances en Sciences

Big Data - Data Science

General Presentation

Upcoming Events

Actions, Workshops and Working Groups: