MaDICS

PhD position on privacy, data generation and formal methods (IRISA, France)

Jun 30 – Jul 1 all-day

Offre en lien avec l’Action/le Réseau : – — –/– — –

Laboratoire/Entreprise : IRISA
Durée : 36 months
Contact : tristan.allard@irisa.fr
Date limite de publication : 2025-06-30

Contexte :
This PhD offer is funded by the PEPR Cybersecurity IPoP project (https://files.inria.fr/ipop/) and proposed by the Security and Privacy team (SPICY, https://www-spicy.irisa.fr/) from the IRISA institute (https://www.irisa.fr/en) in Rennes, France. The work will be supervised jointly by Tristan Allard (PhD, HDR, https://people.irisa.fr/Tristan.Allard/) associate professor at the University of Rennes, expert in privacy in data intensive systems, and Barbara FILA (PhD, HDR, http://people.irisa.fr/Barbara.Fila/), associate professor at INSA Rennes, expert in formal methods for risk assessment.

The successful candidate will be working at IRISA — the largest French research laboratory in the field of computer science and information technologies (more than 850 people). IRISA provides an exciting environment where French and international researchers perform cutting edge scientific activities in all domains of computer science.

Rennes is located in the western part of France, in the beautiful region of Brittany. From Rennes, you can reach the seaside in about 45 minutes by car and the center of Paris in about 90 minutes by train. Rennes is a nice and vibrant student-friendly city. It is often ranked as one of the best student cities in France. Rennes is known and appreciated for its academic excellence, especially in the field of cybersecurity, its professional landmark, the quality of its student life, the affordability of its housing offer, its rich cultural life, and much more.

Sujet :
Context and goal

Health data, social networks, electricity consumption… Vast quantities of personal data are collected today by private companies or public organizations. Various legal, monetary, or visibility incentives push data holders to envision sharing versions of the collected datasets that provide both statistical utility and privacy guarantees. Indeed, sharing data at large, e.g., as open data, without jeopardizing privacy, is expected to bring strong benefits (strengthening, e.g., scientific studies, innovation, public policies).

Synthetic data generation is a promising approach. First, synthetic data generation algorithms aim at generating datasets that are as close as possible to the original datasets. Either the synthetically generated data or the generative models trained over the original data could be shared to support elaborate data analysis. Second, substantial progress has been made during the last decade on the privacy guarantees of synthetic data generation algorithms. For example, there exist today synthetic data generation algorithms that satisfy variants of differential privacy, one of the most prominent families of privacy models [2].

However, security is a constant race between attackers and defenders. A large number of attacks exist, and that number keeps growing [5]. As a result, because of the complex environment in which synthetic data generation takes place (e.g., utility needs, diversity of information sources, diversity of data generation algorithms), analyzing the risks remains hazardous even when strong privacy-preserving techniques are used.

The main goal of this PhD thesis is to design a formal-methods-based approach allowing data holders to analyze the risks related to their synthetic data publication practices.

The main tasks of the PhD student will be to:
– Study the state of the art on attacks against synthetic data generation algorithms (e.g., membership inference attacks [4, 6]) and on relevant formal methods (e.g., attack-tree-based risk analysis models [3]). We will focus on tabular data and time series.
– Model the full synthetic data generation environment. In particular, this includes capturing the attackers’ capabilities (e.g., goals [5], background knowledge, computational resources, sequences of steps), the relationships between attackers, the sources of auxiliary information, and the data sharing practices.
– Design efficient algorithms for finding the attacks that illustrate privacy risks, implement them, and evaluate their performance.
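
As a rough illustration of the attack-tree style of risk analysis mentioned in the first task, the sketch below computes the cheapest way for an attacker to reach a goal in a small AND/OR tree. It is only a minimal example under stated assumptions: the tree, its labels, and its costs are hypothetical and are not taken from the project, and the cost rule (minimum over OR children, sum over AND children) is the classical textbook evaluation rather than the method the thesis will develop.

```python
def min_attack_cost(node):
    """Minimal attacker cost to achieve the goal represented by `node`.

    A node is a tuple:
      ("leaf", label, cost)      - a basic attack step with a cost
      ("or",   label, children)  - achieved by ANY one child
      ("and",  label, children)  - achieved only by ALL children
    """
    kind = node[0]
    if kind == "leaf":
        return node[2]
    child_costs = [min_attack_cost(child) for child in node[2]]
    # OR: the attacker picks the cheapest option; AND: every step must be paid for.
    return min(child_costs) if kind == "or" else sum(child_costs)


# Hypothetical tree: labels and costs are illustrative assumptions only.
tree = ("or", "infer membership of a target record", [
    ("and", "attack the published synthetic dataset", [
        ("leaf", "download the synthetic dataset", 1),
        ("leaf", "acquire auxiliary information", 5),
    ]),
    ("leaf", "query the released generative model", 4),
])

cheapest = min_attack_cost(tree)
print(cheapest)
```

Here the OR root compares the two branches (1 + 5 for the AND branch, 4 for the single leaf) and keeps the cheaper one.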

In addition to the core tasks of the project, the successful candidate will also contribute to the organisation of competitions where the privacy guarantees of synthetic data generation algorithms are challenged [1] (see, e.g., the Snake challenge, https://snake-challenge.github.io).

References

[1] Tristan Allard, Louis Béziaud, and Sébastien Gambs. Snake challenge: Sanitization algorithms under attack. In Proceedings of the 32nd ACM International Conference on Information and Knowledge Management (CIKM ’23), 2023.

[2] Damien Desfontaines and Balázs Pejó. SoK: Differential privacies. Proceedings on Privacy Enhancing Technologies, 2020(2):288–313, 2020.

[3] Barbara Kordy (Fila), Ludovic Piètre-Cambacédès, and Patrick Schweitzer. DAG-based attack and defense modeling: Don’t miss the forest for the attack trees. Comput. Sci. Rev., 13-14:1–38, 2014.

[4] Hongsheng Hu, Zoran A. Salcic, Lichao Sun, Gillian Dobbie, P. Yu, and Xuyun Zhang. Membership inference attacks on machine learning: A survey. ACM Computing Surveys (CSUR), 54:1–37, 2021.

[5] Ahmed Salem, Giovanni Cherubin, David Evans, Boris Köpf, Andrew Paverd, Anshuman Suri, Shruti Tople, and Santiago Zanella-Béguelin. SoK: Let the privacy games begin! A unified treatment of data inference privacy in machine learning. In Proceedings of the 2023 IEEE Symposium on Security and Privacy (S&P ’23), pages 327–345, 2023.

[6] Antonin Voyez, Tristan Allard, Gildas Avoine, Pierre Cauchois, Élisa Fromont, and Matthieu Simonin. Membership inference attacks on aggregated time series with linear programming. In Proceedings of the 19th International Conference on Security and Cryptography (SECRYPT ’22), 2022.

Profil du candidat :
– The candidate must have obtained, or be about to obtain, a master's degree in computer science or in a related field.
– The candidate must be curious, autonomous, and rigorous.
– The candidate must be able to communicate in English (oral and written). Knowledge of French is not required.
– The candidate must have a strong interest in cybersecurity.
– Skills in machine learning and/or formal methods will be appreciated.

Formation et compétences requises :

Adresse d’emploi :
IRISA Rennes
Campus de Beaulieu, 263 avenue du Général Leclerc
35042 RENNES cedex

Document attaché : 202501201621_PhD_thesis_IRISA_France.pdf

Categories: theses

Group awareness in the age of AI: Models and tools for maintaining cohesion and preventing disengagement in remote-working teams

Jul 31 – Aug 1 all-day

Offre en lien avec l’Action/le Réseau : – — –/– — –

Laboratoire/Entreprise : Centre de Recherche en Informatique, Université Pa
Durée : 3 years
Contact : Manuele.Kirsch-Pinheiro@univ-paris1.fr
Date limite de publication : 2025-07-31

Contexte :
The widespread adoption of remote work has profoundly transformed how people collaborate in companies, leading to a proliferation of digital platforms (Teams, Zoom, Jira, etc.) and a scattering of information. This shift brings new challenges: loss of cohesion, isolation, difficulty tracking project progress, and a heightened risk of disengagement or psychological distress among employees.

Sujet :
The thesis sits at the intersection of two research fields: computer-supported cooperative work (CSCW), with its notion of group awareness, and Artificial Intelligence, in particular its Machine Learning techniques.

The thesis aims to:
• Propose AI and Machine Learning models to extract, from multi-platform activity traces, the information relevant to group awareness;
• Detect early signs of disengagement or distress among team members;
• Design personalized information-dissemination and alert mechanisms that respect privacy (GDPR, AI Act), in order to strengthen team cohesion and well-being;
• Explore Federated Learning and incremental learning approaches to adapt the models to each collaborative context.

Profil du candidat :
• Master's degree in computer science with a specialization in data science, AI, or a related field.
• Interest in the human and organizational issues of collaborative work.
• Initiative, autonomy, and the ability to work in an interdisciplinary team.

Formation et compétences requises :
• Solid command of Machine Learning, data processing, and distributed systems.
• Previous knowledge of or experience in the CSCW field would be appreciated.
• Good scientific communication skills (French and English).

Adresse d’emploi :
Centre de Recherche en Informatique, Université Paris 1 Panthéon-Sorbonne, Centre Pierre Mendes-France (Paris 13ème)

Document attaché : 202506301529_PropositionSujetThese-annonce.pdf

Categories: theses

PhD thesis in artificial intelligence applied to cancer treatment (Reims)

Aug 31 – Sep 1 all-day

Offre en lien avec l’Action/le Réseau : – — –/– — –

Laboratoire/Entreprise : CReSTIC / Institut Godinot / AQUILAB
Durée : 36 months
Contact : Arnaud.BEDDOK@reims.unicancer.fr
Date limite de publication : 2025-08-31

Contexte :

Sujet :
See the attached PDF document.

Profil du candidat :

Formation et compétences requises :

Adresse d’emploi :
CReSTIC
Université de Reims Champagne Ardenne

Document attaché : 202505121850_Appel_candidature_20250512.pdf

Categories: theses

Indexing and retrieval of visual contents in 3D point clouds at large scale – Application to spatialization

Oct 15 – Oct 16 all-day

Offre en lien avec l’Action/le Réseau : – — –/– — –

Laboratoire/Entreprise : LASTIG, IGN / Gustave Eiffel University
Durée : 3 years
Contact : valerie.gouet@ign.fr
Date limite de publication : 2025-10-15

Contexte :
PhD offer
Indexing and retrieval of visual contents in 3D point clouds at large scale – Application to spatialization
LASTIG Lab / IGN and Gustave Eiffel University / Greater Paris area, France

All the details: https://agape-anr.github.io/docs/annonce_these_loc2D3D-EN.pdf

Sujet :
At a glance

The thesis project focuses on the spatialization of visual contents (both image and video contents) by exploiting 3D references at large scale. Without any prior knowledge of the geolocation, the problem is tackled by retrieving the most similar elements in the geolocalized reference. As visual content, we consider old photographs and footage made available by cultural institutions, and as 3D reference we exploit LiDAR data mapping the French territory, made available at the country scale by the French mapping agency (IGN). This PhD thesis has the ambition to address two challenging scientific problems: on the one hand, the description, matching and indexing of 2D(+t) and 3D data in a multi-date context where the scene has evolved over time, and on the other hand, fast retrieval in very large volumes of data. The work will be carried out within the framework of the multidisciplinary project AGAPE, which addresses discoverability and investigation in spatial iconographic heritage, and gathers seven leading partners specialized in visual and multimodal AI, Multimedia and Human-Computer Interaction, as well as in Archives, History and Media.

Keywords

Computer Vision, Artificial Intelligence, Indexing and Retrieval, Vision-Language Models, Image Analysis, 3D Point Clouds, Big Data, Geolocalization, Cultural Heritage.

Profil du candidat :
How to apply

Before July 14, 2025, please send the following documents to both contacts in a single PDF file:
o A detailed CV
o A topic-focused cover letter
o Grades and ranks over the last 3 years of study
o The contact details of 2 referees who can recommend you

Applications that do not follow these instructions will not be considered.

Interviews will be conducted during the period July 15-23; the decision will be released no later than July 25.

Formation et compétences requises :

Adresse d’emploi :
IGN-ENSG, Université Gustave Eiffel
6-8 Av. Blaise Pascal, 77420 Champs-sur-Marne
FRANCE

Document attaché : 202506021533_annonce_these_loc2D3D-EN.pdf

Categories: theses

Combining LLMs and GNNs for fusing multimodal representations: Application to information extraction from semi-structured data

Oct 31 – Nov 1 all-day

Offre en lien avec l’Action/le Réseau : – — –/– — –

Laboratoire/Entreprise : LISTIC – Université Savoie Mont-Blanc
Durée : 3 years
Contact : jean-yves.ramel@univ-smb.fr
Date limite de publication : 2025-10-31

Contexte :
The thesis will be supervised by David Télisson and JY Ramel,
within the ReGards team of LISTIC,
as part of the MIAI FONDUE chair (https://miai-cluster.univ-grenoble-alpes.fr/).

Sujet :
Combining LLMs and GNNs for fusing multimodal representations: Application to information extraction from semi-structured data related to human or professional activities.

Details: https://www.univ-smb.fr/listic/wp-content/uploads/sites/66/2025/09/these2025fondue.pdf

Profil du candidat :
Master's degree in computer science, data science, or artificial intelligence.

Formation et compétences requises :
– Solid skills in machine learning, NLP, or image processing.
– Interest in multimodal approaches and hybrid architectures.
– Proficiency in Python, PyTorch/TensorFlow, and LLM libraries.

Adresse d’emploi :
Université Savoie-Mont-Blanc
LISTIC – Bat 2D
Campus Savoie-Technolac
73376 Le BOURGET du LAC cedex

Document attaché : 202509251147_these2025fondue.pdf

Categories: theses

Hybrid 3D fruit tree modeling combining functional-structural models and deep learning – Application to the design of agroecological orchards

Nov 30 – Dec 1 all-day

Offre en lien avec l’Action/le Réseau : – — –/– — –

Laboratoire/Entreprise : UMR AGAP Institut, CIRAD, Montpellier
Durée : 3 years
Contact : frederic.boudon@cirad.fr
Date limite de publication : 2025-11-30

Contexte :
Functional-structural plant models (FSPM) enable fine-grained analysis of plant functioning and growth in fluctuating environments. They simulate the interplay between the modular structure of the plant, its geometry (3D spatial distribution), and the physiological processes interacting with the environment (Prusinkiewicz, 2004; Fourcaud et al., 2008; Louarn and Song, 2020). These models consider that the three-dimensional structure of plants is both their interface with the environment and a major determinant of their growth and productivity (Costes et al., 2006). They are used in particular for studying and modeling fruit trees (Costes et al., 2008; Allen et al., 2005; Lescourret et al., 2011; Boudon et al., 2020), where internal competition for resources among organs calls for dynamic, spatially explicit representations. However, a major obstacle lies in the parameterization of these models, which limits their adoption for developing decision-support tools for orchard management (DeJong, 2019) and, more broadly, hinders their uptake within the scientific community.

Remote sensing, combined with deep-learning-based analysis methods, offers strong potential for characterizing plant functioning and growth, and thus for contributing to the parameterization of functional-structural models. The recent emergence of varied sensors (RGB cameras, LiDAR, thermal cameras, etc.) and acquisition platforms (drones, phenomobiles, etc.) opens new prospects for high-throughput phenotyping and orchard monitoring. Several recent initiatives aim to automate tree phenotyping, but they generally focus on a small number of traits, often too few to fully feed an FSPM (Streit et al., 2023).

In this context, the objective of this thesis is to develop a new generation of FSPMs for fruit trees, hybridizing classical modeling approaches with high-throughput data from orchard phenotyping. Building in particular on the Gardens and PHENET projects, FSPMs parameterized with high-throughput phenotyping data will make it possible to produce digital twins and to characterize and explore, in silico, the resilience of agricultural systems.

A major challenge for FSPM approaches is to reproduce and simulate the topological structures that describe plant architecture, together with their associated geometric or physiological information, notably derived from remote sensing. These structures can be decomposed into sequences that represent, for example, branching along the axes of the plant. Dedicated statistical methods (Guédon et al., 2001) have been developed by the scientific community to analyze and simulate these sequences. Recently, large language models (LLM) have evolved remarkably, revolutionizing natural language processing and finding applications in various scientific fields. They rely mainly on advanced neural network architectures, among which Transformers (Vaswani et al., 2017) play a central role. Unlike classical sequential models such as RNNs (Recurrent Neural Networks) or LSTMs (Long Short-Term Memory), Transformers use an attention mechanism that processes data in parallel rather than sequentially. This mechanism, called Self-Attention, weights the importance of each element of a sequence relative to the others, thereby improving the capture of long-range dependencies within the sequence. In addition, other approaches such as Variational Autoencoders (VAE) (Kingma & Welling, 2013) are also used in some generative models, notably to learn structured latent representations of language. These approaches open promising prospects for FSPM modeling, notably by facilitating the learning and automatic generation of tree structures representing plant architecture.
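
To make the self-attention weighting described above concrete, here is a minimal sketch of scaled dot-product self-attention in plain Python. It rests on simplifying assumptions that are not part of the offer: queries, keys, and values are taken to be the input vectors themselves (a real Transformer first applies learned linear projections), and the toy three-element sequence is purely hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Scaled dot-product self-attention with identity projections.

    x: list of vectors (lists of floats of equal length), one per sequence element.
    Returns (outputs, weights), where weights[i][j] is how much element i attends
    to element j. Assumption: Q = K = V = x, i.e. no learned projections.
    """
    d = len(x[0])
    outputs, weights = [], []
    for query in x:
        # Similarity of this element with every element of the sequence.
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in x]
        w = softmax(scores)
        weights.append(w)
        # Each output is a weighted average of all value vectors.
        outputs.append([sum(w[j] * x[j][c] for j in range(len(x))) for c in range(d)])
    return outputs, weights


# Hypothetical toy sequence of three 2-dimensional embeddings.
sequence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended, attn = self_attention(sequence)
```

Every row of `attn` sums to 1, and all rows can be computed independently of one another, which is what allows the parallel processing mentioned above.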

Sujet :
In its first stage, this project will build on existing FSPMs available in the open-source OpenAlea platform, such as MappleT (apple tree), in which tree structure is modeled by stochastic processes (e.g., hidden semi-Markov chains) calibrated from growth measurements whose acquisition and analysis are costly in time and expertise. This first stage of the thesis will consist in extending an FSPM tree model by generating the tree structure with “Large Language Models” (LLM), notably Transformer networks or Variational Autoencoders (VAE), in order to generate the succession of organs and their associated types. The observations and the outputs of the already calibrated statistical models will be used to train and parameterize these networks.

A second stage will be to simulate a fruit tree FSPM constrained by LiDAR data from the PHENET (apple tree) and Gardens (citrus) projects. From these scans, topological structures augmented with geometric information will be generated, and the previously trained networks will be extended to generate these structures and their associated information. A major challenge will be to develop relative encodings (parameterizing entities as a function of the parameters of the parent node) suited to this information, so as to guarantee a coherent sequential generation of the architectural elements.
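
One possible reading of such a relative encoding is sketched below: each node of a tree is serialized as a token holding its parameters relative to its parent, and the tree can be rebuilt from the token sequence. This is only an illustrative sketch. The `Node` class, the two attributes (`length`, `angle`), and the chosen encoding (length ratio and angle difference) are hypothetical and do not come from the offer or from OpenAlea.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    length: float  # hypothetical attribute, e.g. an internode length
    angle: float   # hypothetical attribute, e.g. an insertion angle in degrees
    children: list = field(default_factory=list)


def encode_relative(node, parent=None, depth=0):
    """Pre-order list of (depth, length_term, angle_term) tokens.

    The root is stored with absolute values; every other node stores its
    length as a ratio to the parent's and its angle as an offset from it.
    """
    if parent is None:
        token = (depth, node.length, node.angle)
    else:
        token = (depth, node.length / parent.length, node.angle - parent.angle)
    tokens = [token]
    for child in node.children:
        tokens.extend(encode_relative(child, node, depth + 1))
    return tokens


def decode_relative(tokens):
    """Rebuild the tree from the pre-order token sequence."""
    stack = []  # stack[d] is the most recently decoded node at depth d
    root = None
    for depth, length_term, angle_term in tokens:
        if depth == 0:
            node = Node(length_term, angle_term)
            root = node
        else:
            parent = stack[depth - 1]
            node = Node(parent.length * length_term, parent.angle + angle_term)
            parent.children.append(node)
        del stack[depth:]  # drop finished branches at this depth and below
        stack.append(node)
    return root


# Hypothetical toy tree: a trunk with two branches, one of which bears a twig.
tree = Node(10.0, 90.0, [Node(5.0, 45.0, [Node(2.5, 30.0)]), Node(8.0, 60.0)])
tokens = encode_relative(tree)
rebuilt = decode_relative(tokens)
```

Because every token depends only on its parent, a sequence model could emit the tokens one at a time while the decoder keeps the geometry consistent along each branch.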

Finally, in a third stage, we will explore the use of partial descriptions at certain key phases of growth. For example, LiDAR reconstructions might be available only at the beginning and at the end of a growth cycle, while coarser observations (drone flights estimating the overall volume of the plant, the spatial distribution of vegetation, etc.) could be obtained at regular intervals. In this context, reinforcement learning will be used to calibrate the growth models. This framework will make it possible to alternate between exploiting existing data to optimize structure generation and exploring new possible configurations.

In a final stage, these methods will be applied to reconstruct an orchard in 3D from phenotyping information (drones, LiDAR) and then to simulate biophysical processes that are difficult to observe, such as light interception or water stress, in order to estimate the resilience of the system and the distribution of resources (light, water) within it, and to propose new traits (represented in our case as model parameters), new varieties, and new management practices (tree density, pruning, species association) that optimize these systems.

Profil du candidat :
Holder of a master's degree in computer science or an engineering degree, with skills in deep learning and ideally in 3D.
Programming in Python and C/C++.
Interest in biology and agronomy.

Formation et compétences requises :

Adresse d’emploi :
CIRAD, Phenomen team, UMR AGAP
Avenue Agropolis TA A-108 / 01
34398 Montpellier Cedex 5, France

Document attaché : 202510131618_these-assimilation-vf-2025.pdf

Categories: theses

Monitoring traditional agricultural crop fields with multi-modal multi-temporal Synthetic Aperture Radar data

Dec 15 – Dec 16 all-day

Offre en lien avec l’Action/le Réseau : – — –/– — –

Laboratoire/Entreprise : SIEO Lab Romania/LISTIC France
Durée : 36 months
Contact : yajing.yan@univ-smb.fr
Date limite de publication : 2025-12-15

Contexte :

Sujet :
Distinct agricultural crops and practices play a central role in shaping the culture and cultural heritage of rural communities in specific regions. The Brașov region in Romania, for instance, is particularly renowned for its potato and sugar beet cultivation, which has earned it the designation ‘Potato Country’. However, these traditional crops are increasingly being replaced by others, such as rapeseed, which are more resilient and better adapted to changing climate conditions. This shift contributes to the loss of cultural heritage. Remote sensing, and in particular Synthetic Aperture Radar (SAR), provides valuable insights into vegetation structure, soil roughness, and soil moisture. The Copernicus program of the European Commission, together with other space agencies, offers free and regularly updated data for the long-term monitoring of agricultural systems. In this project, conducted in close collaboration between French and Romanian research units, we aim to contribute to the preservation of cultural heritage in the selected region of Romania, while the ultimate goal is to take steps towards the development of global strategies to address climate change. To this end, we leverage multi-modal, multi-temporal SAR data to (i) quantify the impact of climate change on traditional agricultural crops, (ii) estimate the water demand of these crops, (iii) evaluate nature-based solutions to preserve soil quality, and (iv) predict future dynamics.

Research aims: leverage multi-modal (multi-frequency, multi-resolution, multi-polarization, complex signal/amplitude/interferometric coherence/phase), multi-temporal SAR data to monitor crop growth and the evolution of crop-field roughness and moisture.

Methodology:
• Analyze the historical data of agricultural crops on the identified cultural heritage sites in Romania
• Multi-modal multi-temporal SAR data collection and pre-processing
• Perform SAR data analysis for the assessment of climate change impact
– crop structure evolution analysis
– soil roughness evolution analysis
– soil moisture evolution analysis
– correlation analysis with in situ data
• Predict the future dynamics with meteorological data
• Create open access data sets and tutorials for the community

Profil du candidat :
We are seeking Ph.D. candidates with a Master's degree in remote sensing, environment and geosciences, or information science. Good English skills are necessary for communication. The Ph.D. student will spend 24 months in Romania and 12 months in France.

Formation et compétences requises :

Adresse d’emploi :
Space Intelligence and Earth Observation Research Laboratory, Transilvania University of Brasov, Romania,
LISTIC, University Savoie Mont Blanc, Annecy, France

Document attaché : 202511051450_PhD_subject_crop_monitoring.pdf

Categories: theses

Secure Federated Querying of Knowledge Graphs

Dec 22 – Dec 23 all-day

Offre en lien avec l’Action/le Réseau : – — –/Doctorants

Laboratoire/Entreprise : LS2N
Durée : 3 years
Contact : hala.skaf@univ-nantes.fr
Date limite de publication : 2025-12-22

Contexte :
I am seeking excellent candidates for a fully funded 3-year PhD position supported by the ANR SaFE-KG project.
Goal: Formalise, design, and implement a secure, efficient federation engine enabling LLM-like querying across sensitive biomedical knowledge graphs, with fine-grained access control and provenance.

Sujet :
In the context of SaFE-KG, the main objective of the thesis is to design and implement an efficient and secure federation engine able to:

– Query decentralized knowledge graphs under fine-grained access control policies.
– Ensure high performance and scalability in secure federations.
– Interact with LLMs to support query building.
– Return results enriched with provenance and usage control information.
– Support adaptive query processing techniques, including secure sampling.

Profil du candidat :
Solid background in Semantic Web, knowledge graphs, SPARQL; familiarity with sampling and/or ML/LLMs is a plus.

Formation et compétences requises :
Master’s in CS/IS (strong ranking).

Adresse d’emploi :
LS2N, Nantes Université

Document attaché : 202509221432_SujetThèse-Safe-KG-5.pdf

Categories: theses

Deep Generative Models of Physical Dynamics: Representation, Generalization, and Multiphysics Learning

Dec 31 2025 – Jan 1 2026 all-day

Offre en lien avec l’Action/le Réseau : – — –/– — –

Laboratoire/Entreprise : ISIR – Institut des Systèmes Intelligents et de Ro
Durée : 36 months
Contact : patrick.gallinari@sorbonne-universite.fr
Date limite de publication : 2025-12-31

Contexte :
AI4Science is an emerging research field that investigates the potential of AI methods to advance scientific discovery, particularly through the modeling of complex natural phenomena. This fast-growing area holds the promise of transforming how research is conducted across a broad range of scientific domains. One especially promising application is in modeling complex dynamical systems that arise in fields such as climate science, earth science, biology, and fluid dynamics. A diversity of approaches is currently being developed, but this remains an emerging field with numerous open research challenges in both machine learning and domain-specific modeling.

Generative modeling is transforming machine learning by enabling the synthesis of plausible, high-dimensional data across modalities like text, images, and audio. A similarly profound shift is underway in the sciences, where generative deep learning is being leveraged to model complex physical dynamics governed by partial differential equations (PDEs)—especially in cases where traditional simulations are computationally expensive.

The central goal of the PhD project is to investigate whether deep generative architectures—such as diffusion, flow-matching, or autoregressive transformer-based sequence models—can be designed to simulate, generalize, and interpolate physical dynamics across a wide range of parametric and multiphysics regimes. Building on recent advances in neural surrogate modeling, this research will aim to advance generalizable, cross-physics generative modeling.

Sujet :
RESEARCH OBJECTIVES

The overarching research question is: Can we develop generative models that learn structured, physically grounded representations of dynamical systems—enabling synthesis, adaptation, and generalization across physical regimes and multiphysics settings? It unfolds into several complementary directions:

LATENT GENERATIVE MODELS FOR PHYSICAL DYNAMICS

The objective is to design generative models—such as diffusion, flow-matching, or autoregressive models—that learn compact and interpretable latent representations of spatiotemporal dynamics governed by PDEs. These models should:

• Capture uncertainty and multimodality in solution trajectories.
• Generalize across parametric variations.

LEARNING ACROSS MULTIPHYSICS SYSTEMS

To enable transfer learning across heterogeneous physics, we will explore shared latent representations across families of PDEs:
• Using encode–process–decode frameworks.
• Applying contrastive or multitask training to uncover reusable physical abstractions.
• Designing models invariant to space/time resolution and units.
This direction builds toward foundation-like models that capture generalizable physics priors across simulation families.

FEW-SHOT AND IN-CONTEXT GENERALIZATION TO NEW PHYSICS

To support scientific modeling in data-scarce settings, we will develop methods for few-shot generalization such as:
• Fine-tuning latent priors to new PDE systems using limited examples.
• Exploring meta-learning and prompt-based adaptation techniques (inspired by in-context learning in language models).
• Incorporating known physical constraints into the generative process.
The goal is to enable rapid and physically consistent adaptation to previously unseen dynamics with minimal data and supervision.

Profil du candidat :

Master's degree or engineering school degree in computer science or applied mathematics.

Formation et compétences requises :
Good programming skills. Background and experience in machine learning.

Adresse d’emploi :
Sorbonne Université (S.U.), Pierre et Marie Curie Campus, in the center of Paris. The candidate will join the MLIA team (Machine Learning and Deep Learning for Information Access) at ISIR (Institut des Systèmes Intelligents et de Robotique).

Document attaché : 202505191314_2025-05-01-PhD-Description-Generative-models-Physics.pdf

Categories: theses

Gender dynamics in collaboration networks

Dec 31 2025 – Jan 1 2026 all-day

Offre en lien avec l’Action/le Réseau : – — –/Doctorants

Laboratoire/Entreprise : Laboratoire Informatique d’Avignon avec codirect
Durée : 3 years
Contact : rosa.figueiredo@univ-avignon.fr
Date limite de publication : 2025-12-31

Contexte :
ANR project EVA – EValuating gender policies in academia through the Analysis of scientific collaboration networks.

Sujet :
https://eva.univ-avignon.fr/wp-content/uploads/sites/34/2025/04/offre.pdf

Profil du candidat :
• Master’s degree (or equivalent) in Computer Science, Applied Mathematics, Operations Research, or a related field.
• Strong ability to write and present research clearly.
• Proficiency in Python, R, Julia or C++, with experience in AI and optimization algorithms.
• Good understanding of graph theory, machine learning, and network analysis.
• Ability to work well in an interdisciplinary team.
• Proficiency in English is required; knowledge of French is an advantage.

Formation et compétences requises :

Adresse d’emploi :
LIA, Avignon

Document attaché : 202504251721_offreThesis_EVA.pdf

Categories: theses

Neural networks based volcanic model inversion with SAR displacement measurements

Dec 31 2025 – Jan 1 2026 all-day

Offre en lien avec l’Action/le Réseau : – — –/– — –

Laboratoire/Entreprise : LISTIC
Durée : 36 months
Contact : yajing.yan@univ-smb.fr
Date limite de publication : 2025-12-31

Contexte :

Sujet :
Satellite-based remote sensing offers a unique source of information for monitoring the environment, with fine spatial resolution, wide coverage, and frequent revisits. This makes it possible to address the challenge of natural hazard monitoring and forecasting, which has a significant societal impact. Inverse modeling of surface displacement is one of the major techniques for exploring the subsurface features of volcanoes. Traditional Monte Carlo direct search approaches are computationally expensive and time-consuming, and thus cannot meet operational needs. We will explore the potential of deep learning in volcanic inverse modeling with Interferometric Synthetic Aperture Radar (InSAR) for operational monitoring and forecasting of volcanic hazards. The intrinsic ill-posedness of inversions in volcanology and the limited amount of labeled InSAR data make this work challenging. We tackle the problem of volcanic model inversion, i.e. estimating model parameters from InSAR surface displacement estimates by solving an inverse problem.

This Ph.D thesis will build on our previous proof-of-concept work, in which a frugal ResNet model was deployed for the first time to estimate the volume change and depth of a spherical volcanic source (i.e. Mogi) from synthetic InSAR displacement fields. This ResNet model exhibits distinct advantages in computational efficiency over state-of-the-art Monte Carlo direct search methods. For this thesis, the Ph.D student will use more sophisticated volcanic models (e.g. fracture, numerical boundary element models, etc.) that allow simulation of displacement fields caused by more complex volcanic sources, to further increase the generality of the previously proposed ResNet model. One main effort will be devoted to improving the prediction accuracy of the ResNet model by increasing training data diversity (e.g. diverse SAR acquisition geometries, near field/far field and multi-resolution measurements) and by designing better-adapted loss functions that reflect the model properties to optimize (e.g. a combination of a loss function on the estimated model parameters and a loss function on the reconstructed displacement field). These two actions also help reduce the ill-posedness. Real InSAR displacement measurements from volcanoes worldwide, of both intrusion and reservoir types, will be used to fine-tune the ResNet model trained on synthetic data for further validation in real applications.
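
To make the combined loss mentioned above concrete, here is a minimal sketch, not the authors' implementation. It uses the standard closed-form surface displacement of a Mogi point source in an elastic half-space as the forward model and adds a parameter error term to a displacement reconstruction term. The function names, the Poisson ratio of 0.25, the normalization scales, and the weight are all assumptions made for illustration.

```python
import numpy as np

def mogi_displacement(x, y, depth, dV, nu=0.25):
    """Surface displacement (ux, uy, uz) of a Mogi point source located below the origin.

    x, y  : surface coordinates relative to the source (m)
    depth : source depth (m), dV : volume change (m^3), nu : Poisson ratio (assumed 0.25)
    """
    r2 = x**2 + y**2
    R3 = (r2 + depth**2) ** 1.5
    scale = (1.0 - nu) * dV / np.pi
    return scale * x / R3, scale * y / R3, scale * depth / R3

def combined_loss(pred_params, true_params, obs_uz, x, y,
                  weight=1.0, param_scale=(1000.0, 1.0e6)):
    """Normalized parameter MSE plus weighted MSE on the reconstructed vertical displacement.

    Parameters are ordered as (depth, dV); param_scale is a hypothetical normalization.
    """
    pred = np.asarray(pred_params, dtype=float)
    true = np.asarray(true_params, dtype=float)
    scale = np.asarray(param_scale, dtype=float)
    param_loss = np.mean(((pred - true) / scale) ** 2)
    _, _, uz = mogi_displacement(x, y, depth=pred[0], dV=pred[1])
    recon_loss = np.mean((uz - obs_uz) ** 2)
    return param_loss + weight * recon_loss

# Synthetic example: a source at 2 km depth with a volume change of 1e6 m^3.
xs = np.linspace(-5000.0, 5000.0, 11)
x, y = np.meshgrid(xs, xs)
true_params = np.array([2000.0, 1.0e6])
_, _, obs_uz = mogi_displacement(x, y, *true_params)

loss_true = combined_loss(true_params, true_params, obs_uz, x, y)
loss_off = combined_loss([3000.0, 1.0e6], true_params, obs_uz, x, y)
```

The reconstruction term penalizes parameter combinations that fit the labels poorly but also those that reproduce the observed field poorly, which is one way such a loss can constrain an ill-posed inversion. In a real training loop the same forward model would need to be written in a differentiable framework.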

Profil du candidat :
The Ph.D candidate should have good skills in machine learning. Knowledge of inverse problems or geophysics is appreciated.

Formation et compétences requises :

Adresse d’emploi :
LISTIC, 5 chemin de bellevue, 74944, Annecy-le-Vieux

Categories: theses

Thu

PhD and postdoc offers at Telecom Paris, Institut Polytechnique de Paris

Jan 1 – Jan 2 all-day

Offre en lien avec l’Action/le Réseau : – — –/– — –

Laboratoire/Entreprise : DIG team, Télécom Paris
Durée : 39 months
Contact : nils.holzenberger@telecom-paris.fr
Date limite de publication : 2026-01-01

Contexte :

Sujet :
Hello,

We are hiring 2 PhD students and 1 postdoc to work on combining language models with structured data, at Telecom Paris, Institut Polytechnique de Paris. Start date can be between January and March 2026.

Large Language Models are amazing, and with our research project, we aim to make them even more amazing! Our project will connect large language models to structured knowledge such as knowledge bases or databases. With this,

1. language models will stop hallucinating

2. language models’ knowledge can be audited and updated reliably, to spot biases and make them more interpretable

3. language models will become smaller and thus more eco-friendly and deployable

We work in the DIG team at Telecom Paris, one of the finest engineering schools in France, and part of Institut Polytechnique de Paris — ranked 38th in the world by the QS ranking. The institute is 45 min away from Paris by public transport, and located in the green of the Plateau de Saclay.

Check out our Web site to apply: https://suchanek.name/work/research/kb-lm/index.html

Fabian Suchanek & Nils Holzenberger

Profil du candidat :

Formation et compétences requises :

Adresse d’emploi :
19 place Marguerite Perey
91120 Palaiseau
FRANCE

Categories: theses

On importance sampling for probability estimation of high-dimensional rare events with finite intrinsic dimensions

Jan 5 – Jan 6 all-day

Offre en lien avec l’Action/le Réseau : – — –/– — –

Laboratoire/Entreprise : ISAE SUPAERO (Toulouse)
Durée : 3.6 years
Contact : florian.simatos@isae.fr
Date limite de publication : 2026-01-05

Contexte :

Sujet :
See the attached PDF.

Profil du candidat :

Formation et compétences requises :

Adresse d’emploi :
Toulouse

Document attaché : 202511080545_high-dimensional-IS-Mai-Simatos.pdf

Categories: theses

Fri

Frugal learning of multimodal generative models in an industrial context

Jan 16 – Jan 17 all-day

Offre en lien avec l’Action/le Réseau : – — –/– — –

Laboratoire/Entreprise : IRT SystemX
Durée : 36 months
Contact : faicel.chamroukhi@irt-systemx.fr
Date limite de publication : 2026-01-16

Contexte :
IRT SystemX is offering a PhD thesis on frugal learning of multimodal generative models in an industrial context. The thesis is part of a collaborative project on Generative AI for Industry, conducted in partnership with, among others, Air Liquide and Michelin. Its application component aims to address industrial use cases related to the management of technical knowledge in complex systems engineering.

The position is based in Palaiseau and the PhD candidate will be enrolled in the STIC doctoral school of Université Paris-Saclay.

The thesis is funded for 36 months, with a gross salary of €2,784 per month, and a desired start in early 2026.

See the attached PDF for more details on the context.

Sujet :
Frugal learning of multimodal generative models in an industrial context. The application component aims to address industrial use cases related to the management of technical knowledge in complex systems engineering.

See the attached PDF for more details on the topic.

Profil du candidat :
The candidate must hold a research Master's degree (or an equivalent qualification with a demonstrated interest in research) in data science and Artificial Intelligence.

Formation et compétences requises :
Research Master's degree or equivalent in data science and Artificial Intelligence.
Strong interest in research and a taste for applications.
Solid skills in statistical inference and optimization.
Proficiency in deep learning.
Programming in Python, with PyTorch/TensorFlow experience.
Skills in generative AI models would be a plus.

To apply, please send the following documents in PDF format to: faicel.chamroukhi@irt-systemx.fr
Detailed CV
Cover letter
Transcripts for the last two years of Master's or engineering studies
At least one letter of recommendation

Adresse d’emploi :
IRT SystemX,
2 Boulevard Thomas Gobert
91120, Palaiseau

Document attaché : 202510310836_Offre-de-These-IRTSystemX-DIT-2-2026-IAG1.pdf

Categories: theses

Sat

Study of transients in radio astronomy

Jan 31 all-day

Offre en lien avec l’Action/le Réseau : – — –/– — –

Laboratoire/Entreprise : LPC2E – Laboratoire de Physique et Chimie de l’Env
Durée : 3 years
Contact : cherry.ng-guiheneuf@cnrs-orleans.fr
Date limite de publication : 2026-02-28

Contexte :
Recent technological advances have enabled astronomers to digitize the radio sky to within a fraction of a second. This unprecedented temporal resolution provides sensitivity to transient phenomena that would otherwise have escaped us. Long-period transients (LPTs) are an excellent example: they are an emerging new class of coherent radio sources that challenge our understanding of the emission physics of neutron stars. Unlike canonical pulsars, which have rotation periods of a few milliseconds to a few seconds, LPTs emit periodically on timescales of a few tens of seconds to a few minutes, or even a few hours. The discovery of LPTs was entirely unexpected; it had long been thought that as neutron stars slow down and gradually lose their rotational energy, pair production and coherent radio emission should cease beyond the "pulsar death line".

The existence of these long-period emitters raises fundamental questions about how coherent emission is generated in magnetospheres when the available potential drop is insufficient to sustain pair cascades. To date, only about a dozen LPTs have been discovered, although the extremely intermittent nature of many of them suggests that many more such objects remain to be detected. Understanding LPTs is essential for advancing models of neutron star magnetospheres, testing the limits of particle acceleration and plasma generation, and potentially uncovering the evolutionary links between pulsars, magnetars, and other transient radio phenomena such as fast radio bursts (FRBs). In short, the study of LPTs offers a unique opportunity to explore both the physics of coherent emission and the late evolution of neutron stars. It also makes it possible to study white dwarf binary systems, since at least some LPTs appear to be interacting "polar" systems, in which a magnetic bridge forms between a white dwarf and another low-mass star.

Sujet :
To increase the size of the LPT sample and obtain a more complete picture, we will exploit the wealth of data from the future CHORD radio telescope, a next-generation instrument currently under construction in Canada and scheduled for commissioning in 2027. Thanks to recent technological advances, CHORD will have two unique capabilities: an unprecedented sky mapping speed and repeated daily sky coverage, the two key ingredients for a successful pulsar survey.

This project is fully funded by a national ANR grant. The PhD student will take part in optimizing and fine-tuning specific modules of signal processing algorithms, and in processing and modeling time series. He or she may also work on machine learning (ML) based algorithms to reduce false positives caused by human-generated interference in the observational data (as opposed to astrophysical signals). The candidate will also take part in managing data processing and evaluating search results. By the end of the PhD, we expect the student to have fully mastered radio astronomy signal processing and to have become an expert in time-domain data analysis, particularly in the field of pulsars and fast transients. The candidate will work on following up discoveries using the large Nançay Radio Telescope (NRT) and the NenuFAR telescope at the Observatoire Radioastronomique de Nançay in France.

The candidate will be hosted by the ASTRO team at LPC2E in Orléans. The team is home to the largest pulsar research group in France and is closely linked to the Observatoire Radioastronomique de Nançay, located in the Sologne forest. The candidate will also have the opportunity to travel to collaborate with other partner institutes and to present their research at international conferences. A laptop will be provided, along with access to the necessary computing resources. Candidates are invited to contact the supervisors (Cherry Ng-Guiheneuf, Gilles Theureau) to discuss further.

Profil du candidat :
The application must include a detailed CV (2 pages maximum); a copy of official Bachelor's transcripts (and Master's transcripts, where applicable), with a clear indication of the grading system and the student's rank within their cohort; two letters of recommendation sent directly to cherry.ng-guiheneuf@cnrs-orleans.fr by the referees before the deadline; and a cover letter presenting the candidate's motivation to train as a researcher in astrophysics, and more specifically at LPC2E/CNRS, along with their research experience, scientific interests, and career plans and goals.

Apply at: https://www.abg.asso.fr/fr/candidatOffres/show/id_offre/134797

Formation et compétences requises :
The candidate must hold a Master's degree in astrophysics (or a closely related field), be available full time, and have a good command of English. Prior research experience is an asset.

Adresse d’emploi :
The ASTRO team at LPC2E/CNRS in Orléans (https://www.lpc2e.cnrs.fr/en/astrophysique) is the largest pulsar research group in France and works closely with the Observatoire Radioastronomique de Nançay, located in the Sologne forest.

The CHORD collaboration is a multi-institutional team with more than 100 members. Partner institutes include the University of Toronto, McGill University, the Perimeter Institute, the University of British Columbia, MIT, INAF, and CNRS. The core array of the CHORD telescope is hosted at the Dominion Radio Astrophysical Observatory (DRAO) on the west coast of Canada, and two satellite stations will be built at the Green Bank and Hat Creek observatories in the United States.

Supervisor contact details: cherry.ng-guiheneuf@cnrs-orleans.fr , Gilles.Theureau@obspm.fr

Categories: theses

Feb

Sat

Low-Resource Bias Evaluation and Mitigation for Large Language Models

Feb 28 – Mar 1 all-day

Offre en lien avec l’Action/le Réseau : – — –/Doctorants

Laboratoire/Entreprise : SAMOVAR, Télécom SudParis, IP Paris
Durée : 3 Years
Contact : luca.benedetto@telecom-sudparis.eu
Date limite de publication : 2026-02-28

Contexte :

Sujet :
We are offering a PhD Thesis position to perform research on NLP and AI Fairness, in the ACMES Team of the SAMOVAR lab at IP Paris.

In this project, we will work on low-resource techniques for bias detection and mitigation in large language models, across a variety of domains, from education to recommendation systems.

You can find a detailed description of this position at the following link: https://partage.imt.fr/index.php/s/qzGZz4KBCMnsJso

To apply, please reach out to luca.benedetto@telecom-sudparis.eu

Interviews will be held in late January.

Profil du candidat :

Formation et compétences requises :
Candidates are expected to have experience with machine learning in general, and specifically with NLP.

Familiarity with frameworks and libraries such as langchain, vLLM, ollama is preferred.

Adresse d’emploi :
Evry or Palaiseau (France)

Document attaché : 202601071133_sujetThese_ED626_68505.pdf

Categories: theses

Mar

Fri

FLEX-E: Explainable Hybrid Federated Learning for Energy Optimization in Industrial Parks

Mar 6 – Mar 7 all-day

Offre en lien avec l’Action/le Réseau : – — –/– — –

Laboratoire/Entreprise : INSA Strasbourg / Laboratoire ICube
Durée : 36 months
Contact : franco.giustozzi@insa-strasbourg.fr
Date limite de publication : 2026-03-06

Contexte :
Industrial parks are major contributors to global energy consumption and CO2 emissions due to their high demand, heterogeneous energy users, and complex energy flows. Improving energy efficiency in these environments is therefore a key lever for achieving climate targets, reducing operational costs, and strengthening regional competitiveness, particularly in industrially dense regions such as the Upper Rhine area. Despite their importance, conventional energy management systems are typically designed as isolated solutions. They lack the capability to address large-scale challenges such as decentralized energy optimization, integration of renewable energy sources (e.g. photovoltaic systems, waste heat recovery), and coordinated load balancing across multiple stakeholders. While collaborative energy platforms offer significant potential, their real-world deployment is constrained by strict requirements regarding data security and privacy, scalability, and adaptability to changing industrial infrastructures.
The FLEX-E project addresses these challenges by introducing a collaborative energy optimization framework based on Federated Learning (FL). FL is a decentralized machine learning paradigm in which local entities (such as buildings, energy producers, or consumers) train models locally and share only abstracted model parameters rather than raw data. This approach enables cross-organizational learning while preserving data sovereignty, ensuring privacy, and supporting scalable deployment. In FLEX-E, this federated approach is combined with energy flow modeling based on digital twins and validation in real and planned industrial park testbeds of varying sizes. The project thus provides a unique foundation for advanced research into secure, data-driven, and collaborative energy management systems for industrial environments.
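
As an illustration of the federated learning principle described above, the following minimal sketch shows federated averaging on a toy linear regression problem. It is not the FLEX-E framework: the linear model, the three synthetic clients, the learning rate, and all function names are assumptions chosen to keep the example short. Each client trains on its own data and only the resulting parameter vectors are sent for averaging.

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    """Run a few gradient steps of least-squares regression on one client's private data."""
    w = weights.copy()
    for _ in range(epochs):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def federated_average(client_weights, client_sizes):
    """Average client parameter vectors, weighted by local dataset size."""
    sizes = np.asarray(client_sizes, dtype=float)
    stacked = np.stack(client_weights)
    return (sizes[:, None] * stacked).sum(axis=0) / sizes.sum()

# Three hypothetical clients (e.g. buildings) holding private data from the same model.
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
clients = []
for n in (50, 80, 120):
    X = rng.normal(size=(n, 2))
    y = X @ true_w + 0.01 * rng.normal(size=n)
    clients.append((X, y))

# Communication rounds: only parameter vectors leave the clients, never X or y.
global_w = np.zeros(2)
for _ in range(20):
    updates = [local_update(global_w, X, y) for X, y in clients]
    global_w = federated_average(updates, [len(y) for _, y in clients])
```

Weighting by dataset size is one common choice; other aggregation rules are possible when clients are heterogeneous.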

Sujet :
The increasing electrification of industry, coupled with the integration of renewable energy sources and flexible loads, has significantly increased the complexity of energy management in industrial parks. These environments are characterized by heterogeneous assets, distributed ownership, and strict requirements regarding data privacy and operational confidentiality. Traditional centralized energy management systems struggle to scale under these constraints and often fail to fully exploit collaborative optimization potentials.
This PhD project aims to advance the state of the art by developing an explainable and hybrid federated learning framework for energy optimization in industrial parks, building upon the FLEX-E project. The proposed approach combines data-driven federated learning with expert knowledge, including physics-based energy models, digital twins and knowledge graphs, to improve robustness, generalization, and trustworthiness of AI-based energy management systems.
[Full description in the attached file.]

Profil du candidat :
We are looking for a highly motivated PhD candidate holding a Master's or engineering degree (Bac+5 level), with a strong background in computer science, data science, energy systems, or a closely related field.

Formation et compétences requises :
Experience with Python and common ML frameworks (e.g. PyTorch, TensorFlow) is expected. A background or demonstrated interest in energy systems, smart grids, or industrial energy management is highly desirable. Familiarity with physical modeling, optimization, or digital twins is an advantage. Interest in explainable AI, hybrid modeling, or knowledge graphs is a plus.

Adresse d’emploi :
INSA Strasbourg.
24 Bd de la Victoire, 67000 Strasbourg.

Document attaché : 202602171120_Thesis_proposal_FLEX_E.pdf

Categories: theses

Mar

PhD – Generative models for dense detection of rare events in multimodal remote sensing imagery

Mar 23 – Mar 24 all-day

Offre en lien avec l’Action/le Réseau : – — –/– — –

Laboratoire/Entreprise : IGN/CNAM/ONERA
Durée : 36 months
Contact : nicolas.audebert@ign.fr
Date limite de publication : 2026-03-23

Contexte :
https://recrutement.cnes.fr/fr/annonce/4195913-26-252-dense-detection-of-rare-events-in-remote-sensing-using-generative-models-75003-paris

Sujet :
The main objective of the thesis is to develop new methodologies for the dense detection of rare events in multimodal remote sensing images, including optical data, radar (SAR) data, and other multimodal sources. In particular, this work will target the Sentinel-1/2, SPOT-6/7, and Pléiades/Neo sensors in order to combine several modalities, resolutions, and revisit frequencies to localize geographic anomalies caused by exceptional events such as natural disasters.

This thesis falls within the broad topic of change detection in multimodal remote sensing. The method does not require targeting a particular type of anomaly. The detected anomalies could be floods, forest fires, avalanches, snowmelt, and so on. Many datasets already exist, such as xView², Burn Scars HLS, SEN12Flood, as well as a snowmelt detection dataset prepared at ONERA.

The proposed approach builds on the state of the art in weakly supervised learning, in particular using generative models. These models learn the distribution of images in an unsupervised way and provide likelihood scores, which can be turned into anomaly scores. However, when only part of the image is anomalous, image-level scores may fail to reflect the anomaly, especially if the anomalous area contains few pixels. This calls for anomaly detection at the pixel level rather than the image level. Moreover, the anomalies to be detected in remote sensing are not "out of distribution" in general, but are so conditionally on a location, an acquisition time, and a sensor. Finally, the thesis is set in a multimodal context, where several types of imagery can be used: aerial, SPOT-6/7, Pléiades and Pléiades Neo, Sentinel-2, and even Sentinel-1. It is therefore necessary to condition the generative methods on the type of imagery, so that they capture the common underlying information (the semantic information about objects on the Earth's surface) and detect anomalies regardless of the sensor.

1. First, we assume that a rare event is known to be present in an image and apply unsupervised or language-supervised region-of-interest detection methods to localize it. This step makes it possible to assess which region-of-interest detection method would be most relevant for the rare events we seek to detect.
2. Second, we seek to modify generative-model-based rare event detectors, building on the method selected in step 1, so that they detect events within an image. In addition, this step requires studying the spatial and temporal conditioning of generative methods in remote sensing to improve likelihood estimation.
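
The difference between image-level and pixel-level anomaly scores can be sketched with a toy example. The code below is only an illustration under strong assumptions: an independent per-pixel Gaussian fitted on synthetic "normal" images stands in for a learned generative model, and the image size, noise level, and modified pixel are arbitrary. It computes a per-pixel negative log-likelihood map and compares its mean (an image-level score) with its peak.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in "generative model": one Gaussian per pixel, fitted on normal images.
normal_images = rng.normal(loc=0.5, scale=0.05, size=(200, 32, 32))
mu = normal_images.mean(axis=0)
sigma = normal_images.std(axis=0) + 1e-6

def pixel_nll(image):
    """Per-pixel negative log-likelihood under the independent Gaussian model."""
    return 0.5 * ((image - mu) / sigma) ** 2 + np.log(sigma) + 0.5 * np.log(2 * np.pi)

# A clean test image and a copy with a single anomalous pixel out of 1024.
test_image = rng.normal(loc=0.5, scale=0.05, size=(32, 32))
anomalous = test_image.copy()
anomalous[10, 20] += 0.5

score_map = pixel_nll(anomalous)                    # pixel-level anomaly scores
image_score_clean = pixel_nll(test_image).mean()    # image-level score, clean image
image_score_anom = score_map.mean()                 # image-level score, anomalous image
peak = np.unravel_index(score_map.argmax(), score_map.shape)
```

Averaging over all pixels dilutes the contribution of the modified pixel, so the image-level score moves only slightly, while the pixel-level map has its maximum at the modified location.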

Profil du candidat :

Formation et compétences requises :

Adresse d’emploi :
IGN – Géodata Paris, 6-8 avenue Blaise Pascal, 77420 Noisy-Champs

Cnam, 2 rue Conté 75003 Paris

Categories: theses

Mar

Tue

Self-improving AI Agents for Recommendation

Mar 31 – Apr 1 all-day

Offre en lien avec l’Action/le Réseau : – — –/– — –

Laboratoire/Entreprise : Criteo AI Lab Paris
Durée : 36 months
Contact : p.gallinari@criteo.com
Date limite de publication : 2026-03-31

Contexte :
As part of its ongoing transformation into an agentic-ready platform, Criteo is spearheading the integration of agentic AI across its full portfolio. These systems are already being deployed to automate internal operations, assist clients in the management and optimization of advertising campaigns, and to power personal shopping agents—autonomous assistants that act on behalf of end-users. These agents must reason, remember, and act autonomously in environments characterized by uncertainty, variability, and scale.
To fulfill this vision, one of the most pressing challenges is adaptability. Our agents must function across an extremely heterogeneous client base (each client with its own product catalog, optimization targets, and interface constraints) while interacting with users and inferring their intents.

Sujet :
The objective of the PhD is to explore adaptation strategies to multiple and heterogeneous environments and user segments for an agentic system. In our setting these environments might correspond to different partners characterized by their own catalog, objective and strategy while user segments refer to user preferences or needs. We will restrict our scope to language-only agents and emphasize practical assistant scenarios.

In most scenarios, adaptation to new environments and to user intents should rely on simple, computationally inexpensive strategies that still work when little data is available for the new setting. Adaptation places a significant demand on the system’s memory, which must be more than a static repository of facts. It must be an adaptive memory system, capable of restructuring and reprioritizing information as the user’s context evolves. Therefore, self-adaptation is intrinsically linked to memory management. The goal is to endow the agent with the ability to learn how to manage its own memory in response to a changing environment and user. The PhD will start by investigating different memory strategies and their potential for handling adaptation to new environments and to user interaction. We will explore mechanisms for the agent to develop learned policies for memory operations. Key research questions include:

• Learned Retention and Forgetting: How can an agent learn what information is critical to retain versus what is obsolete and should be forgotten or archived?

• Adaptive Retrieval Strategies: Can an agent learn the most effective way to query its memory? We will explore how the system can dynamically choose between different retrieval methods (e.g., vector-based RAG, evolving LLM context), based on the task.

• Automated Memory Summarization: How can the system “reflect” on its interaction history to create higher-level insights?
We will investigate techniques for the agent to periodically summarize streams of memories into more abstract knowledge (e.g., consolidating multiple shopping interactions into a persistent preference like “user prefers sustainable brands”).

The adaptation mechanism should also contribute to the agent’s planning: how can an agent make decisions when the goal is weakly defined, the feedback is sparse, and the environment varies by client? This is particularly relevant in domains like travel planning or multi-product recommendations, where a “one-size-fits-all” approach is neither feasible nor desirable. To complement memory-based methods, offline reinforcement learning strategies could be considered.

Profil du candidat :
We are looking for a motivated researcher with a strong foundation in machine learning, natural language processing, and applied mathematics. Familiarity with large language models, transformers, reinforcement learning, or continual learning will be considered a strong asset. Above all, we are seeking someone who is excited by the challenge of bringing intelligent agents to life in practical, high-impact applications.

Formation et compétences requises :
Master's degree in computer science or applied mathematics, or an engineering school degree. Background and experience in machine learning.

Adresse d’emploi :
Criteo AI Lab Paris

Document attaché : 202510021236_2025-10-Criteo-PhD proposal-Agents-LLMs.pdf

Categories: theses

Apr

Sat

Robust-to-noise information extraction, unifying challenges of optical character recognition (OCR) and automatic speech recognition (ASR)