NeOWL4j: building a modern ontology editor on top of the Neo4j environment

Offer related to Action/Network: none

Laboratory/Company: Laboratoire d’Informatique et Systèmes
Duration: 3 to 6 months
Contact: alexis.guyot@lis-lab.fr
Publication deadline: 2026-06-01

Context:
Knowledge engineering aims to model, structure and exploit knowledge so that computer systems can manipulate it. At the heart of this approach, an ontology is a formal representation of a domain: it defines concepts (classes), their relations (properties), and constraints/axioms (e.g. hierarchies, cardinalities). Ontologies promote semantic interoperability between heterogeneous systems, ease data integration, support reasoning (inference, consistency checking) and frame knowledge governance through shared reference vocabularies. Tools such as Protégé are today the reference for ontology editing.
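
To make the notions of class, hierarchy and inference concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the class names and the `superclasses` helper are hypothetical, and a real prototype would rely on OWL API and a standard reasoner rather than on this toy traversal.

```python
from collections import defaultdict

# Hypothetical toy ontology: each pair reads "child is a subclass of parent".
SUBCLASS_OF = [
    ("Dog", "Mammal"),
    ("Cat", "Mammal"),
    ("Mammal", "Animal"),
]


def superclasses(cls, axioms):
    """Return every class that `cls` is directly or indirectly a subclass of."""
    parents = defaultdict(set)
    for child, parent in axioms:
        parents[child].add(parent)
    found, stack = set(), [cls]
    while stack:
        for parent in parents[stack.pop()]:
            if parent not in found:  # also protects against cyclic hierarchies
                found.add(parent)
                stack.append(parent)
    return found


# "Dog is an Animal" is never stated; it is inferred through Mammal.
print(sorted(superclasses("Dog", SUBCLASS_OF)))  # ['Animal', 'Mammal']
```

Storing the axioms as plain (child, parent) pairs mirrors how subclass links could be kept as edges in a graph database, which is one reason a graph DBMS is a natural host for this kind of data.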

The internship consists in prototyping an alternative to Protégé by developing an editing and reasoning layer on top of the graph-oriented DBMS Neo4j, combining modern ergonomics, rich graph exploitation and OWL compatibility (OWL API, standard reasoners). The goal is to deliver a contemporary, efficient experience without reinventing existing components where they are fit for purpose.

Subject:
**Goal of the internship**
Design and prototype a modern application (preferably web, though desktop is possible) acting as a layer on top of Neo4j to create, edit, validate and reason over ontologies (OWL/SWRL), with strong attention to UX and aesthetics. The application must be interoperable with the existing ecosystem (including Protégé) while building on the strengths of Neo4j.

**Objectives and tasks**
The intern will begin by formally scoping the project: a study of Protégé (hands-on exploration of the tool, reading the documentation, interviews with experts to identify the needs and the limits of the existing tool); a quick overview of the features of other ontology editors, possibly complemented by a collection of screenshots to compare UX; identification of the technical constraints of the existing stack (Neo4j ecosystem, OWL/SWRL building blocks, validation and reasoning); and a survey of 2025 UX best practices to guide the design.

On this basis, he or she will write functional and technical specifications for the new tool, then develop a prototype incrementally: an ergonomic ontology editor connected to Neo4j, import/export ensuring interoperability, validation mechanisms, etc. Depending on the duration and the candidate’s profile, the internship may extend to editing axioms and rules, integrating a standard reasoner, and building a complete demonstrator on a reference ontology.

**Candidate technologies**
On the interface side, the preferred option is a web application in TypeScript built on React or SvelteKit, with a suitable graph-editing component (e.g. React Flow or Cytoscape.js), layout engines (elkjs/dagre) and a modern design system (Tailwind with accessible components such as Radix/shadcn). This combination makes it possible to target a current UX: themes (including dark mode), accessibility, performance (virtualization), and restrained micro-interactions.

As a desktop alternative, the web interface could be packaged with Electron or Tauri, or a native interface could be built with JavaFX (Java) or JetBrains Compose for Desktop (Kotlin), to ease direct integration with the Semantic Web libraries of the Java ecosystem.

For the backend, a Java stack with Spring Boot is preferred in order to integrate OWL API/Apache Jena naturally, to interface with a standard reasoner (HermiT, Pellet, FaCT++), and to communicate with Neo4j through the Java driver and neosemantics (n10s) for RDF/OWL exchanges. Validation may rely on SHACL. The API will be exposed simply (REST/JSON or gRPC) and will remain modular so that it can evolve (a dedicated microservice for ontology functions if needed).

Candidate profile:
– Level: Bac+3 to Bac+5 (computer science / databases / AI / software engineering / HCI).
– Possible specializations: modern front-end development (TS + React/SvelteKit), Java and API design, databases, graphs, Semantic Web (OWL/RDF, SWRL, SHACL), UX/UI.
– Expected qualities: autonomy, rigor, curiosity, a sense of ergonomics, communication.

Required education and skills:

Work address:
LIS UMR 7020 CNRS / AMU / UTLN, équipe IACD
Aix Marseille Université – Campus de Saint Jérôme – Bat. Polytech
52 Av. Escadrille Normandie Niemen
13397 Marseille Cedex 20

Attached document: 202510011418_2025_Sujet_Stage_NeOWL4J.pdf

Bayesian inference for cosmology: Inferring the initial fields of our cosmic neighborhood

Offer related to Action/Network: DatAstro

Laboratory/Company: CRIStAL / Université de Lille / CNRS / Centrale Li
Duration: 18 months
Contact: jenny.sorce@univ-lille.fr
Publication deadline: 2025-12-31

Context:
The project is part of the Chaire WILL UNIVERSITWINS (UNIVERSe dIgital TWINS) led by Jenny Sorce (funded by the Université de Lille under the initiative of excellence). The successful candidate will be jointly supervised by Jenny Sorce (CNRS Researcher in cosmology) and Pierre Antoine Thouvenin (Assoc. Prof., Centrale Lille), and hosted in the CRIStAL lab (UMR 9189), Lille, France. The work will be conducted in collaboration with Jean Prost (Assoc. Prof., ENSEEIHT) in the IRIT lab. More than 2000 GPU hours have already been secured for the project at TGCC on the Irene/Rome partition. They will be used to fine-tune, validate and deploy the surrogate model to perform Bayesian inference. Access to the University of Lille’s medium-scale computing center is also ensured.

Project website: https://sorcej.github.io/Jenny.G.Sorce/universitwins.html

Link to the job posting: https://sorcej.github.io/Jenny.G.Sorce/jobads/postdocuniversitwins.pdf

Subject:
According to the standard cosmological model, about 95% of the Universe is dark. Recent analyses of large surveys reveal tensions with this model. For instance, the local measurement of the expansion rate and the estimate of the Universe’s homogeneity differ by more than three standard deviations from those inferred from the first light of the Universe. These discrepancies are at the heart of a heated debate in cosmology: do these tensions require new physical models to be accounted for, or are they mere consequences of systematic biases in the observation processing pipeline? Part of this pipeline relies on cosmological simulations to act as the missing ground truth. However, the simulations only reproduce the statistics of the local cosmic web. A new type of simulation, termed constrained, is emerging. The initial velocity and density fields of such simulations stem from observational constraints.

Candidate profile:
PhD in signal/image processing, computer science or applied mathematics.

Required education and skills:
The project requires a strong background in data science and/or machine learning (statistics, optimization), and in signal and image processing. Very good Python coding skills are expected. A B2 English level is mandatory. Knowledge of C++ programming, as well as experience or interest in parallel/distributed code development (MPI, OpenMP, CUDA, …), will be appreciated.

Work address:
UMR CRIStAL
Université de Lille – Campus scientifique
Bâtiment ESPRIT
Avenue Henri Poincaré
59655 Villeneuve d’Ascq

Internship Subject M2 – Integrating Earth observation data and deep learning methods to monitor food systems

Offer related to Action/Network: none

Laboratory/Company: CIRAD – UMR TETIS
Duration: 6 months
Contact: roberto.interdonato@cirad.fr
Publication deadline: 2025-12-31

Context:
Food systems are highly interconnected between countries on a global scale, as shown by recent disruptions such as the war in Ukraine and the global pandemic. Food flows are vulnerable to shocks, and these disruptions influence food prices, which in turn affect food consumption patterns. This has had a significant impact on people’s diets, particularly in underdeveloped countries where food security is already fragile. However, scientists and policy-makers lack the data and tools to identify weak points in food flows and build food systems resilient to shocks and disruptions. While considerable progress has been made using Earth Observation data to map crop locations and agricultural productivity (e.g. crop yields), little attention has been paid to the intermediate stages of the workflow – distribution, processing and markets – which are key to understanding and modeling how food moves from production to consumption. Additionally, numerous geospatial datasets, such as OpenStreetMap, are publicly accessible and provide valuable information on land use and land cover.

Thanks to advances in artificial intelligence and its application to Earth Observation data, continuously collected satellite images on a global scale, combined with meteorological data, make it possible to monitor food systems in real time. Deep learning models, capable of capturing complex, non-linear relationships, and multimodal algorithms integrating data from a variety of sources, are opening up new perspectives in this field. This internship proposes to exploit multi-temporal and multi-resolution Earth observation data, by combining them with learning models, to monitor food systems, estimate agricultural yields and analyze their links with market prices.

This internship focuses on developing machine learning approaches to analyze food flows in Rwanda, in relation to the food security situation in the country, using comprehensive market data and geospatial information. Food flows often deviate from optimal distribution patterns due to infrastructure constraints, market dynamics, and socio-economic factors. For example, a given product (e.g., potatoes) grown in northern regions may follow suboptimal routes to reach southern markets. By modeling both ideal and actual food flows, we can identify bottlenecks and opportunities to improve food security.
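
As a rough illustration of comparing an actual route with an ideal one, the sketch below computes the cheapest path on a toy road graph with Dijkstra's algorithm and measures how much an observed route exceeds it. Everything here is an assumption made for the example: the place names, the travel costs, and the choice of shortest path as the notion of "optimal" are hypothetical and are not taken from the project data.

```python
import heapq

# Hypothetical road network: travel cost (e.g. hours) between places.
ROADS = {
    "North farm": {"Hub A": 2.0, "Hub B": 5.0},
    "Hub A": {"South market": 3.0},
    "Hub B": {"South market": 4.0},
    "South market": {},
}


def shortest_cost(graph, source, target):
    """Dijkstra's algorithm: cheapest total cost from source to target."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == target:
            return cost
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry, a cheaper path was already found
        for neighbor, weight in graph[node].items():
            new_cost = cost + weight
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                heapq.heappush(queue, (new_cost, neighbor))
    return float("inf")


def route_cost(graph, route):
    """Total cost of a route given as an ordered list of places."""
    return sum(graph[a][b] for a, b in zip(route, route[1:]))


observed = ["North farm", "Hub B", "South market"]  # hypothetical actual flow
excess = route_cost(ROADS, observed) - shortest_cost(ROADS, "North farm", "South market")
print(excess)  # 4.0 extra cost units compared with the cheapest route
```

A large excess on many observed routes through the same hub would point to that hub as a candidate bottleneck.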

Subject:
Missions :

The project aims to understand the relationship between food production locations, distribution networks, and market accessibility to inform food security policies. More specifically, the final task is to build a machine learning model able to predict the probability that a certain item is sold in a specific market, based on production and distribution data.

The project leverages two primary datasets:

· Public Market Dataset: 1.2 million items across 70 markets covering 10 types of food items.

· CGIAR/IITA Survey Database: A dataset collected by the IITA (International Institute of Tropical Agriculture) including monthly data from 7,000 vendors across 67 markets in all districts of Rwanda, including food quality assessments and detailed market information.

These datasets will be complemented by geospatial data including OpenStreetMap (OSM) infrastructure data, land cover information, and Earth observation data (NDVI and other spectral indices).

The main tasks to address during the internship will be:

1. Database Integration and Market Mapping

a. Merge the public market dataset with CGIAR/IITA survey data to create a comprehensive market database

b. Map which specific food items are sold in which markets

2. Geospatial Data Integration

a. Incorporate OpenStreetMap data to understand transportation networks and market accessibility

b. Integrate land cover and agricultural production data to identify food production zones

c. Process Earth observation data (NDVI, meteorological data) to assess agricultural productivity

d. Map the complete food system from production areas to consumption markets

3. Machine Learning Model Development

a. Develop predictive models to estimate the probability that specific food items will be available in particular markets

b. Compare actual food flows with modeled optimal flows to identify inefficiencies

c. Test developed models against baseline methodologies and state-of-the-art approaches

4. Writing of the internship report (in English) to capitalize on the work carried out, with a view to a possible scientific publication. If possible, also release the associated code and data.
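
The prediction step in task 3 can be sketched with a plain logistic regression, which outputs a probability between 0 and 1. This is only a baseline illustration, not the model the project will use: the two features (a local production score and a road accessibility score), the six training rows and the hyperparameters are all invented for the example.

```python
import math


def predict_proba(weights, features):
    """Probability that the item is sold, for one (item, market) feature vector."""
    z = sum(w * x for w, x in zip(weights, features)) + weights[-1]  # last weight is the bias
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, labels, lr=0.5, epochs=2000):
    """Fit the weights by batch gradient descent on the log-loss."""
    n_features = len(rows[0])
    weights = [0.0] * (n_features + 1)
    for _ in range(epochs):
        grad = [0.0] * (n_features + 1)
        for x, y in zip(rows, labels):
            error = predict_proba(weights, x) - y
            for j, xj in enumerate(x):
                grad[j] += error * xj
            grad[-1] += error
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad)]
    return weights


# Hypothetical data: (production score, accessibility score) -> item sold (1) or not (0).
ROWS = [(0.9, 0.8), (0.8, 0.9), (0.7, 0.6), (0.2, 0.3), (0.1, 0.2), (0.3, 0.1)]
LABELS = [1, 1, 1, 0, 0, 0]

weights = fit_logistic(ROWS, LABELS)
print(predict_proba(weights, (0.85, 0.85)))  # high probability expected
print(predict_proba(weights, (0.15, 0.15)))  # low probability expected
```

A simple model of this kind can also serve as one of the baselines against which more elaborate approaches are tested.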

Candidate profile:
Skills required :

– Programming skills

– Interest in data analysis

– Scientific rigor

– Curiosity and open-mindedness

– Analytical, writing and summarizing skills

How to apply :

Send CV, cover letter and M1 (or 4th year) transcript to :

simon.madec@cirad.fr , roberto.interdonato@cirad.fr

specifying as e-mail subject “CANDIDATURE STAGE DIGITAG”.

Additional Information :

– Duration of 6 months, starting February 2025

– Remuneration: CIRAD salary scale, ~600 euros/month

– The internship will take place at CIRAD, in the UMR TETIS (Territory, Environment, Remote Sensing and Spatial Information), located at the Maison de la Télédétection in Montpellier.

– The internship will be carried out in collaboration with Assistant Professor Claudia Paris and Yue Dou, currently working at the ITC Faculty of Geographic Information Science and Earth Observation, University of Twente, Netherlands.

Required education and skills:

Work address:
500 rue Jean François Breton, 34090, Montpellier

MusiScale action study day

Announcement related to Action/Network: Musiscale

Theme:

From sound to music: how musical objects are formed

Presentation:

Musical data represent a considerable mass of information that is nonetheless poorly exploited, for lack of a generic paradigm able to account for their similarity relations at various scales of representation.
By similarity we mean simple relations that make it possible to explain and formulate correspondences between elements, temporal sequences, sections, albums, works, or even musical corpora, taking their specificities into account while relying on a general paradigm. Musical elements are indeed organized at different scales, within a corpus, over time, etc.
Without being limited to this question, the study day will focus in particular on the formation of sound objects from audio data: how they are perceived, how to detect them, and how to integrate them into a multi-scale musical structure.

From: 2025-10-03

To: 2025-10-03

Venue: Maison de la recherche, 28 rue Serpente, 75006 Paris

Website: https://www.madics.fr/actions/musiscale/

Diffuse fluorescence optical tomography for hyperspectral image reconstruction

Offer related to Action/Network: none

Laboratory/Company: Institut Fresnel
Duration: 4 to 6 months
Contact: andre@fresnel.fr
Publication deadline: 2026-02-02

Context:
Imaging technologies able to detect early biological processes in vivo, non-invasively, for longitudinal studies and with high resolution remain a challenge for biomedical research. The concept of our imaging system relies on a new form of multicolor diffuse fluorescence optical imaging for three-dimensional (3D) in vivo small-animal imaging in the NIR-II window (1000-2000 nm). Diffuse fluorescence optical tomography consists in injecting the subject (here, a mouse) with chemical substances that bind to different organs. These substances, called fluorophores, are then excited by a light source and re-emit light as they relax, at lower energy (longer wavelength). The objective is to reconstruct images from this fluorescence signal. Both the fluorescence signal and the excitation source can be attenuated by absorption and scattering in the media they pass through, which distorts the measured spectra. Image reconstruction is generally an ill-posed problem, requiring optimization algorithms that exploit a priori knowledge about the volumes to be reconstructed.
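
To illustrate why an ill-posed reconstruction needs prior knowledge, the sketch below applies Tikhonov regularization to a toy 2x2 system. This is an assumption made for the example only: the forward matrix, the noise and the penalty weight are invented, and the quadratic penalty merely stands in for whatever prior a real reconstruction algorithm would use.

```python
def solve_2x2(m, b):
    """Solve the 2x2 linear system m @ x = b with Cramer's rule."""
    (a, c), (d, e) = m
    det = a * e - c * d
    return [(b[0] * e - c * b[1]) / det, (a * b[1] - d * b[0]) / det]


def tikhonov_2x2(forward, measured, lam):
    """Minimize ||forward @ x - measured||^2 + lam * ||x||^2 via the normal equations."""
    normal = [
        [sum(row[i] * row[j] for row in forward) + (lam if i == j else 0.0)
         for j in range(2)]
        for i in range(2)
    ]
    rhs = [sum(row[i] * y for row, y in zip(forward, measured)) for i in range(2)]
    return solve_2x2(normal, rhs)


# Toy, nearly singular forward model and a slightly noisy measurement (invented values).
FORWARD = [[1.0, 1.0], [1.0, 1.001]]
TRUE_X = [1.0, 1.0]
MEASURED = [2.0, 2.011]  # noise-free data would be [2.0, 2.001]

naive = solve_2x2(FORWARD, MEASURED)                     # direct inversion amplifies the noise
regularized = tikhonov_2x2(FORWARD, MEASURED, lam=0.01)  # penalized solution stays near TRUE_X


def max_error(estimate):
    """Largest absolute deviation from the true solution."""
    return max(abs(e - t) for e, t in zip(estimate, TRUE_X))


print(max_error(naive), max_error(regularized))
```

A measurement error of 0.01 moves the naive solution far from the true values, while the penalized solution remains close to them; the penalty term is where a priori knowledge enters the optimization.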

Subject:
The goal of the internship is to develop reconstruction algorithms specific to hyperspectral images, i.e. when the subject is excited at several wavelengths and the fluorescence signal is sampled at several wavelengths. The a priori knowledge about the volumes to be reconstructed will be estimated using deep learning algorithms.

Candidate profile:
The recruited candidate must be in the final year of an engineering school or in a Master 2 program in applied mathematics, signal/image processing, or an equivalent program. He or she must be particularly comfortable with programming (Python/Matlab) and have a genuine interest in the interactions between computer science and physics.

Required education and skills:

Work address:
52 Av. Escadrille Normandie Niemen, 13013 Marseille

Attached document: 202509290900_stage tomo hyper spectral.pdf

Postdoc offer at Telecom Paris, Institut Polytechnique de Paris

Offer related to Action/Network: none

Laboratory/Company: DIG team, Télécom Paris
Duration: 12 months
Contact: nils.holzenberger@telecom-paris.fr
Publication deadline: 2026-01-01

Context:

Subject:
Hello,

We are hiring 2 PhD students and 1 postdoc to work on combining language models with structured data, at Telecom Paris, Institut Polytechnique de Paris. Start date can be between January and March 2026.

Large Language Models are amazing, and with our research project, we aim to make them even more amazing! Our project will connect large language models to structured knowledge such as knowledge bases or databases. With this,

1. language models will stop hallucinating

2. language models’ knowledge can be audited and updated reliably, to spot biases and make them more interpretable

3. language models will become smaller and thus more eco-friendly and deployable

We work in the DIG team at Telecom Paris, one of the finest engineering schools in France, and part of Institut Polytechnique de Paris — ranked 38th in the world by the QS ranking. The institute is 45 min away from Paris by public transport, and located in the green of the Plateau de Saclay.

Check out our Web site to apply: https://suchanek.name/work/research/kb-lm/index.html

Fabian Suchanek & Nils Holzenberger

Candidate profile:

Required education and skills:

Work address:
19 place Marguerite Perey
91120 Palaiseau
FRANCE

PhD and postdoc offers at Telecom Paris, Institut Polytechnique de Paris

Offer related to Action/Network: none

Laboratory/Company: DIG team, Télécom Paris
Duration: 39 months
Contact: nils.holzenberger@telecom-paris.fr
Publication deadline: 2026-01-01

Context:

Subject:
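Hello,

We are hiring 2 PhD students and 1 postdoc to work on combining language models with structured data, at Telecom Paris, Institut Polytechnique de Paris. Start date can be between January and March 2026.

Our project will connect large language models to structured knowledge such as knowledge bases or databases. With this,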

1. language models will stop hallucinating

2. language models’ knowledge can be audited and updated reliably, to spot biases and make them more interpretable

3. language models will become smaller and thus more eco-friendly and deployable

Check out our Web site to apply: https://suchanek.name/work/research/kb-lm/index.html

Fabian Suchanek & Nils Holzenberger

Candidate profile:

Required education and skills:

Work address:
19 place Marguerite Perey
91120 Palaiseau
FRANCE

ComDir meeting

Call for projects – FR Agorantic

Date: 2025-09-24 => 2025-11-10

As a lever of its scientific policy, FR Agorantic is launching, from its CNRS funding, a Passerelle call for projects to support projects in their initial phase. These projects must address a research question and/or explore an original methodology in an interdisciplinary setting between computer science and the humanities and social sciences (SHS).

Eligibility

The principal investigator must be a member of the GDR MaDICS or MAGIS, and/or be affiliated with a laboratory that is a member of Agorantic.
The project must involve at least 2 laboratories covering different disciplinary fields, one in computer science and the other in SHS.

Project duration: 1 year

Maximum amount awarded: 8,000 €

Our website: www.madics.fr
Follow us on Twitter: @GDR_MADICS

Combining LLMs and GNNs for the fusion of multimodal representations: application to information extraction from semi-structured data

Offer related to Action/Network: none

Laboratory/Company: LISTIC – Université Savoie Mont-Blanc
Duration: 3 years
Contact: jean-yves.ramel@univ-smb.fr
Publication deadline: 2025-10-31

Context:
The thesis will be supervised by David Télisson and JY Ramel, within the ReGards team of LISTIC, as part of the MIAI FONDUE chair (https://miai-cluster.univ-grenoble-alpes.fr/).

Subject:
Combining LLMs and GNNs for the fusion of multimodal representations: application to information extraction from semi-structured data related to human or professional activities.

Details: https://www.univ-smb.fr/listic/wp-content/uploads/sites/66/2025/09/these2025fondue.pdf

Candidate profile:
Master’s degree in computer science, data science or artificial intelligence.

Required education and skills:
– Solid skills in machine learning, NLP or image processing.
– Interest in multimodal approaches and hybrid architectures.
– Proficiency in Python, PyTorch/TensorFlow, and LLM libraries.

Work address:
Université Savoie-Mont-Blanc
LISTIC – Bat 2D
Campus Savoie-Technolac
73376 Le BOURGET du LAC cedex

Attached document: 202509251147_these2025fondue.pdf

MaDICS

Masses de Données, Informations et Connaissances en Sciences

Big Data - Data Science
