PhD offer on deep learning of binary partition trees for image analysis

Offer related to Action/Network: –/Doctorants

Laboratory/Company: Université de Caen Normandie / laboratoire GREYC U
Duration: 3 years
Contact: olivier.lezoray@unicaen.fr
Publication deadline: 2023-03-31

Context:
PhD offer in Computer Science at Université de Caen Normandie / GREYC laboratory, UMR CNRS 6072

Title: Deep learning of binary partition trees for image analysis

Keywords: Hierarchical representations, Binary partition trees, Deep learning, Ultrametrics.

Subject:
Digital images can be represented in many ways, each suited to different contexts. This thesis focuses on hierarchical representations of images. Starting from an over-segmentation of an image into superpixels, these representations merge regions at different scales. They therefore capture image features at several scales simultaneously, and are easy for a human to interpret and manipulate. Building good-quality hierarchical representations is thus a very important step in image analysis. In image analysis, binary partition trees (BPTs) are a popular hierarchical representation. Their construction relies on several key elements: an initial partition, a region model, a merging criterion and a merging order. In practice, BPT construction often relies on region descriptors that are poorly adapted to the data and on greedy heuristic methods for hierarchical clustering. We propose to leverage deep learning for the construction and manipulation of BPTs. The tree construction could then exploit deep superpixel descriptors, learn the similarity between these descriptors, and finally use a learned merging criterion. Since an ultrametric is a dual representation of a hierarchical representation, deep learning methods can be considered to learn not the BPT itself but the ultrametric directly, from a graph representing the over-segmentation and by explicitly minimizing a cost function. The semantic segmentation of an image can then be seen either as a learned labeling of the nodes of the BPT, or as learning a cut in the BPT.
Since a tree is a graph, graph convolutional neural networks can be considered for this purpose (convolution and pooling being very specific here, given the tree structure of the graph). Finally, applications in health (skin melanoma) and in satellite imaging will be carried out.
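
The four construction elements named above (initial partition, region model, merging criterion, merging order) can be illustrated with a minimal sketch of the classical greedy construction that the thesis aims to improve on. Everything specific here is an assumption made for illustration: the function name build_bpt, the region model (mean value and size), the merging criterion (absolute difference of means) and the toy data are not taken from the offer.

```python
import heapq


def build_bpt(means, sizes, edges):
    """Greedily merge adjacent regions; return (child_a, child_b, parent, cost) tuples."""
    n = len(means)
    mean = dict(enumerate(means))   # region model: mean value ...
    size = dict(enumerate(sizes))   # ... and pixel count (illustrative choice)
    neighbours = {i: set() for i in range(n)}
    for a, b in edges:              # initial partition given as an adjacency graph
        neighbours[a].add(b)
        neighbours[b].add(a)

    # Merging criterion (an assumption): absolute difference of region means.
    heap = [(abs(mean[a] - mean[b]), a, b) for a, b in edges]
    heapq.heapify(heap)

    alive = set(range(n))
    merges = []
    next_id = n
    while heap:
        cost, a, b = heapq.heappop(heap)    # merging order: cheapest pair first
        if a not in alive or b not in alive:
            continue                        # stale entry: a region was already merged
        parent = next_id
        next_id += 1
        alive -= {a, b}
        alive.add(parent)
        size[parent] = size[a] + size[b]
        mean[parent] = (mean[a] * size[a] + mean[b] * size[b]) / size[parent]
        neighbours[parent] = (neighbours[a] | neighbours[b]) - {a, b}
        for q in neighbours[parent]:
            neighbours[q] -= {a, b}
            neighbours[q].add(parent)
            heapq.heappush(heap, (abs(mean[parent] - mean[q]), parent, q))
        merges.append((a, b, parent, cost))
    return merges


# Four superpixels in a row (made-up means), each adjacent to the next.
merges = build_bpt(means=[10, 12, 50, 52], sizes=[1, 1, 1, 1],
                   edges=[(0, 1), (1, 2), (2, 3)])
for a, b, parent, cost in merges:
    print(f"merge {a} + {b} -> {parent} (cost {cost})")
```

In this sketch the descriptors and the criterion are hand-crafted; the proposal above is precisely to replace them with learned descriptors and a learned merging criterion.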

Candidate profile:
Candidates must hold a master’s degree or an engineering degree in a field related to computer science or applied mathematics, and have strong programming skills (in particular with deep learning frameworks). Experience in image processing will be an asset. Candidates must be able to write scientific reports and to present their research results at conferences in English.

Required education and skills:

Job location:
Caen

Attached document: 202302151546_sujetTheseLezoray2023_fr.pdf

Abductive Reasoning with Minimal Sensing in a Home Environment

Offer related to Action/Network: –/–

Laboratory/Company: Mines Saint-Etienne
Duration: 3 years
Contact: victor.charpenay@fau.de
Publication deadline: 2023-03-31

Context:
The thesis is funded equally by ANR (Agence Nationale de la Recherche) and elm.leblanc, one of the leading home automation system vendors. One of the main technical challenges in modern home automation is to use Artificial Intelligence (AI) to minimize the energy consumption of technical systems without loss of comfort. For instance, the production of hot water can be optimized by dynamically adapting the temperature of water and the time of use of the boiler based on activities monitored in the home. The general objective of the thesis is to monitor human activities without ubiquitous sensing capabilities.

Subject:
The domain of research of the thesis is knowledge representation and reasoning, a subfield of AI. Its objective is to evaluate abductive reasoning methods over sensor measurements performed in a home environment. Abductive reasoning in this context consists in finding logically sound hypotheses (e.g. ‘the dishwasher is on’) that explain observed sensor measurements (‘electric consumption has risen in the last two hours’) according to a model of human activity in a home.

The baseline assumption of the thesis is that only minimal sensing is available in the home, as is the case in most homes today: smart meters provide aggregated values (every hour/day) but no information is available per room. Abductive reasoning is expected to help optimize home automation systems without relying on some ubiquitous sensing apparatus (which raises environmental, technical and privacy-preservation questions).

Several abduction mechanisms will be evaluated, including Abductive Logic Programming (for an exhaustive exploration of hypothesis space) and neural-symbolic integration methods (for a probabilistic exploration of hypothesis space).
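
To make the idea of an exhaustive exploration of the hypothesis space concrete, here is a toy sketch. It is not Abductive Logic Programming and not the method of the thesis: it simply enumerates every subset of appliances and keeps those whose predicted consumption matches an aggregated meter reading. The appliance names, consumption figures, baseline and tolerance are all made-up assumptions.

```python
from itertools import combinations

# Hypothetical hourly consumption per appliance, in kWh (made-up values).
APPLIANCE_KWH = {"dishwasher": 1.2, "oven": 2.0, "boiler": 3.0}


def explain(observed_kwh, baseline_kwh=0.3, tolerance=0.15):
    """Return every set of 'appliance is on' hypotheses consistent with the reading."""
    names = sorted(APPLIANCE_KWH)
    explanations = []
    for k in range(len(names) + 1):
        for subset in combinations(names, k):
            predicted = baseline_kwh + sum(APPLIANCE_KWH[a] for a in subset)
            if abs(predicted - observed_kwh) <= tolerance:
                explanations.append(subset)
    return explanations


# A single aggregated reading can admit several competing explanations.
for hypothesis in explain(3.4):
    print("possible explanation:", hypothesis)
```

With these made-up numbers, a reading of 3.4 kWh is explained either by the boiler alone or by the dishwasher and the oven together. Discriminating between such candidates is where a model of human activity in the home comes in.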

Candidate profile:
Prior knowledge in AI is expected, either in neural networks or in computational logics, logic programming and/or Semantic Web technologies. Basic understanding of statistical inference methods and linear programming is also considered important. Technical skills required for the thesis include: multi-paradigm programming (Java, Lisp, R, Prolog, …), data modeling (UML, OWL, E/R, BPMN, …), Linux system administration (Bash, SSH, Docker, …).

Autonomy and curiosity are important soft skills to complete a PhD thesis.

Required education and skills:
Holder of a Master’s degree in computer science or data science. Prior knowledge in AI is expected, either in neural networks or in computational logics, logic programming and/or Semantic Web technologies. Basic understanding of statistical inference methods and linear programming is also considered important. Technical skills required for the thesis include: multi-paradigm programming (Java, Lisp, R, Prolog, …), data modeling (UML, OWL, E/R, BPMN, …), Linux system administration (Bash, SSH, Docker, …).

Job location:
Espace Fauriel, Saint-Etienne

Attached document: 202302151304_phd-offer.pdf

Representation Learning for Geographic Spatio-Temporal Generalisation

Offer related to Action/Network: –/–

Laboratory/Company: ICube – Université de Strasbourg
Duration: 3 years
Contact: lampert@unistra.fr
Publication deadline: 2023-03-31

Context:
The SDC team of the ICube laboratory (Université de Strasbourg), in collaboration with CNES, is offering a doctoral contract on representation learning for the analysis of remote sensing image time series.

https://recrutement.cnes.fr/fr/annonce/2035525-23-111-representation-learning-for-geographic-spatiotemporal-generalisation-67400-illkirch-graffenstaden

The application deadline is 16 March, and applications must be submitted via the CNES website (link given above).

If you are interested, please contact us as soon as possible by sending an email (attach your CV, cover letter and academic transcripts, with your rankings in L3, M1 and possibly M2 if available) to lampert@unistra.fr and gancarski@unistra.fr

Subject:
Thesis title: Representation Learning for Geographic Spatio-Temporal Generalisation

Subject description: Time series are becoming prevalent in many fields, particularly when monitoring environmental changes of the Earth’s surface in the long term (climate change, urbanisation, etc.), medium term (annual crop cycle, etc.) or short term (earthquakes, floods, etc.). With current and future satellite constellations, satellite image time series (SITS) expand remote sensing’s impact. The project’s goal is to develop domain invariant representations (DIRs) using deep learning for SITS analysis. Such methods will enable geographic generalisation, which consists of reusing information from the analysis of one geographic area to analyse others, whether or not the same sensors are used, as proposed in [5]. Current approaches work for single images because they generally originate from the computer vision community. The internship will begin by evaluating the state of the art, and will then implement and extend approaches already developed at ICube [5,6]. Current work on domain adaptation (DA) for time series uses either weak supervision [1] or attention-based mechanisms [2,3] for classification, or focuses on the related problem of time-series forecasting [4]. However, none of these approaches tackles the problem of learning DIRs that can be applied to several geographical locations simultaneously. The work has two benefits: on the one hand, to reduce the burden of ground truth collection when sensors of different characteristics are used; and on the other, to exploit the information contained in each data modality to learn representations that are more robust and general, i.e. to detect crops, land cover evolution, etc. in different countries that exhibit different characteristics. Your contributions will be part of the global work of the SDC researchers and will be validated through the partnership with CNES and potential collaboration with Tour du Valat.
SDC’s aim is to propose and implement new generic methods and tools to exploit large sets of reference data from one domain/modality (sufficient to train an accurate detector) to train a multi-modal/domain detector that can be applied to imagery taken from another sensor for which there exists no reference data. As such, the work tackles key problems in many machine learning & computer vision applications.

Candidate profile:
Master’s degree in Computer Science or equivalent.

Required education and skills:
Strong skills in machine learning and image analysis. Experience in deep learning is a definite plus.

Job location:
ICube
Université de Strasbourg

Deep learning and data science engineer for optical data processing

Offer related to Action/Network: –/–

Laboratory/Company: ONERA – Centre de Palaiseau – Département Optique
Duration: permanent contract (CDI)
Contact: sidonie.lefebvre@onera.fr
Publication deadline: 2023-03-31

Context:
About ONERA

ONERA, a central player in aeronautics and space research, employs more than 2000 people. Under the supervision of the French Ministry of the Armed Forces, it has a budget of 266 million euros (2022), more than half of which comes from study, research and testing contracts. As the State’s expert, ONERA prepares the defence of tomorrow, addresses the aeronautics and space challenges of the future, and contributes to the competitiveness of the aerospace industry. It masters all the disciplines and technologies of the field. All the major civil and military aerospace programmes in France and Europe carry part of ONERA’s DNA: Ariane, Airbus, Falcon, Rafale, missiles, helicopters, engines, radars… Internationally recognised and often award-winning, its researchers train many PhD students.

About the department

The Optics and Associated Techniques Department (Département Optique et Techniques Associées, DOTA) is tasked with carrying out studies and research related to the use of the optical domain (electromagnetic waves ranging from the mid-ultraviolet (200 nm) to the THz domain (1 THz ~ 300 μm)). These studies are carried out primarily for the Aeronautics, Space and Defence sector, but also for other fields such as security, the environment, astronomy and medical imaging.

DOTA masters the entire optical chain, from the source to the processing of the signals produced by optical systems, with a view to building products.

Subject:
Responsibilities

The Physical Modelling of the Optronic Scene unit (Modélisation Physique de la Scène Optronique, MPSO) develops and operates reference tools to model and characterise the optronic environment, for the sizing and performance evaluation of sensors that are ground-based or carried on aircraft or satellites. The unit comprises about fifteen people, including research engineers and PhD students. It takes an active part in the Applied Mathematics Laboratory (LMA2S, https://w3.onera.fr/lma2s/) and in ONERA’s Artificial Intelligence Laboratory.

Reporting to the head of unit and in collaboration with researchers from various DOTA units, you will develop and apply deep learning methodologies for various applications, including:

inversion of instrumental (lidar, spectrometers…) or simulated data to characterise the environment and targets,
detection of cloud cover for satellite missions, selection of receiving sites for optical telecommunications, fusion and clustering of meteorological data,
estimation, unmixing and classification of atmospheric or terrain physical parameters (vegetation, minerals, plastics…),
data simulation using Artificial Intelligence, …

For all these applications, an important point is quantifying the uncertainties associated with the use of deep learning methods, in connection with the LARTISSTE Scientific Interest Group (https://uq-at-paris-saclay.github.io/).

You will contribute to the development of the department’s activities by putting forward proposals when scientific projects are drawn up (European Union, EDA…), by initiating collaborations with university teams specialising in deep learning and data science and with ONERA’s industrial partners, and by taking part in the activities of the applied mathematics and artificial intelligence laboratories.

You will also supervise interns, PhD students or postdoctoral researchers, and publish your work in peer-reviewed journals and conferences.

Your duties are conditional on obtaining a national defence security clearance.

Candidate profile:
PhD or engineering graduate with solid skills in artificial intelligence and machine learning

Strong interest in applied research and in the development of new algorithms

Required education and skills:
Proven command of the Python language

Successful experience in processing large volumes of data

Good writing and organisational skills required

Good level of English essential

Job location:
ONERA – Centre de Palaiseau
6 Chemin de la Vauve aux Granges
91120 PALAISEAU

To apply, use the direct link: https://emea3.recruitmentplatform.com/apply-app/pages/application-form?jobId=Q6EFK026203F3VBQB68LOF6FJ-3083&langCode=fr_FR

E-health & Ethics: Research Day

Date: 2023-04-13
Location: Pôle Universitaire Léonard de Vinci
Paris La Défense

An increasing trend towards the implementation of digital technologies within the health care systems has been witnessed during the last 20 years. In its global strategy on digital health for the years 2020–2025, the World Health Organization (WHO) devoted a specific interest to the role played by digital devices in allowing a larger and more equitable access to health services to all categories of populations, without any distinction regarding their economic, geopolitical, social or demographic specificities.

The term e-health encompasses in its broader sense a large array of health care domains supported or enabled by technology. Building on Marent & Henwood (2021), we can consider a typology including: (1) telemedicine: synchronous or asynchronous care at a distance, possibly enabled by sociotechnical platforms; (2) health information: storage, processing, search and exchange, through information and data management systems; (3) mHealth: use of mobile and connected devices for health-related reasons; and (4) algorithmic health: incorporating advances in data science and artificial intelligence (AI) in health care for experimental, predictive, curative, or diagnostic purposes.

Despite the benefits and considerable advancements made possible by implementing digital devices in health and health care, crucial ethical questions have been raised. Bioethics and deontological perspectives cross paths with all the issues related to Information Technology uses and their implications on people’s life (privacy, digital divide, reluctance towards AI). Two value systems confront each other and concurrently foster a wide range of issues embracing different perspectives (philosophical, moral, normative, technical, managerial or legal).

The aim of this research day is to curate and compare the views of social, human and management sciences and engineering sciences in order to shed light on these issues.

Thematic axes

The expected communications could address the following themes (non-exhaustive list):

Ethical issues related to new e-health business models: health platforms, Health Tech, privacy models
M-health and connected health: usage, adoption, resistance and effects on well-being, patient empowerment, self-tracking and self-care metrics efficiency, security hardware devices.
E-health and privacy: advances and limitations of the jurisdictional arsenal, privacy management systems, privacy-by-design
E-health and health systems restructuration (public and private sectors)
Artificial intelligence and e-health, cognitive science, pattern mining, AI/DL models & health/law, ethical & responsible AI.
Healthcare metaverse: challenges and ethical boundaries
Sociomateriality of digital health

Submission guidelines

Authors are invited to submit their extended abstracts (800 – 1000 words, up to 10 references) electronically via easychair: https://easychair.org/conferences/?conf=ehealthethics23

Language

English or French

Schedule

Paper submission: 28 February 2023

Notification of acceptance: 15 March 2023

Research day: 13 April 2023

In Pôle Universitaire Léonard de Vinci, Arche Campus, Paris La Défense.

https://conferences.dvrc.fr/eHealth-ethics23/

Publication:

The proceedings will be published in a white paper. Papers selected by the scientific committee will also be suggested for publication in The Conversation https://theconversation.com/fr

Programme Committee

Tristan ALLARD, Université de Rennes, CNRS, IRISA
Mirian ASFELD FERRARI – LIFO Lab, University of Orléans
Pascale BUENO MERINO, EMLV, Paris la Défense
Raffaele FILIERI, AUDENCIA Business School
Samuel FOSSO WAMBA, Toulouse Business School
Thomas GUYET, INRIA
Antoine HARFOUCHE, Université de Nanterre
Jean-Etienne JOULLIE, EMLV, Paris la Défense
Benjamin NGUYEN, INSA
Suprateek SARKER, McIntire School of Commerce, University of Virginia, USA
Francesco SCHIAVONE, University Parthenope, Naples, Italy
Nour UL AIN, EMLV, Paris la Défense

Organizing Committee

Michèle KANHOUNOU, ESILV Paris la Défense
Hajer KEFI, EMLV Paris la Défense
Insaf KHELLADI, EMLV Paris la Défense
Clara MANTEY, ILV Paris la Défense
Nicolas TRAVERS, ESILV Paris la Défense

For any questions regarding the research day, please contact us at the following email address: ehealth.ethics@devinci.onmicrosoft.com

Our website: www.madics.fr
Follow us on Twitter: @GDR_MADICS

Modeling temporal, rhythmic and social synchronization with spike neural networks

Offer related to Action/Network: –/–

Laboratory/Company: Euromov DHM
Duration: 3 years
Contact: patrice.guyot@mines-ales.fr
Publication deadline: 2023-09-02

Context:
A 3-year fully funded PhD scholarship is offered by the PhD school (ED I2S) in Alès / Montpellier within the ANR MODPULS project.
The successful applicant will become part of a dynamic research environment within the new multidisciplinary joint research center EuroMov Digital Health in Motion.

See this offer on the EuroMov DHM website:
https://dhm.euromov.eu/wp-content/uploads/2021/06/Ph.D_MovementMusicSync.pdf

Start date: October 1st, 2023 (to September 2027).
Net remuneration around 1630€ monthly (including social security and health benefits).

A 6-month internship is also possible on the same project (March to August 2023). See this offer on the EuroMov DHM website: https://dhm.euromov.eu/wp-content/uploads/2022/12/M2_Modpuls.pdf

Subject:
The temporality of information is crucial to our understanding of the world. Synchronization between different events guides our perception and our actions in many tasks. For example, speech understanding is improved by lip-reading in a context of synchronization between visual and sound perception.
In the field of artificial intelligence, spike neural networks offer a paradigm inspired by the functioning of the human brain, which is based on the synchronization between neuronal impulses. These neural networks are likely to be more efficient than the classical neural networks used in the field of machine learning, and less costly in terms of hardware. They also offer new possibilities for processing temporal data and analyzing synchronizations.
The MODPULS project aims at studying the possibilities and the limits of the use of spike neural networks for the analysis of temporal data related to synchronization, rhythm, and human movement. Using a set of temporal and rhythmic data of different natures and complexities, combining audio, video and human motion data, you will implement synchronization tasks with spike neural networks. The fine analysis of synchronization mechanisms opens the field to numerous applications, notably in the human sciences with musical practice, but also in the medical field through the therapeutic analysis of social synchronizations.
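
As a rough illustration of what neuronal impulses and their synchronization look like in code, here is a minimal sketch of a leaky integrate-and-fire neuron, a standard textbook spiking model, together with a naive coincidence measure between two spike trains. The offer does not say which neuron model or synchronization measure the project uses; the function names, parameters and input values below are assumptions chosen for illustration.

```python
def lif_spike_times(input_current, dt=1.0, tau=10.0, threshold=1.0, v_reset=0.0):
    """Simulate one leaky integrate-and-fire neuron and return its spike times."""
    v = 0.0
    spikes = []
    for step, i_in in enumerate(input_current):
        v += dt * (-v / tau + i_in)      # leak towards 0, integrate the input
        if v >= threshold:
            spikes.append(step * dt)     # emit a spike ...
            v = v_reset                  # ... and reset the membrane potential
    return spikes


def coincidence_rate(train_a, train_b, window=1.0):
    """Fraction of spikes in train_a with a spike of train_b within `window`."""
    if not train_a:
        return 0.0
    hits = sum(any(abs(t - u) <= window for u in train_b) for t in train_a)
    return hits / len(train_a)


# Two neurons driven by the same constant input, the second one delayed by one step.
train_a = lif_spike_times([0.3] * 13)
train_b = lif_spike_times([0.0] + [0.3] * 12)
print(train_a, train_b, coincidence_rate(train_a, train_b, window=1.0))
```

The information is carried by when the spikes occur rather than by a continuous activation value, which is why such models lend themselves to rhythm and synchronization analysis.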

Candidate profile:
Applicants should have (or anticipate having) a MSc and research background related to computer science, audio/signal processing, or computational movement science.

Required education and skills:
Knowledge in music (theoretical and practical) will be valued. French is not mandatory, but the candidate must be willing to learn French during their PhD and they must be able to communicate in English.

Job location:
Alès or Montpellier

Attached document: 202302091411_Ph.D_Modpuls_Internship.pdf

Question Answering With Open Knowledge Bases

Offer related to Action/Network: DOING/–

Laboratory/Company: SAMOVAR – Télécom SudParis
Duration: 6 months
Contact: romerojulien34@gmail.com
Publication deadline: 2023-09-02

Context:
Given a text, it is possible to extract from it knowledge in the form of subject-predicate-object triples, where all components of the triples can be found in the text. This is called Open Information Extraction (OpenIE). For example, from the sentence “The fish swims happily in the ocean”, we can extract the triple (fish, swims, in the ocean). By gathering many of these statements, we obtain an Open Knowledge Base (OpenKB), with no constraints on the subjects, the predicates, and the objects.
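
As a data-structure illustration, an OpenKB can be sketched as a plain set of string triples with a pattern query. The extraction step itself (OpenIE) is not implemented here. The class and method names are hypothetical, and the second triple is made up to give the query something to filter.

```python
class OpenKB:
    """A toy open knowledge base: unconstrained (subject, predicate, object) strings."""

    def __init__(self):
        self.triples = set()

    def add(self, subject, predicate, obj):
        self.triples.add((subject, predicate, obj))

    def query(self, subject=None, predicate=None, obj=None):
        """Return the triples matching every field that is not None, in sorted order."""
        pattern = (subject, predicate, obj)
        return sorted(
            t for t in self.triples
            if all(p is None or p == v for p, v in zip(pattern, t))
        )


kb = OpenKB()
kb.add("fish", "swims", "in the ocean")   # triple from the example sentence
kb.add("fish", "eats", "plankton")        # made-up triple for illustration
print(kb.query(subject="fish", predicate="swims"))
```

Because nothing constrains the strings, "swims" and "is swimming" would be stored as unrelated predicates. That is one reason canonicalization and embeddings matter for open KBs.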

Then, this OpenKB could be used for question answering (QA). There have been many approaches that target QA over non-open KBs. These approaches vary from crafting query templates that, once filled in, will be used to query the KB, to neural models, where the goal is to represent the question and the possible answers as latent vectors, where the correct answer should be close in the embedding space to the question [bordes2014question]. In this project, we will focus on neural models, particularly knowledge graph embeddings, i.e., continuous representations for the entities and relations that can generally capture relevant information about the graph’s structure.
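
The idea that the correct answer should lie close to the question in the embedding space can be sketched as a nearest-neighbour ranking. This is only an illustration of the principle, not the cited model. The vectors are hand-written rather than learned, and cosine similarity is an assumed choice of closeness measure.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is null)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_answers(question_vec, answer_vecs):
    """Sort candidate answers from closest to farthest from the question."""
    return sorted(answer_vecs,
                  key=lambda name: cosine(question_vec, answer_vecs[name]),
                  reverse=True)


# Hand-written toy embeddings (a trained model would produce these).
question = [1.0, 0.0, 0.2]                      # "Where does the fish swim?"
candidates = {
    "in the ocean": [0.9, 0.1, 0.3],
    "in the desert": [-0.8, 0.5, 0.1],
}
print(rank_answers(question, candidates))
```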

The current way KB embeddings are computed raises two main challenges:
* Each entity and relation must be seen enough times during training so the system can learn relevant embeddings. Training takes edge information into account, so the entity or relation must be part of a sufficiently large number of edges.
* The textual representation of the verbal and noun phrases of the relations, subjects, and objects should be considered.

For example, a recent approach, MHGRN, computes embeddings by using a modified graph neural network architecture. This architecture, however, does not take into account the textual representation of relations.
A better approach is CARE, which relies on two main ideas. First, it clusters the subjects and objects and creates an unlabelled edge between entities in the same cluster. That partially reduces the problem of entities connected to a small number of edges, by leveraging the connection with better-connected entities. Then, it computes embeddings for the relations using GloVe (word embeddings) and GRUs (recurrent neural networks). We believe that the approach in CARE could be improved by considering more modern neural architectures using message-passing algorithms and integrating the textual representation of predicates, objects, and subjects. In addition, we will investigate whether the clustering step is necessary, as it can introduce a bias for one important downstream application of KB embeddings: canonicalization, the task of finding a representative for a set of nodes or edges.

In this project, we will improve open KB embedding methods by:
* Exploring state-of-the-art neural architectures and language models.
* Integrating textual representations of the subject, predicate, and object.
* Investigating if clustering before embedding computation is necessary.
* Integrating embeddings into question-answering models.

Subject:

Candidate profile:
The intern should be enrolled in a master’s program and have a good knowledge of machine learning, deep learning, natural language processing, and graphs. A good understanding of Python and the standard libraries used in data science (scikit-learn, PyTorch, pandas, transformers) is also expected. In addition, previous experience with graph neural networks would be appreciated.

Required education and skills:

Job location:
Palaiseau

Attached document: 202302091340_internship_openie-1.pdf

Unlocking the Power of Data Dependencies in Data Pipelines

Offer related to Action/Network: –/–

Laboratory/Company: LAMSADE – PSL Research University – Université
Duration: 4 to 6 months
Contact: maude.manouvrier@lamsade.dauphine.fr
Publication deadline: 2023-02-09

Context:
Data dependencies: relationships or connections between different variables in a dataset. Understanding these dependencies is crucial and has a number of applications.

Data profiling for Machine Learning: Understanding data dependencies is critical for creating accurate and effective machine learning models. The quality of the input data has a direct impact on the accuracy of the model, and understanding data dependencies helps ensure that the data is suitable for use in machine learning.

Data mining: Data dependencies can help you identify patterns and relationships in the data that may not be immediately obvious. These patterns can be used to make predictions and classify data, making it useful in various data mining tasks such as association rule mining and clustering.

Subject:
This internship will build upon the recent research in data dependency mining in dynamic settings. As a member of a dynamic team, the student will be exploring innovative ways to compute data dependencies in situations where the data is transformed through a data preparation pipeline. The goal is to assess the impact of this preparation process on the dependencies within the data, as well as its overall quality.
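
To illustrate how a preparation step can change the dependencies that hold in the data, here is a small sketch that checks a functional dependency before and after a naive imputation. The offer does not specify which kinds of dependencies or pipeline operators will be studied. The functional dependency, the imputation strategy, the helper names and the toy rows are all assumptions.

```python
from collections import Counter


def holds(rows, lhs, rhs):
    """True if the functional dependency lhs -> rhs holds on a list of dict rows."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        value = tuple(row[c] for c in rhs)
        if seen.setdefault(key, value) != value:
            return False                 # same lhs values, different rhs values
    return True


def impute_with_mode(rows, column):
    """Preparation step: replace missing values by the column's most frequent value."""
    mode = Counter(r[column] for r in rows if r[column] is not None).most_common(1)[0][0]
    return [{**r, column: mode if r[column] is None else r[column]} for r in rows]


raw = [
    {"zip": "75016", "city": "Paris"},
    {"zip": "75016", "city": "Paris"},
    {"zip": "14000", "city": None},      # missing value
]
prepared = impute_with_mode(raw, "city")

print("city -> zip before:", holds(raw, ["city"], ["zip"]))
print("city -> zip after: ", holds(prepared, ["city"], ["zip"]))
```

On this toy table the dependency city -> zip holds on the raw rows (treating the missing value as a value of its own) and is broken by the imputation, while zip -> city still holds afterwards.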

The subject of data dependencies is a critical and fascinating aspect of machine learning and AI, providing students with the opportunity to gain practical skills and explore cutting-edge technologies that are shaping the future of the field. The demand for professionals with skills in machine learning and AI is growing rapidly, and understanding data dependencies is a valuable skill for anyone looking to build a career in this field in both academia and industry. On this point, it is worth noting that the internship is likely to lead to a PhD on a related topic.

Candidate profile:
We are seeking an excellent and highly motivated student with a background in Computer Science, good knowledge of database theory, and good programming skills (Python or Java).

Please send the following material in a single PDF document before February 20th, 2023:
– fully detailed CV,
– academic records (master’s degree or equivalent),
– recommendation(s) and supporting letter(s).

Required education and skills:
Background in Computer Science
Good knowledge of database theory and good programming skills (Python or Java).

Work address:
LAMSADE – PSL Research University – Université Paris-Dauphine, Paris, France

Attached document: 202302091126_IntershipLamsadeDataDependencieInPiplines.pdf

PhD (CIFRE contract) at IRISA/Atermes on Object detection from few multispectral examples

Offer linked to the Action/Network: – — –/PhD students

Laboratory/Company: IRISA/ATERMES
Duration: 3 years
Contact: minh-tan.pham@irisa.fr
Publication deadline: 2023-05-30

Context:
ATERMES is an international mid-sized company based in Montigny-le-Bretonneux, with strong expertise in high technology and system integration, from upstream design through the long-life maintenance cycle. It specializes in system solutions for border surveillance. Its flagship product BARIER™ (“Beacon Autonomous Reconnaissance Identification and Evaluation Response”) is readily applicable to temporary strategic site protection and to ill-defined border regions in mountainous or remote terrain, where fixed surveillance modes are impracticable or overly expensive to deploy. As another example, SURICATE is the first optronic ground “RADAR” of its class; it covers wide fields very efficiently and classifies intruders automatically thanks to multispectral deep learning detection.

The collaboration between ATERMES and IRISA was initiated through a first PhD thesis (Heng Zhang, defended December 2021, https://www.theses.fr/2021REN1S099/document). This successful collaboration led to multiple contributions on object detection in both mono-modal (RGB) and multi-modal (RGB+THERMAL) scenarios. In addition, this study made it possible to identify the remaining challenges that must be solved to ensure multispectral object detection in the wild.

Subject:
The project aims at providing deep learning-based methods to detect objects in outdoor environments using multispectral data in a low-supervision context, e.g., learning from few examples to detect scarcely observed objects. The data consist of RGB and IR (infrared) images, which are frames from calibrated and aligned multispectral videos.
Few-shot learning [1][2], active learning [3] and incremental/continual learning [4][5] are among the frameworks to be investigated, since they make it possible to limit the number of labeled examples needed for learning. Most methods developed so far [6][7][8][9] on the basis of these approaches perform object detection from RGB images within different weakly supervised scenarios. They should be adapted and improved to deal with scarce object detection from multispectral images. When objects of interest are absent from the training data, anomaly detection approaches [10][11] can also be considered to detect new object classes, which will then be further characterized by prior semantic concepts.
In addition to the (private) data from ATERMES, the PhD candidate will be able to work with public benchmarks such as KAIST, FLIR, VEDAI or MIL to benchmark the developed frameworks in the vision and machine learning communities.

Candidate profile:
MSc or Engineering degree with an excellent academic track record and proven research experience in the following fields: computer science, applied maths, signal processing and computer vision.

Required education and skills:
Experience with machine learning, in particular deep learning;

Skills and proven experience in programming (Python and frameworks such as PyTorch/TensorFlow will be appreciated);

Good communication skills (spoken and written English) are required.

Work address:
The PhD candidate will work part time (80%) at IRISA (with 1 day per week in Rennes and the rest of the time at the IRISA facility in Vannes) and part time (20%) at ATERMES in Paris (which corresponds to 2 days every 2 weeks). The exact schedule will be flexible: it might be preferable to spend more time in the company at the beginning of the thesis, to learn about the system and understand the data, and to be full time in the lab while writing the PhD dissertation.

Attached document: 202302091035_PHD_IRISA_Atermes_2023.pdf

Postdoc on gene network inference

Offer linked to the Action/Network: – — –/PhD students

Laboratory/Company: INRAE
Duration: 18 months
Contact: nathalie.vialaneix@inrae.fr
Publication deadline: 2023-05-30

Context:
The Unité de Mathématiques et Informatique Appliquées de Toulouse (MIAT, https://mia.toulouse.inra.fr) is an in-house research unit (UR875) of INRAE (https://www.inrae.fr). MIAT's scientific mission is to develop and apply mathematical and/or computational methods suited to solving problems identified with our collaborators, who come mainly from other INRAE departments. The unit currently comprises two research teams (SciDyn and SaAB) and three service teams (the BIOINFO, RECORD and SIGENAE platforms).

Subject:
The recruited postdoc will work within the SubtilNet project, which brings together the computer science, mathematics and statistics expertise of the SaAB team on network inference and systems biology. The project studies mathematical methods for reconstructing biological networks. Building on a real, exhaustive network of the bacterium Bacillus subtilis, it aims to better evaluate current inference methods and their characteristics. The ultimate goal is to improve the state of the art in inference methods by moving closer to biological reality and by integrating relevant biological information into the models.

Candidate profile:
PhD in computer science, bioinformatics, machine learning or statistics (thesis defended less than 3 years ago).

Required education and skills:
We are looking for a candidate with proven experience in omics data analysis, cell biology and/or systems biology. Skills in network inference or network analysis are also desirable or, failing that, skills in machine learning or statistics. Finally, a good level of programming, preferably in R, is required. Knowledge of Python, Matlab, … would be a plus.
Given the necessary interactions among the project members, an aptitude for teamwork would be appreciated. The candidate must also have a very good level of scientific English.

Work address:
INRAE Toulouse

Attached document: 202302090814_Recrutement_postdoc_subtilnet.pdf