Anomaly detection in machine learning

Offer related to the Action/Network: – — –/Doctorants

Laboratory/Company: UTT/LIST3N
Duration: 3 years
Contact: alexandre.baussard@utt.fr
Publication deadline: 2022-08-20

Context:
Machine learning, and deep learning in particular, achieves very high performance on tasks such as detecting and recognizing objects or classifying regions of interest in images and videos. In real-world use, however, one must decide whether a new observation belongs to the same distribution as the existing observations (those used for training) or should be considered different. This distinction can arise at two levels, depending on the context. In the first case, the training data contain outliers, defined as observations that lie far from the others. Outlier detection estimators therefore try to fit the regions where the training data are most concentrated while ignoring the deviant observations. In the second case, the training data are not polluted by outliers, but outliers may occur during the test phase. The aim is then to give recognition methods the ability to set aside new aberrant observations, that is, to detect whether a new observation is an outlier. In particular, the system must be prevented from wrongly making a decision with high confidence. In this context, detecting an outlier can be valuable in several ways: it may, for example, correspond to relevant information never encountered or not learned so far. It is therefore important first to detect these anomalies and then to try to exploit them to reveal potentially useful new data.

Subject:
In this project, we will focus on the second case, namely anomaly detection under real operating conditions. Our objective in developing these detection methods is twofold: to avoid errors, and to move towards a better understanding of the decision-making process of these systems, which are often regarded as “black boxes” whose inner workings cannot be explained. This may also help characterize the elements that lead to a decision, for example through a confidence level attached to the decision.
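To make the idea of a confidence level attached to a decision concrete, here is a minimal sketch of one common baseline: reject a prediction when the top softmax probability is too low. This is an illustration only and not the method of the project; the function names and the 0.8 threshold are assumptions chosen for the example. As the context above notes, a network can be confidently wrong, so this baseline is a starting point rather than a solution.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of raw class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_or_reject(logits, threshold=0.8):
    """Return (class_index, confidence), or (None, confidence) when the
    top softmax probability is below the threshold (assumed value)."""
    probs = softmax(logits)
    confidence = max(probs)
    label = probs.index(confidence)
    if confidence < threshold:
        return None, confidence  # treated as a possible anomaly
    return label, confidence

# Hypothetical classifier outputs for two observations.
confident = predict_or_reject([6.0, 0.5, 0.2])   # one class dominates
ambiguous = predict_or_reject([1.1, 1.0, 0.9])   # no class stands out
```

Rejected observations can be set aside for later inspection, which matches the second step described above: exploiting detected anomalies to find potentially useful new data.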

Candidate profile:
The candidate should hold a master’s degree or equivalent, with skills in applied mathematics, programming (Python), information processing and data analysis.

Training and skills required:
Prior experience in machine learning (especially deep learning) and in programming with TensorFlow or PyTorch would be a plus.

Work address:
Université de Technologie de Troyes

PhD AI-Powered Reliable and Available Wireless Mesh Networks for the Factory of the Future F/M

Offer related to the Action/Network: – — –/Doctorants

Laboratory/Company: Orange Labs / ICube
Duration: 36 months
Contact: fabrice.theoleyre@cnrs.fr
Publication deadline: 2022-07-15

Context:
You will take part in experiment-based research, developing prototypes to assess the performance of your ideas in realistic environments, with concrete scientific output. You will have the opportunity to run experiments on large-scale testbeds (with hundreds of devices). Participation in the IETF is also expected, with concrete proposals and opportunities to push ideas into standards through the new RAW working group.

You will work in an exciting environment, with several key French academic and industrial players in the Internet of Things. In particular, you will be an active participant in the future ANR CONNECT project, expected to start in 2022.

You will also be integrated into the Network research group at ICube, where several researchers have strong experience in the Internet of Things and the Internet in general. The group also hosts part of the large-scale FIT IoT-Lab platform, and you will benefit from its strong skills in experimental research and reproducibility.

Orange Innovation brings together the research and innovation activities and expertise of the Group’s entities and countries. We work every day to ensure that Orange is recognized as an innovative operator by its customers and we create value for the Group and the Brand in each of our projects. With 740 researchers, thousands of marketers, developers, designers and data analysts, it is the expertise of our 6,000 employees that fuels this ambition every day.

Orange Innovation anticipates technological breakthroughs and supports the Group’s countries and entities in making the best technological choices to meet the needs of our consumer and business customers.

Within Innovation, you will join a research team in the “Machine To Machine, Internet of Things and Smart Cities” department, which specializes in IoT connectivity technologies. The team has about fifteen engineers and researchers and also hosts doctoral and post-doctoral researchers working on various cutting-edge topics such as 6G physical layer design, Artificial Intelligence and communication protocols for the IoT like 802.15.4 TSCH.

Subject:
Your role is to carry out thesis work on “AI at the service of Reliable and Available Wireless Mesh Networks for the Factory of the Future”.

Industry is undergoing an in-depth transformation with the pervasive integration of sensors and actuators in the manufacturing process. So-called Industry 4.0 involves the agile combination of reliable process monitoring, data analysis and timely operational adaptation of production lines. Industrial Internet of Things (IIoT) networks, such as 5G-URLLC and IEEE 802.15.4 networks, are critical enablers of this transformation.
The latter operate on license-free frequency bands and allow for low-power and low-cost device implementations. However, meeting the latency and delivery requirements of Industry 4.0 use cases with state-of-the-art IEEE 802.15.4 networks is still an open challenge, largely due to interference and harsh radio propagation environments.

Novel enablers at the physical layer, such as IEEE 802.15.4g radio waveforms and modulations, or at the MAC layer, i.e. IEEE 802.15.4e TSCH, are stepping stones to bridge the gap between the capabilities of IIoT networks on unlicensed spectrum and Industry 4.0 requirements. The new radio waveforms and modulations offer a wide choice of operating points trading range and bit rate against link budget, allowing the data rate to be adapted to link quality, while Time Slotted Channel Hopping (TSCH) mechanisms and the IETF 6TOP protocol lay the basis for centralized orchestration of the network, enabling time-sensitive, high-availability use cases.

In this context, the main objective of this thesis is to define a complete toolbox for orchestrating radio communications in a wireless mesh network through a combination of centralized and distributed decision making based on Reinforcement Learning (RL) algorithms, in order to meet the reliability and latency requirements of Factory of the Future (FoF) applications.
To achieve this goal, you will study RL-based resource allocation and scheduling algorithms and their application to wireless mesh networks: specifically, DQN (Deep Q-Network) algorithms for centralized long-term resource allocation, and MAB (Multi-Armed Bandit) algorithms for restoring connectivity when the topology changes and for continuous optimization to accommodate possible variations.
The main challenges to be addressed are the modeling of endogenous/exogenous interference in a mesh network, the establishment of a constrained schedule (half-duplex radios, delay, energy consumption, etc.) and the restoration of connectivity under constraints (meeting deadlines and delivery-rate targets).
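As an illustration of how a multi-armed bandit could drive connectivity restoration, the sketch below uses the classic UCB1 rule to choose among candidate backup links whose delivery ratios are unknown. It is a toy simulation under stated assumptions, not the thesis approach: the three delivery probabilities, the function names and the binary reward (1 when a packet is acknowledged) are all hypothetical.

```python
import math
import random

def ucb1_select(counts, rewards, t):
    """Pick the arm with the highest mean reward plus UCB1 exploration bonus."""
    for arm, n in enumerate(counts):
        if n == 0:  # try every link once before trusting the bound
            return arm
    scores = [
        rewards[a] / counts[a] + math.sqrt(2 * math.log(t) / counts[a])
        for a in range(len(counts))
    ]
    return scores.index(max(scores))

def run_bandit(delivery_prob, rounds=2000, seed=0):
    """Simulate link selection; reward 1.0 means the packet was acknowledged."""
    rng = random.Random(seed)
    counts = [0] * len(delivery_prob)
    rewards = [0.0] * len(delivery_prob)
    for t in range(1, rounds + 1):
        arm = ucb1_select(counts, rewards, t)
        reward = 1.0 if rng.random() < delivery_prob[arm] else 0.0
        counts[arm] += 1
        rewards[arm] += reward
    return counts

# Hypothetical packet delivery ratios of three candidate backup links.
counts = run_bandit([0.3, 0.9, 0.5])
```

After enough rounds, most transmissions go to the link with the best delivery ratio while the others are still probed occasionally, which is the behaviour needed to track possible variations in link quality.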

The main expected achievements are the design of algorithms for computing communication schedules in multi-hop networks and for establishing backup routes in case of transmission failure under the computed schedule, and their integration in a PCE (Path Computation Element) network controller and demonstrator.

Candidate profile:
You have a Master’s degree in Computer Science or Data Science.

You are creative and innovative, have good interpersonal skills and are highly motivated by research. Curiosity, critical thinking, open-mindedness, autonomy, and the ability to organize one’s work according to the objectives to be reached are particularly valued in research work, as are dynamism, proactiveness and communication skills. You want to turn your ideas into concrete prototypes and to run large-scale experiments.

Training and skills required:
Some experience with and in-depth knowledge of wireless networks are required, as is familiarity with reinforcement learning techniques. Skills in low-power radio technologies would be a plus.

You have good programming skills (C, Python) and previous experience in embedded development, preferably on a board that includes a radio circuit.

An excellent level of English is mandatory. Conversational French is also desirable.

Work address:
Orange Labs Meylan, with frequent visits to ICube, Strasbourg

MACLEAN Day @ CAp/RFIAP (Vannes, 5 July 2022)

Announcement related to the Action/Network: MACLEAN

Theme:

Machine learning and computer vision in earth observation: scientific results versus industrial needs

Presentation:

In the framework of the CAp and RFIAP conferences, a workshop on machine learning and computer vision issues in the context of earth observation is planned for Tuesday 5 July. This workshop will be organised with the support of the MACLEAN action of the GDR MADICS, which aims to bring together the environmental and data science communities. More precisely, the objective of the day will be to cross-reference the needs and expectations of industrialists in the field with the work of academic research laboratories. In doing so, the workshop will aim to raise awareness of the potential of the latest academic scientific developments, to confront them with industrial realities, but also to identify scientific issues that companies are facing and for which research work needs to be undertaken.

The day will consist of invited presentations (academic and industrial), round tables, and a poster/demonstration session.

From: 2022-07-05

To: 2022-07-05

Location: Vannes

Website: https://caprfiap2022.sciencesconf.org/page/maclean

Open-source software developer position for large scale continental surface monitoring

Offer related to the Action/Network: MACLEAN/– — –

Laboratory/Company: CESBIO
Duration: 12 months
Contact: mathieu.fauvel@inrae.fr
Publication deadline: 2022-06-20

Context:
Satellite remote sensing is an active field of research, with applications in environmental, agricultural and climate change science. Several satellite missions have been launched in recent decades, such as those of the European Copernicus program, and provide massive amounts of open-access Earth observation data.
Among the various products obtained from these missions, large-scale land cover mapping is surely the most operational. Such mapping is now used for a wide range of environmental applications and is of primary importance in the context of climate change. Several open and private achievements were announced recently (e.g., https://www.theia-land.fr/en/ceslist/land-cover-sec/ or https://viewer.esa-worldcover.org/worldcover), but this topic is still an important and active field of research and engineering. One of the main limiting problems is the ability to efficiently process the very large amount of data that researchers and engineers face.
In this context, the open-source software iota2 (source: https://framagit.org/iota2-project/iota2, documentation: https://docs.iota2.net/develop/) was initiated by the CESBIO lab as a generic processing chain to fully process recent satellite time series, such as SENTINEL-1, SENTINEL-2 or Landsat-8. It made it possible to produce the first land cover map of metropolitan France (e.g., https://theia.cnes.fr/atdistrib/rocket/#/collections/OSO/21b3e29b-d6de-5d3b-9a45-6068b9cfe77a).
To extend the development of the software beyond the CESBIO lab, the PARCELLE project was set up to foster the applicability of iota2 to other large-scale mapping problems. Three main topics are considered in the project.
1. A quantitative and qualitative assessment of the performance of iota2 for different types of landscapes (e.g., South Africa or South America) and/or different land cover types.
2. The methodological integration of state-of-the-art algorithms from the project partners.
3. The promotion of iota2 through training and scientific meetings.
Ultimately, the improvements to the chain will be used to enrich several Centres d’Expertise Scientifique (CES) of the national data center Theia.

Subject:
The first mission of the recruit is to work on the development of new features for iota2, such as deep learning algorithms applied at large scale (super-resolution, classification, inversion, ...). Applicants can check the project repository for more details (https://framagit.org/iota2-project/iota2/-/issues).
The second mission is to provide training for other members of the project and institutional users (e.g., https://docs.iota2.net/formation/ and its repository https://gitlab.cesbio.omp.eu/fauvelm/formation-iota2). Some time will also be devoted to answering users’ questions (mainly through the issues interface of the GitLab repository).
The third mission of the recruit will be to coordinate the different developments carried out by the partners. As such, other issues may emerge during the project.

Candidate profile:
The applicant must have a solid background in Python (NumPy, pandas, scikit-learn, PyTorch), scientific computing, Linux, and distributed version control systems (Git). Experience in software documentation (docstrings, Sphinx) will be appreciated, as well as some knowledge of remote sensing image processing, geomatics, geographic information systems and databases.
The applicant should send a detailed CV, motivation letter, reference letters and, if possible, links to developed software to the contacts.

Work address:
CESBIO,
Centre d’Etudes Spatiales de la Biosphère, 31400 Toulouse

Attached document: 202206081219_cdd_parcelle.pdf

Abductive Reasoning with Minimal Sensing in a Home Environment

Offer related to the Action/Network: – — –/– — –

Laboratory/Company: LIMOS / Mines Saint-Étienne
Duration: 3 years
Contact: victor.charpenay@emse.fr
Publication deadline: 2022-07-31

Context:
The thesis is funded equally by ANR (Agence Nationale de la Recherche) and elm.leblanc, one of the leading home automation system vendors. One of the main technical challenges in modern home automation is to use Artificial Intelligence (AI) to minimize the energy consumption of technical systems without loss of comfort. For instance, the production of hot water can be optimized by dynamically adapting the temperature of water and the time of use of the boiler based on activities monitored in the home. The general objective of the thesis is to monitor human activities without ubiquitous sensing capabilities.

Subject:
The domain of research of the thesis is knowledge representation and reasoning, a subfield of AI. Its objective is to evaluate abductive reasoning methods over sensor measurements performed in a home environment. The baseline assumption of the thesis is that only minimal sensing is available in the home, as is the case in most homes today: smart meters provide aggregated values (every hour/day) but no information is available per room. Abductive reasoning is expected to help optimize home automation systems without relying on some ubiquitous sensing apparatus (which raises environmental, technical and privacy-preservation questions). Several abduction mechanisms will be evaluated, including Abductive Logic Programming (for an exhaustive exploration of hypothesis space) and neural-symbolic integration methods (for a probabilistic exploration of hypothesis space).

Candidate profile:
Candidates are expected to have prior knowledge in AI, especially in computational logic, logic programming and/or Semantic Web technologies. A basic understanding of statistical inference methods and linear programming is also considered relevant.

Candidates with a machine learning background may apply as well. A cover letter explaining the candidate’s motivation to combine (neural) learning methods with symbolic AI is, however, expected.

Training and skills required:
Holder of a Master’s degree in computer science or data science. Technical skills required for the thesis include: multi-paradigm programming (Java, Lisp, R, Prolog, …), data modeling (UML, OWL, E/R, BPMN, …), Linux system administration (Bash, SSH, Docker, …).

Work address:
Saint-Étienne (with stays in Paris and/or Lille-Douai)

Attached document: 202206071402_phd-offer.pdf

Fine-grained, multimodal speech anonymization

Offer related to the Action/Network: – — –/– — –

Laboratory/Company: Inria Nancy & Lille
Duration: 36 months
Contact: emmanuel.vincent@inria.fr
Publication deadline: 2022-07-31

Context:
This PhD is part of the “Personal data protection” project of PEPR Cybersécurité, which aims to advance privacy preservation technology for various application sectors. It will be co-supervised by Emmanuel Vincent and Marc Tommasi. The PhD student will have the opportunity to spend time in both the Multispeech and Magnet teams, to collaborate with 9 other research teams in France and with the French data protection authority CNIL, and to contribute to the project’s overall goals including the organization of an anonymization challenge.

Subject:
Large-scale collection, storage, and processing of speech data pose severe privacy threats [1]. Indeed, speech encapsulates a wealth of personal data (e.g., age and gender, ethnic origin, personality traits, health and socio-economic status, etc.) which can be linked to the speaker’s identity via metadata or via automatic speaker recognition. Speech data may also be used for voice spoofing using voice cloning software. With firm backing from privacy legislation such as the European General Data Protection Regulation (GDPR), several initiatives are emerging to develop and evaluate privacy preservation solutions for speech technology. These include voice anonymization methods [2], which aim to conceal the speaker’s voice identity without degrading the utility for downstream tasks, and speaker re-identification attacks [3], which aim to assess the resulting privacy guarantees, e.g., in the scope of the VoicePrivacy challenge series [4].

The first objective of this PhD is to improve the privacy-utility tradeoff by better disentangling speaker identity from other attributes, and better decorrelating the underlying dimensions. Solutions may rely on suitable generative or self-supervised models [5, 6] or on adversarial learning [7]. The resulting privacy guarantees will be evaluated via stronger attackers, e.g., taking metadata into account.

The second objective is to extend the proposed audio-only approach to multimodal speech (audio, facial video, and gestures). Solutions will exploit existing facial anonymization technology [8]. A key difficulty will be to preserve the correlations between modalities, which are essential for training multimodal voice processing systems.

Depending on the PhD student’s skills, additional directions may also be explored, e.g., evaluating the proposed anonymization solutions in the context of federated learning.

[1] A. Nautsch, A. Jimenez, A. Treiber, J. Kolberg, C. Jasserand, E. Kindt, H. Delgado, M. Todisco, M. A. Hmani, M. A. Mtibaa, A. Abdelraheem, A. Abad, F. Teixeira, M. Gomez-Barrero, D. Petrovska, N. Chollet, G. Evans, T. Schneider, J.-F. Bonastre, B. Raj, I. Trancoso, and C. Busch, “Preserving privacy in speaker and speech characterisation,” Computer Speech and Language, vol. 58, pp. 441–480, 2019.

[2] B. M. L. Srivastava, M. Maouche, M. Sahidullah, E. Vincent, A. Bellet, M. Tommasi, N. Tomashenko, X. Wang, and J. Yamagishi, “Privacy and utility of x-vector based speaker anonymization,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, to appear.

[3] B. M. L. Srivastava, N. Vauquier, M. Sahidullah, A. Bellet, M. Tommasi, and E. Vincent, “Evaluating voice conversion-based privacy protection against informed attackers,” in 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 2802–2806, 2020.

[4] N. Tomashenko, X. Wang, E. Vincent, J. Patino, B. M. L. Srivastava, P.-G. Noé, A. Nautsch, N. Evans, J. Yamagishi, B. O’Brien, A. Chanclu, J.-F. Bonastre, M. Todisco, and M. Maouche, “The VoicePrivacy 2020 Challenge: Results and findings,” Computer Speech and Language, vol. 74, pp. 101362, 2022.

[5] L. Girin, S. Leglaive, X. Bie, J. Diard, T. Hueber, and X. Alameda-Pineda, “Dynamical variational autoencoders: A comprehensive review,” Now Foundations and Trends, 2021.

[6] A. Baevski, H. Zhou, A. Mohamed, and M. Auli, “wav2vec 2.0: A framework for self-supervised learning of speech representations,” in Advances in Neural Information Processing Systems, pp. 12449–12460, 2020.

[7] B. M. L. Srivastava, A. Bellet, M. Tommasi, and E. Vincent, “Privacy-preserving adversarial representation learning in ASR: Reality or illusion?” in Interspeech, pp. 3700–3704, 2019.

[8] T. Ma, D. Li, W. Wang, and J. Dong, “CFA-Net: Controllable face anonymization network with identity representation manipulation,” arXiv preprint arXiv:2105.11137, 2021.

Candidate profile:
Strong programming skills in Python/PyTorch.
Prior experience in speech and video processing will be an asset.

Training and skills required:
MSc in computer science, machine learning, or signal processing.

Work address:
https://jobs.inria.fr/public/classic/en/offres/2022-05013

Partitioning under a similarity constraint

Offer related to the Action/Network: – — –/– — –

Laboratory/Company: LIAS, ISAE-ENSMA
Duration: 3 years
Contact: brice.chardin@ensma.fr
Publication deadline: 2022-06-24

Context:
SRD is an electricity distribution network operator responsible for managing, operating, maintaining and developing a power grid covering 90% of the Vienne department. To optimize its network and plan investments, SRD seeks to model the behavior of the consumers and producers it serves.
Although this modeling is mainly based on historical values of the power flowing through the network, SRD is particularly interested in its predictive power, that is, its ability to capture the future behavior of the elements considered.

Subject:
The main scientific objective of this thesis is to develop clustering techniques that identify groups of elements with a guaranteed maximum dissimilarity between any two elements of the same group, and to position this type of approach relative to existing partitioning algorithms.
The techniques considered here are based on constrained partitioning, and more specifically on partitioning under a maximum intra-cluster dissimilarity constraint. This type of partitioning guarantees a certain proximity between the members of a group and their representative.
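To illustrate what a maximum intra-cluster dissimilarity constraint means in practice, here is a minimal sketch using greedy "leader" clustering, a simple stand-in technique and not the approach to be developed in the thesis. Each element joins the first representative within the allowed dissimilarity, otherwise it becomes a new representative. The load values, the 0.5 bound and the function names are hypothetical.

```python
def leader_partition(points, max_dissimilarity, dist):
    """Greedy partition: each point joins the first representative whose
    dissimilarity is within the bound, else it founds a new group."""
    representatives = []
    groups = []
    for p in points:
        for i, r in enumerate(representatives):
            if dist(p, r) <= max_dissimilarity:
                groups[i].append(p)
                break
        else:  # no representative is close enough
            representatives.append(p)
            groups.append([p])
    return representatives, groups

def abs_diff(a, b):
    """Dissimilarity between two scalar load values."""
    return abs(a - b)

# Hypothetical daily peak loads (kW) for seven consumers.
loads = [1.0, 1.2, 5.0, 5.3, 0.9, 9.8, 5.1]
reps, groups = leader_partition(loads, max_dissimilarity=0.5, dist=abs_diff)
```

By construction every member lies within the bound of its representative, so when the dissimilarity is a metric, any two members of a group differ by at most twice the bound (triangle inequality). The result depends on the order of the input, which is one reason such a greedy scheme is only a baseline.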

Candidate profile:
The candidate should have knowledge of software development, information systems, statistics and data analysis.
A good level of French and English is also required.

Training and skills required:
The candidate must hold a master’s degree in computer science or an engineering degree.

Work address:
ISAE-ENSMA, 1 avenue Clément Ader, 86360 Chasseneuil-du-Poitou

Attached document: 202206070936_these_labcom_alienor.pdf

Summer school on “Point clouds and change detection in the geosciences”

Date: 2022-06-22
Location: Online

On June 22, the University of Rennes 1 and the University of Potsdam organize a full day of presentations to close their summer school on “Point clouds and change detection in the geosciences”. This day is open and free for online attendance.

The detailed program is here.

Online registration here.

An overview of the program:

  • Katharina Anders [DGeo Research Group, Institute of Geography, Heidelberg University] It’s about time… to observe surface dynamics in 4D point clouds
  • Daniel Girardeau-Montaut [CloudCompare project] Presentation of the CloudCompare project and its latest developments
  • Chelsea Scott [Arizona State University] Measuring Change at the Earth’s Surface with Topographic Differencing
  • Antonio Abellan [crealp]
  • Fanny Brun [Univ. Grenoble Alpes, CNRS, IRD, Grenoble INP, IGE] Glacier mass change observations with remote sensing
  • Iris De Gelis [Magellium, IRISA UMR 6074, CNES] Deep learning based 3D point clouds change detection: an application to cliffs dynamics
  • Zan Gojcic [Nvidia] Estimating dense 3D displacement vector fields for point cloud-based landslide monitoring
  • Beth Pratt-Sitaula [UNAVCO] Point clouds in teaching: resources and strategies

With best regards,

P. Leroy (CNRS, University of Rennes 1), D. Lague (CNRS, University of Rennes 1) and B. Bookhagen (University of Potsdam)


Our website: www.madics.fr
Follow us on Twitter: @GDR_MADICS

PhD position on Data Profiling, Protection and Sharing

Offer related to the Action/Network: RoCED/– — –

Laboratory/Company: LAMSADE, Université Paris Dauphine
Duration: 3 years
Contact: kbelhajj@googlemail.com
Publication deadline: 2022-07-20

Context:
The PhD thesis is part of an interdisciplinary project involving another PhD thesis on data governance in the field of management sciences. We anticipate that the interaction between the two doctoral students will lead to interdisciplinary contributions in addition to computer science-focused solutions.

The PhD candidate will work in close collaboration with members of the data science team of the Paris Dauphine University. The problems investigated and solutions developed will be guided and validated within case studies in the fields of health and economics.

Subject:
We have an opening for a PhD position with the objective to develop new solutions to help data providers who wish to share their data to better understand it, and to choose the best-suited data protection policies. To do so, the PhD Student will be investigating techniques for profiling and linking datasets that would help data providers to gain insight into their data, to estimate its (economic) value, and to choose data protection strategies that go beyond privacy protection to take into account the protection of the data provider’s economic assets.

Candidate profile:
We seek strongly motivated candidates prepared to dedicate themselves to high-quality research. The candidate should have (or be close to obtaining) a Master’s degree or equivalent in computer science or applied mathematics. Starting date: September 2022.

The successful candidate will enroll as a PhD student in the Computer Science department of the Paris-Dauphine University (under the co-direction of myself and Prof. Daniela Grigori) and will become a member of the Data Science team of the same university. Paris Dauphine University is located in the city of Paris, and is a member of PSL (Paris Sciences et Lettres).

Training and skills required:
Interested candidates are invited to send the following to khalid.belhajjame@dauphine.fr and
daniela.grigori@lamsade.dauphine.fr

– academic CV
– academic transcripts of BSc and MSc
– one-page motivation letter explaining why the candidate is suitable for the position
– contact details of two referees

Work address:
Université Paris Dauphine, Paris

Attached document: 202206040950_annoce_phd_position.txt

Document Analysis in Legal Marketing

Offer related to the Action/Network: – — –/– — –

Laboratory/Company: Vasa (http://vasa.fr)
Duration: 3 years
Contact: Jean-sebastien.lefevre@vasa.fr
Publication deadline: 2022-07-20

Context:
As part of your research, you will define a methodology that makes the content of the charts, tables and descriptions in each social protection document understandable by means of algorithms, then deploy it, and finally validate its relevance through a continuous A/B testing strategy.

Subject:
Analysis of composite documents (text, charts, tables), whether scanned or natively digital (PDF), to extract relevant complex information.

Candidate profile:
You hold a five-year higher-education degree (bac +5) in applied mathematics, data processing and analysis, machine learning or a similar field.
You have a strong interest in the digitalization of the economy and in new technologies.

You are keen to develop your knowledge and skills in:

• Natural Language Processing (NLP);
• Heterogeneous Data;
• Image Analysis;
• Deep Neural Networks for Document Analysis.

You speak French and English fluently. You can explain complex concepts in plain terms.

You will work with the team of developers and researchers to turn your research topics into marketable solutions.

Training and skills required:
Five-year higher-education degree (bac +5) in applied mathematics, data processing and analysis, machine learning or a similar field

Fluent English and French

Work address:
PINEY (10)

Attached document: 202206031624_Offre Thèse CIFRE IA V3.docx