Post-doc position in Computer Vision and NLP for document analysis

15/04/2022 – 16/04/2022 all-day

Offer related to the Action/Network: not specified

Laboratory/Company: L3i / Itesoft
Duration: 24 months
Contact:
Publication deadline: 2022-04-15

Context:
The work carried out by the candidate will be part of a joint project between the L3i laboratory and the Itesoft company. This project is funded by the “Plan de Relance” of France and the European Union. Its goal is to develop new technologies based on artificial intelligence that can automatically summarise the content of unknown business documents.
The post-doc fellow will be based at the L3i laboratory in La Rochelle, France.
The L3i laboratory, created in 1993 at La Rochelle University, brings together researchers in Computer Science and Signal Processing from different faculties. It combines the skills of its researchers to address digital content enhancement from a systemic perspective. This relies, in particular, on the combined use of interactive applications, content indexing and knowledge representation. The laboratory is structured around three scientific themes (Knowledge Engineering, Content Analysis and Management, Interactivity and Dynamic Systems), centred on the common goal of interactive and intelligent management of digital content.
Itesoft provides disruptive technologies that increase its clients’ performance and maximize their competitive advantage through digital automation, while responding to major regulatory and compliance changes. To this end, Itesoft offers the most efficient automation, document processing and risk detection solutions on the market.

Subject:
The work of the post-doc fellow will fall within the area of “Document summarization”. The aim is to design innovative approaches for segmenting documents into blocks and summarizing the content of each block, in order to deal with unknown documents (that is, documents that were not available during the training step).
This application context raises many scientific bottlenecks, mainly in the fields of machine learning and pattern recognition. Document content can follow a wide variety of layouts and is generally treated either as purely textual content or as purely visual content. We seek to propose multimodal approaches that combine spatial organization, semantic content and visual representation.
This post-doctoral work will start from a detailed review of the state of the art, in order to identify the limits of existing approaches and propose innovative ones that help overcome the bottlenecks mentioned above. To solve these problems, we plan to propose new deep learning-based techniques, with a special focus on designing an architecture that exploits the available modalities (image, text, spatial organization) as effectively as possible, in order to improve accuracy on visually similar classes. This work relates to multi-modality [1, 2] and must also handle multilingual documents.

Candidate profile:
The candidate, who holds a Ph.D. in computer science, computer engineering, signal processing, natural language processing or applied mathematics, must have significant research experience in at least two of the following areas:
• Machine learning
• Pattern recognition
• Computer Vision OR Image Processing OR Natural Language Processing (knowledge and/or experience in both the vision and language domains would be considered a plus)

Required training and skills:
The candidate’s skills will include:
• Mastery of one or more programming languages (Java, Python, C/C++…)
• Very good teamwork skills, since the work will be carried out jointly with researchers from the L3i laboratory and the R&D department of the Itesoft company; knowledge or experience of Agile methods would be a plus
• Good scientific writing skills, and fluency in written and spoken English

Work address:
L3i, La Rochelle, France

Attached document: 202203300955_ENversion_PostDoc2022_AutomaticSummaryOfDocument_LaRochelle_FRANCE.pdf