Postdoc on “Discourse Segmentation and Parsing of Spoken Conversations”

14/02/2022 – 15/02/2022 all-day

Offer linked to Action/Network: –/–

Laboratory/Company: Laboratoire Parole et Langage (UMR7309)
Duration: 24 months
Contact:
Publication deadline: 2022-02-14

Context:
The long-term goal of SUMM-RE is to improve algorithms for automatic meeting summarization and the generation of meeting minutes. The central hypothesis of the project is that such systems will benefit greatly from exploiting the rich information carried by discourse relations (Explanations, Questions/Answers, Corrections…) and discourse structure (in the form of graphs). One of the major objectives of the project is therefore to develop an incremental discourse parser for spontaneous conversation, building on existing work by SUMM-RE members using weak supervision (Badene et al. 2019). Discourse parsing will be done on English (the AMI corpus) and French data, but the principal focus will be on a 100h corpus of meetings in French whose creation will be completed by the time the postdoc starts.


Topic:
The postdoc recruited for this position will be in charge of (i) adapting models of discourse segmentation (e.g. Muller et al. 2019) to meeting-style conversation by building on recent advances with weak supervision (Gravellier et al. 2021) and integrating both speech and acoustic parameters in the segmentation model; (ii) applying insights from discourse segmentation, which provides the foundation for discourse parsing, to improve the incremental discourse parser; (iii) considering and developing mitigation strategies for working directly on ASR output (rather than on gold human-transcribed data) for both discourse segmentation and parsing.

Candidate profile:
Given these tasks, we are looking for a candidate with as many of the following skills as possible:

– Experience with speech and ASR, and conversational speech in particular

– Dialogue/conversation/interaction analysis and modeling

– Machine Learning, in particular Weakly Supervised and Unsupervised approaches

– Multimodal (speech + text) Deep Representations for Natural Language Processing

– Multilingual model transfer

Required education and skills:
PhD in Computational Linguistics / Natural Language Processing / Machine Learning.

A minimal command of French is desirable as the postdoc will be required to handle a large French corpus; mastery of French is, however, not required.

Work location:
The postdoc will ideally be hosted by the Laboratoire Parole et Langage (LPL), though exceptions will be considered for candidates who wish to be based at IRIT.

A curriculum vitae and a list of publications should be sent to Laurent Prévot no later than February 18th, but we strongly encourage potential candidates to submit their applications as soon as possible, as we might fill the position earlier.

Laboratoire Parole et Langage: