Deep learning approaches for rhythm analysis in the context of audio, gestures and movement synchronization

When:
01/05/2020 – 02/05/2020 all-day

Announcement related to the Action/Network: none

Laboratory/Company: Research center LGI2P, 7 rue Jules Renard. IMT Mines Alès. 30100 Alès.
Duration: 6 months
Contact: patrice.guyot@mines-ales.fr
Publication deadline: 2020-05-01

Context:
This internship is funded by the LGI2P in the context of the creation of a new research unit, named EuroMov DHM, that will include researchers from Euromov, Montpellier (http://euromov.eu/home) and LGI2P, IMT Mines Alès (http://www.mines-ales.fr). The work will be carried out mainly at the LGI2P in Alès. Visits to Euromov, Montpellier, will also be considered.

Subject:
Synchronization of temporal events is a key feature of audio-visual perception, motor coordination and social interactions. For example, speech comprehension is enhanced through synchronized perception of visual signals such as lip movements [Stev10], and motor coordination relies partly on the ability to couple movement with audio-visual events [Ald17].

This project will focus on deep learning approaches to reveal temporal synchronization in heterogeneous data, through supervised and unsupervised tasks. Different kinds of recurrent networks will be considered, for example Phased LSTM [Nei16]. We will also evaluate the potential application of spiking neural networks [Tav18] to these temporal data.
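As an illustration of why Phased LSTM suits event-based sequences, the sketch below computes the time gate described in [Nei16]: each unit is open only during a short fraction of a learned oscillation period, so updates depend on timestamps rather than on a fixed sampling rate. This is a minimal sketch and not part of the internship code; the function name and the parameter names (tau, shift, r_on, alpha) are chosen here for illustration, and the default leak value is an assumption.

```python
def time_gate(t, tau, shift, r_on, alpha=0.001):
    """Openness of a Phased LSTM style time gate at timestamp t.

    tau   : oscillation period of the gate (hypothetical name)
    shift : phase shift of the oscillation (hypothetical name)
    r_on  : fraction of the period during which the gate is open
    alpha : small leak applied while the gate is closed (assumed default)
    """
    # Phase of the timestamp within the current cycle, in [0, 1).
    phi = ((t - shift) % tau) / tau
    if phi < 0.5 * r_on:
        # First half of the open phase: openness rises from 0 to 1.
        return 2.0 * phi / r_on
    if phi < r_on:
        # Second half of the open phase: openness falls from 1 to 0.
        return 2.0 - 2.0 * phi / r_on
    # Closed phase: only a small leak passes through.
    return alpha * phi
```

In the full model, this gate value blends the proposed cell and hidden states with the previous ones, so a unit keeps its memory unchanged (up to the leak) while the gate is closed.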

For applications, we will consider data of different natures and complexities. Firstly, synchrony among groups will be studied through sets of temporal signals. These data originate from experiments on social interaction that monitor oscillatory hand motion. The second application lies in the analysis of rhythmic patterns in audio data. In that scope, we aim to highlight structural rhythms in music, for example tempo, or finer characterizations such as musical groove. Perspectives include automatic drum transcription and clinically oriented music recommendation for the treatment of Parkinson’s disease [Coc18].
Thirdly, we aim to analyze movement from motion capture (Mocap) data and/or video. In particular, we will focus on walking, in the scope of diagnostic assistance and movement signature.
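For the first application, one common baseline for quantifying synchrony in a group of oscillatory signals is the Kuramoto order parameter, which measures how tightly the instantaneous phases cluster at a given time. The sketch below is only illustrative: it assumes the phase of each participant's hand motion has already been extracted (for example with a Hilbert transform), and the function name is hypothetical.

```python
import cmath


def order_parameter(phases):
    """Return the group synchrony index r in [0, 1] for a list of phases (radians).

    r close to 1 means all phases are nearly identical;
    r close to 0 means the phases are spread around the circle.
    """
    # Average the unit vectors exp(i * phase) and take the length of the mean.
    mean_vector = sum(cmath.exp(1j * p) for p in phases) / len(phases)
    return abs(mean_vector)
```

Evaluating this index at each time step gives a synchrony curve that learned models can be compared against.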

The main tasks of this internship are:
• State of the art in deep learning for temporal data
• Data collection and annotation
• Application of unsupervised and supervised learning methods
• Evaluation and comparison of approaches

References
[Ald17] Alderisio, F., Fiore, G., Salesse, R. N., Bardy, B. G., & di Bernardo, M. (2017). Interaction patterns and individual dynamics shape the way we move in synchrony. Scientific Reports, 7(1), 6846.
[Coc18] Cochen De Cock, V., Dotov, D. G., Ihalainen, P., et al. (2018). Rhythmic abilities and musical training in Parkinson’s disease: do they help? npj Parkinson’s Disease, 4, 8.
[Nei16] Neil, D., Pfeiffer, M., & Liu, S. C. (2016). Phased LSTM: Accelerating recurrent network training for long or event-based sequences. In Advances in Neural Information Processing Systems (pp. 3882-3890).
[Stev10] Stevenson, R. A., Altieri, N. A., Kim, S., Pisoni, D. B., & James, T. W. (2010). Neural processing of asynchronous audiovisual speech perception. NeuroImage, 49(4), 3308-3318.
[Tav18] Tavanaei, A., Ghodrati, M., Kheradpisheh, S. R., Masquelier, T., & Maida, A. (2018). Deep learning in spiking neural networks. Neural Networks.

Candidate profile:
Master's degree in Computer Science, Applied Mathematics, or Signal Processing

Required education and skills:
Knowledge of deep learning is highly appreciated
Programming skills in Python (and libraries such as PyTorch, NumPy, scikit-learn or Keras)
Knowledge of motion or music analysis is appreciated

Work address:
Research center LGI2P, 7 rue Jules Renard. IMT Mines Alès. 30100 Alès.

Attached document: