Scheduling Strategies for High Performance Deep Learning

When: 30/06/2018 – 01/07/2018 (all day)

Announcement related to an Action/Network: none

Laboratory/Company: Inria Bordeaux Sud-Ouest – EPC RealOpt
Duration: 3 years
Contact: olivier.beaumont@inria.fr, alexis.joly@inria.fr
Publication deadline: 2018-06-30

Context:
Recently, several frameworks such as TensorFlow [1] and PyTorch [2] have emerged that represent a DL network as a directed graph whose nodes are convolution operations and whose edges are the data dependencies between them. The goal of this PhD thesis is to study how to allocate these convolution operations to resources and how to schedule them in order to achieve better efficiency, typically on platforms made of heterogeneous resources such as GPUs and multicore nodes.
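To illustrate this graph model concretely, the short Python sketch below (independent of any particular framework, with made-up layer names and dependencies) represents such an operation graph as an edge list and computes one valid execution order with Kahn's algorithm; a scheduler then decides where and when each operation in that order runs.

    # Toy operation graph: nodes are operations, an edge (u, v) means that
    # v consumes the output of u. Names below are illustrative only.
    from collections import defaultdict, deque

    edges = [
        ("input", "conv1"), ("conv1", "conv2"),
        ("conv2", "fc1"), ("conv1", "fc1"),   # skip connection
        ("fc1", "softmax"),
    ]

    def topological_order(edges):
        """Return one dependency-respecting execution order (Kahn's algorithm)."""
        succ, indeg, nodes = defaultdict(list), defaultdict(int), set()
        for u, v in edges:
            succ[u].append(v)
            indeg[v] += 1
            nodes.update((u, v))
        ready = deque(n for n in nodes if indeg[n] == 0)
        order = []
        while ready:
            u = ready.popleft()
            order.append(u)
            for v in succ[u]:
                indeg[v] -= 1
                if indeg[v] == 0:
                    ready.append(v)
        return order

    print(topological_order(edges))
    # e.g. ['input', 'conv1', 'conv2', 'fc1', 'softmax']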

Subject:
The goal of this PhD thesis is to improve scheduling and resource allocation strategies along several directions. First, current resource allocation algorithms do not take the specificities of the application into account; they are, for instance, close to the default StarPU scheduling algorithm [3] used for general task graphs. Second, it has been shown that for specific applications such as linear algebra kernels, injecting static knowledge derived from a more sophisticated scheduling algorithm can strongly improve the performance of greedy algorithms [4] (a toy illustration of such a greedy policy is sketched below). Third, in the context of DL, the same graph of convolution layers is used many times on different input data throughout the execution of the DL algorithm, which is close to the setting of steady-state scheduling [5], which has been proven more tractable than general scheduling. Finally, another opportunity is to develop high-level simulation techniques that could be used in particular to detect bottlenecks with respect to a given DL network and parallel architecture. More speculatively, this direction could be especially interesting in the context of DL, since it may help redesign the network itself to cope with bottlenecks. We will first concentrate on classical layers (fully connected, convolutional, recurrent) before considering Pl@ntNet [6] as a target network.
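To make the notion of a greedy runtime policy concrete, here is a minimal Python sketch of an "earliest finish time" heuristic on a two-resource (CPU/GPU) platform, in the spirit of the default dynamic policies mentioned above; the task names, per-resource costs and dependencies are purely illustrative assumptions and are not taken from StarPU or from this announcement.

    # cost[task][resource] = assumed execution time of the task on that resource
    cost = {
        "conv1": {"cpu": 8.0, "gpu": 1.0},
        "conv2": {"cpu": 8.0, "gpu": 1.0},
        "fc1":   {"cpu": 3.0, "gpu": 2.0},
        "fc2":   {"cpu": 3.0, "gpu": 2.0},
    }
    deps = {"conv1": [], "conv2": [], "fc1": ["conv1", "conv2"], "fc2": ["fc1"]}

    resources = {"cpu": 0.0, "gpu": 0.0}   # time at which each resource becomes free
    finish = {}                            # finish time of each scheduled task

    # Tasks are visited in an order compatible with the dependencies.
    for task in ["conv1", "conv2", "fc1", "fc2"]:
        ready_at = max((finish[d] for d in deps[task]), default=0.0)
        # Greedy choice: pick the resource giving the earliest finish time.
        best = min(resources,
                   key=lambda r: max(resources[r], ready_at) + cost[task][r])
        start = max(resources[best], ready_at)
        finish[task] = start + cost[task][best]
        resources[best] = finish[task]
        print(f"{task} -> {best}, finishes at {finish[task]}")

A static variant would replace the greedy choice by an assignment precomputed offline, which is exactly the kind of knowledge injection studied in [4].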

[1] M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, J. Dean, M. Devin, et al. TensorFlow: Large-scale machine learning on heterogeneous distributed systems. arXiv preprint arXiv:1603.04467, 2016.

[2] PyTorch, http://pytorch.org

[3] C. Augonnet, S. Thibault, R. Namyst, and P.-A. Wacrenier. StarPU: A unified platform for task scheduling on heterogeneous multicore architectures. Concurrency and Computation: Practice and Experience, 23(2):187–198, 2011.

[4] E. Agullo, O. Beaumont, L. Eyraud-Dubois, and S. Kumar. Are static schedules so bad? A case study on Cholesky factorization. In 2016 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pages 1021–1030. IEEE, 2016.

[5] O. Beaumont, A. Legrand, L. Marchal, and Y. Robert. Steady-state scheduling on heterogeneous clusters. International Journal of Foundations of Computer Science, 16(02):163–194, 2005.

[6] D. Barthélémy, N. Boujemaa, D. Mathieu, M. Jean-François, A. Joly, and E. Mouysset. The Pl@ntNet project: Plant computational identification and collaborative information system. 2011.

Candidate profile:
These research directions require the joint expertise of specialists in deep learning algorithms, dynamic runtime scheduling, and scheduling theory, and will in particular benefit the Pl@ntNet application.

Education and required skills:
The PhD student will be located in Bordeaux (Olivier Beaumont, RealOpt, and Samuel Thibault, Storm) and will be co-supervised with the help of Guillaume Charpiat (Tau) and Alexis Joly (Zenith). Several one-week stays in Saclay and Montpellier will be scheduled during the PhD thesis.

Skills

Technical skills and level required: The candidate is expected to have a solid background in Combinatorial Optimization (scheduling, resource allocation, online algorithms) and/or in Deep Learning (TensorFlow, PyTorch), and a taste for both domains.

Remuneration

1st and 2nd year: 1,982 euros gross per month
3rd year: 2,085 euros gross per month

Employment address:
Inria Bordeaux Sud-Ouest
Talence
France
