Unlocking the Power of Data Dependencies in Data Pipelines

When: 2023-02-09 (all day)

Laboratory/Company: LAMSADE – PSL Research University – Université Paris-Dauphine
Duration: 4 to 6 months
Contact: maude.manouvrier@lamsade.dauphine.fr
Publication deadline: 2023-02-09

Context:
Data dependencies are relationships or connections between different variables in a dataset. Understanding these dependencies is crucial and has a number of applications.

Data profiling for Machine Learning: Understanding data dependencies is critical for creating accurate and effective machine learning models. The quality of the input data has a direct impact on the accuracy of the model, and understanding data dependencies helps ensure that the data is suitable for use in machine learning.

Data mining: Data dependencies can help identify patterns and relationships in the data that may not be immediately obvious. These patterns can be used to make predictions and classify data, which makes them useful in various data mining tasks such as association rule mining and clustering.

Subject:
This internship will build upon the recent research in data dependency mining in dynamic settings. As a member of a dynamic team, the student will be exploring innovative ways to compute data dependencies in situations where the data is transformed through a data preparation pipeline. The goal is to assess the impact of this preparation process on the dependencies within the data, as well as its overall quality.
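As a purely illustrative sketch of the kind of question involved (it is not the method to be developed during the internship), the Python snippet below checks whether a functional dependency holds before and after a single data preparation step. The `fd_holds` and `normalize_city` helpers, the column names and the sample rows are all hypothetical and were chosen only for this example.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows.

    rows is a list of dicts; lhs and rhs are tuples of column names.
    The dependency holds when equal lhs values always imply equal rhs values.
    """
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        value = tuple(row[c] for c in rhs)
        # Remember the first rhs value seen for this key and compare later rows to it.
        if seen.setdefault(key, value) != value:
            return False
    return True


def normalize_city(rows):
    """Hypothetical preparation step: harmonize the capitalization of city names."""
    return [{**row, "city": row["city"].title()} for row in rows]


# Hypothetical raw data with an inconsistent spelling of the same city.
raw = [
    {"zip": "75016", "city": "Paris"},
    {"zip": "75016", "city": "paris"},
    {"zip": "69001", "city": "Lyon"},
]
prepared = normalize_city(raw)

print(fd_holds(raw, ("zip",), ("city",)))       # False: the same zip maps to two spellings
print(fd_holds(prepared, ("zip",), ("city",)))  # True: the cleaning step restores zip -> city
```

In this toy case the preparation step makes a dependency appear; other steps, such as imputing missing values, could just as well break one.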

The subject of data dependencies is a critical and fascinating aspect of machine learning and AI, providing students with the opportunity to gain practical skills and explore cutting-edge technologies that are shaping the future of the field. The demand for professionals with skills in machine learning and AI is growing rapidly, and understanding data dependencies is a valuable skill for anyone looking to build a career in this field in both academia and industry. On this point, it is worth noting that the internship is likely to lead to a PhD on a related topic.

Candidate profile:
We are seeking an excellent and highly motivated student with a background in Computer Science, good knowledge of database theory, and good programming skills (Python or Java).

Please send the following material in a single PDF document before February 20th, 2023:
– fully detailed CV,
– academic records (master’s degree or equivalent),
– recommendation(s) and supporting letter(s).

Required education and skills:
Background in Computer Science
Good knowledge of database theory and good programming skills (Python or Java).

Work address:
LAMSADE – PSL Research University – Université Paris-Dauphine, Paris, France

Attached document: 202302091126_IntershipLamsadeDataDependencieInPiplines.pdf