A data flow comparison among different Distributed Frequent Itemset Mining Algorithms over the MapReduce platform

When:
30/04/2016 – 01/05/2016 all-day

Announcement linked to an Action/Network: none

Laboratory/Company: ETIS – ENSEA / Université de Cergy-Pontoise / CNRS
Duration: 6 months
Contact: Tao-Yuan.Jen@u-cergy.fr
Publication deadline: 2016-04-30

Context:
Position: Internship (Master / Engineer level)

Place: Paris Area, Université de Cergy-Pontoise, Cergy-Pontoise, France

Subject: A data flow comparison among different Distributed Frequent Itemset Mining Algorithms over the MapReduce platform

Period: 6-month internship from April/May to September/October 2015 – approx. 508€/month

For further information on the internship subject, please contact:
Tao Yuan Jen

Subject:
Description: This internship covers two research fields: Data Mining and Cloud Computing.

The objectives of the internship are:
(1) to implement, or find the source code for, the following Distributed Frequent Itemset Mining Algorithms over the MapReduce platform:
the MRApriori algorithm, the IMRApriori algorithm, the SPC and DPC algorithms, the DPFPM algorithm, the MREclat algorithm, and the Apriori-V algorithm.
(2) to compare, across these algorithms, the mining performance, the quantity of data distributed to each data node before mining starts, and the quantity of data communicated among nodes during mining.
(3) to develop, or find the source code for, a vertical data layout bitmap converter, if necessary, to prepare the data sets for the different experiments.
(4) to study and implement, if time permits, some improvements to the Apriori-V algorithm.
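
As an illustration of objective (3), here is a minimal sketch of a vertical bitmap conversion, assuming transactions are given as lists of item names. The function names `to_vertical_bitmaps` and `support` are hypothetical and are not taken from any of the algorithms listed above. Each item gets one bitmap, stored as a Python integer, in which bit t is set when transaction t contains the item.

```python
from collections import defaultdict


def to_vertical_bitmaps(transactions):
    """Convert horizontal transactions into one bitmap (an int) per item.

    Bit t of bitmaps[item] is 1 when transaction number t contains the item.
    """
    bitmaps = defaultdict(int)
    for tid, transaction in enumerate(transactions):
        for item in set(transaction):
            bitmaps[item] |= 1 << tid
    return dict(bitmaps)


def support(bitmaps, itemset):
    """Count the transactions containing every item of a non-empty itemset."""
    items = list(itemset)
    if not items:
        raise ValueError("itemset must not be empty")
    combined = bitmaps.get(items[0], 0)
    for item in items[1:]:
        # Intersecting transaction sets is a bitwise AND in this layout.
        combined &= bitmaps.get(item, 0)
    return bin(combined).count("1")


# Illustrative data only, not one of the internship data sets.
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter"],
    ["bread", "butter"],
]
bitmaps = to_vertical_bitmaps(transactions)
pair_support = support(bitmaps, ["bread", "milk"])
```

In this layout the support of an itemset is the number of set bits in the AND of its item bitmaps, so no rescan of the original transactions is needed.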

This internship will help to:
1. understand how the main types of Distributed Frequent Itemset Mining Algorithms over the MapReduce platform work;
2. clarify the uses and the flows of the different data types in Distributed Frequent Itemset Mining Algorithms over the MapReduce platform;
3. plan our future development and improvements for our ongoing studies related to the Apriori-V algorithm in this domain.
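
To make the notion of data flow concrete, the sketch below simulates one map/shuffle/reduce counting pass in a single Python process and counts the records that would cross the network. It is an assumption-laden illustration, not an implementation of any algorithm named above: it enumerates every k-subset of each transaction by brute force instead of using a pruned candidate set, it does not use Hadoop, and the names `map_phase`, `combine` and `run_pass` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def map_phase(split, k):
    """Mapper: emit (k-itemset, 1) for every k-subset of each transaction."""
    for transaction in split:
        for candidate in combinations(sorted(set(transaction)), k):
            yield candidate, 1


def combine(pairs):
    """Combiner: pre-aggregate counts locally before the shuffle."""
    local = Counter()
    for key, value in pairs:
        local[key] += value
    return list(local.items())


def run_pass(splits, k, min_support, use_combiner):
    """Run one simulated pass; return frequent itemsets and shuffled record count."""
    shuffled = []
    for split in splits:  # each split stands for the data held by one node
        pairs = list(map_phase(split, k))
        if use_combiner:
            pairs = combine(pairs)
        shuffled.extend(pairs)
    totals = Counter()
    for key, value in shuffled:  # reducer: sum the counts per itemset
        totals[key] += value
    frequent = {key: n for key, n in totals.items() if n >= min_support}
    return frequent, len(shuffled)


# Illustrative data only: two splits of two transactions each.
splits = [
    [["a", "b"], ["a", "b", "c"]],
    [["a", "b"], ["b", "c"]],
]
frequent_plain, records_plain = run_pass(splits, 2, 2, use_combiner=False)
frequent_combined, records_combined = run_pass(splits, 2, 2, use_combiner=True)
```

Both runs find the same frequent 2-itemsets, but the run with the combiner shuffles fewer records. The number of shuffled records is only a rough proxy for the quantity of data communicated among nodes; a real measurement would rely on the counters of the MapReduce platform itself.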

The internship is available immediately. It will take place at the ETIS Lab (ENSEA / UCP / CNRS UMR 8051), located in Cergy-Pontoise in the Paris area, and will last 6 months.

Candidate profile:
Engineer / Master 2

Required education and skills:
The candidate should be familiar with data mining techniques and the MapReduce platform.

Work address:
2 avenue Adolphe-Chauvin
BP 222, Pontoise
95302 Cergy-Pontoise cedex

Attached document: