Interactive Data Mining for Root Cause Analysis of Performance Issues in Networks

When:

18/05/2018 – 19/05/2018 all-day


Announcement related to the Action/Network: none

Laboratory/Company: IRISA research center in Rennes
Duration: 3-year PhD position (CIFRE) with AdvisorSLA
Contact: elisa.fromont@irisa.fr
Publication deadline: 18-05-2018

Context:
AdvisorSLA is a French company headquartered in Cesson-Sévigné, a city on the outskirts of Rennes in Brittany. The company specializes in software solutions for network monitoring, for which it relies on network metrology techniques. AdvisorSLA’s customers are carriers and telecommunications/data service providers that need to monitor the performance of their communication infrastructure as well as their QoS (quality of service). Network monitoring is of tremendous value to service providers because it is their primary tool for proper network maintenance. By continuously measuring the state of the network, monitoring solutions detect events (e.g., an overloaded router) that may degrade the network’s operation and the quality of the services running on top of it (e.g., video transmission could become choppy).

Subject:
When a monitoring solution detects a potentially problematic sequence of events, it triggers an alarm so that the network manager can take action. Such actions can be preventive or corrective. Some statistics show that only 40% of the triggered alarms are conclusive, that is, they signal a well-understood problem that requires an action from the network manager. This means that the remaining 60% are presumably false alarms. While false alarms do not hinder network operation, they do incur a significant cost in terms of human resources. Thus, in this thesis we propose to characterize conclusive and false alarms. This will be achieved by designing automatic methods to “learn” the conditions that most likely precede the firing of each type of alarm, and therefore predict whether the alarm will be conclusive or not. This can help adjust existing monitoring solutions in order to improve their accuracy. Besides, it can help network managers automatically trace the causes of a problem in the network.

The aforementioned problem has an inherent temporal nature: we need to learn which events occur before an alarm and in which order. Moreover, metrology models take into account the measurements of different components and variables of the network, such as latency and packet loss. For these two reasons, we resort to the field of multivariate time sequences and time series. The fact that we know the “symptoms” of an alarm and whether it is conclusive or not allows for the application of supervised machine learning and pattern mining methods. In the realm of machine learning, detecting the class of an alarm is a classification problem. Since machine learning methods have traditionally not been concerned with the interpretability of their verdicts, we plan to enhance our methods with discriminative pattern mining techniques. Such techniques can, for example, find comprehensible sequences of events that occur more frequently before false alarms, e.g., data transmission between two network components A and B. In a classical setting, discriminative pattern mining approaches deal with static data. Thus, they have no way (other than statistical) to evaluate the real relevance of the discovered patterns. In our scenario, however, we can establish a feedback loop for our pattern mining algorithm: the events represented by the patterns can be reproduced in the network in order to either verify or reject the pattern’s validity, or to refine it with additional context information, e.g., figuring out that the faulty transmission between components A and B occurs for video packets. This will be the first effort to integrate a pattern discovery algorithm inside a feedback loop and to study the actual relevance of the extracted patterns.
The scientific challenge lies in the design of such a feedback loop. There will be many patterns to test, and each test will incur some cost in terms of time and network bandwidth. Hence, the core problem is how to identify the real cause of an issue with a limited test budget, or in other words, how to prioritize the patterns for testing.
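
As a purely illustrative sketch of the prioritization problem described above (not the method to be developed in the thesis), the Python snippet below ranks candidate patterns by how much more often they precede false alarms than conclusive ones, divided by the cost of testing them, and then greedily selects patterns until a test budget is exhausted. The event names, the costs, the support-difference score, and the greedy heuristic are all assumptions made for the example.

```python
from typing import Dict, List, Sequence, Tuple

Pattern = Tuple[str, ...]  # an ordered sequence of event labels (hypothetical encoding)


def is_subsequence(pattern: Sequence[str], events: Sequence[str]) -> bool:
    """True if the pattern's events occur in order (gaps allowed) in `events`."""
    it = iter(events)
    # `p in it` consumes the iterator up to the first match, which enforces the order.
    return all(p in it for p in pattern)


def support(pattern: Pattern, windows: List[Sequence[str]]) -> float:
    """Fraction of event windows (one per alarm) that contain the pattern."""
    if not windows:
        return 0.0
    return sum(is_subsequence(pattern, w) for w in windows) / len(windows)


def prioritize(
    patterns: List[Pattern],
    false_windows: List[Sequence[str]],
    conclusive_windows: List[Sequence[str]],
    test_cost: Dict[Pattern, float],
    budget: float,
) -> List[Pattern]:
    """Greedy heuristic (an assumption of this sketch): pick the patterns with
    the best discriminative score per unit of test cost that fit in the budget."""
    scored = []
    for p in patterns:
        # Support difference: positive when p is more typical of false alarms.
        gap = support(p, false_windows) - support(p, conclusive_windows)
        scored.append((gap / test_cost[p], p))
    scored.sort(key=lambda item: item[0], reverse=True)

    selected: List[Pattern] = []
    spent = 0.0
    for ratio, p in scored:
        if ratio <= 0:
            break  # remaining patterns do not point towards false alarms
        if spent + test_cost[p] <= budget:
            selected.append(p)
            spent += test_cost[p]
    return selected


# Toy data: events observed before three false alarms and two conclusive alarms.
false_windows = [("A->B", "loss_high"), ("A->B", "latency_high", "loss_high"), ("latency_high",)]
conclusive_windows = [("latency_high", "loss_high"), ("router_overload", "loss_high")]
patterns = [("A->B",), ("A->B", "loss_high"), ("latency_high",), ("loss_high",)]
test_cost = {("A->B",): 1.0, ("A->B", "loss_high"): 2.0, ("latency_high",): 1.0, ("loss_high",): 1.0}

print(prioritize(patterns, false_windows, conclusive_windows, test_cost, budget=2.0))
```

A real system would have to replace the fixed costs and the one-shot greedy selection with something adaptive, since each test outcome changes which remaining patterns are worth testing; that adaptivity is precisely the open question raised above.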

Such a feedback scheme can form the basis for the development of online methods for alarm classification and root cause analysis of faults. In this spirit, the monitoring system can automatically learn the relevant set of patterns that characterize network faults and adjust the behavior of the alarms as the network operation evolves. Finally, in a different line of thought, the system could also involve the user in the process by providing detailed information about inconclusive patterns and asking for the user’s feedback.

Candidate profile:
We are looking for a highly motivated candidate with the following skills and qualifications:
* A master’s degree in computer science;
* Some background in data mining in general and pattern mining in particular;
* Some proven skills in programming;
* A very good level (written and oral) in English and a good ability to communicate with others;
* The ability to work autonomously.

Job address:
Send your application to ALL the following email addresses: luis.galarraga@inria.fr, elisa.fromont@irisa.fr, alexandre.termier@irisa.fr
Your application must contain:
1) a CV,
2) your most recent grade certificate (if you are currently finishing your Master’s degree, we need an official list of the grades you have obtained so far in this degree, with your rank among your peers),
3) at least two recommendation letters,
4) a specific motivation letter (applications with generic motivation letters will not be considered).
Applications are open until the 18th of May.

Some interviews will be offered between the 22nd and the 25th of May.
The final decision will be announced at the end of May.
The PhD thesis is expected to start in September (or October) 2018.

Attached document: advisorsla.pdf

MaDICS

Masses de Données, Informations et Connaissances en Sciences

Big Data - Data Science
