Complex graph analysis for the detection of corruption in public procurements

When:
24/07/2020 – 25/07/2020 all-day

Offer related to the Action/Network: not specified

Laboratory/Company: Laboratoire Informat
Duration: 3 years
Contact: christine.largeron@univ-st-etienne.fr
Publication deadline: 2020-07-24

Context:
In many applications, the data to be studied is relational in nature: it is modeled as a network and represented by a graph. This representation makes it possible to study interactions between people, especially in the social sciences. Although network analysis is an active branch of data mining and machine learning, most work focuses on homogeneous networks described by a simple graph, where the nodes correspond to the entities of the network and the links (edges or arcs) to their relationships. However, in many applications, contextual information describing the relationships or the entities themselves is available and could be used to study the network more effectively.
This led notably to the notions of signed graphs and attributed graphs. In a signed graph, each edge is labeled with either a negative or a positive sign, which makes it possible to represent antagonistic relationships [1]. In an attributed network, vertices are described by attributes that capture their individual characteristics, such as gender or age [2].
These representations are much richer than a simple graph and make it possible to better model complex interaction systems. However, they require adapting existing algorithms or designing new ones to efficiently solve all the standard network visualization and analysis tasks, such as community detection, link prediction, or information diffusion.
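To make the two notions above concrete, here is a minimal Python sketch of a graph that is both signed and attributed. The class name SignedAttributedGraph, its methods, and the example vertices and ages are hypothetical, chosen only for illustration; they are not part of the project described in this offer.

```python
from dataclasses import dataclass, field


@dataclass
class SignedAttributedGraph:
    """Undirected graph with a +1/-1 sign per edge and an attribute dict per vertex.

    Hypothetical illustration only, not the representation used by the project.
    """

    attributes: dict = field(default_factory=dict)  # vertex -> {name: value}
    signs: dict = field(default_factory=dict)       # frozenset({u, v}) -> +1 or -1

    def add_vertex(self, vertex, **attrs):
        # Create the vertex if needed, then merge in any new attributes.
        self.attributes.setdefault(vertex, {}).update(attrs)

    def add_edge(self, u, v, sign):
        if sign not in (1, -1):
            raise ValueError("sign must be +1 or -1")
        if u == v:
            raise ValueError("self-loops are not supported in this sketch")
        self.add_vertex(u)
        self.add_vertex(v)
        self.signs[frozenset((u, v))] = sign

    def sign(self, u, v):
        # Edges are undirected, so the lookup does not depend on argument order.
        return self.signs[frozenset((u, v))]

    def neighbors(self, vertex):
        # Every edge is a 2-element frozenset; return the other endpoint.
        return [w for edge in self.signs if vertex in edge for w in edge if w != vertex]


# Made-up example: three people, one antagonistic and one friendly relationship.
g = SignedAttributedGraph()
g.add_vertex("alice", age=34)
g.add_vertex("bob", age=51)
g.add_vertex("carol", age=28)
g.add_edge("alice", "bob", -1)
g.add_edge("alice", "carol", +1)
```

Storing each edge as a frozenset keeps the graph undirected; a directed variant (arcs) would use ordered pairs instead.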

Subject:
This PhD is part of the French research project DeCoMaP (Detecting Corruption in Markets for Public Procurement), funded by the ANR (the French counterpart of the NSF), which aims to retrieve, process, and analyze open data on French public procurement in order to design a tool able to assess corruption risks between public buyers and suppliers.
Designing automatic tools to assess the risk of corruption in public procurement is a task called red flagging. It is not completely new: some teams have been working on it for a few years [1], especially at the European level [2]. However, none of the existing tools applies to French public procurement, as they do not handle its specifics (legal framework, nature and form of the available open data). Moreover, existing approaches focus on individual information, which characterizes buyers and suppliers independently, and ignore relational information, which corresponds to the interactions and interdependencies between these agents.
In the context of the DeCoMaP project, we propose to design a new tool that tackles these issues and limitations. More specifically, the goal of this thesis is to handle a large methodological part of this work, by representing the relationships between buyers and suppliers (normal contractual relationship or corruption) and by characterizing these agents through signed and/or attributed graphs.
As these types of graphs have been much less explored in the literature than simple graphs, the first task of this thesis consists in designing methods to extract them from the raw data and to analyze them in the context of our application (corruption detection). In particular, it is necessary to define a corruption index that could be used to enrich the information already available in the graph at the level of its links (contracts between a public buyer and a supplier) and at the level of its vertices (characteristics of the actors).
The second task focuses on signed graph partitioning in the framework of structural balance [3]. A signed graph is said to be balanced if it can be partitioned into two [4] or more [5] mutually hostile subgroups, each having internal solidarity (i.e. positive edges lie inside clusters and negative ones lie between them). Some recent work has started to tackle this problem, notably in our team [6], but much remains to be done to formalize the problem and solve it efficiently, particularly using deep learning approaches [7]. In DeCoMaP, signed graph partitioning will aim to bring out groups of agents (public buyers and supplier companies) likely to be involved together in fraudulent practices.
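As an illustration of the two-group definition of balance given above, the following Python sketch tests whether a signed graph can be split into two groups with every positive edge inside a group and every negative edge between groups. It uses a plain breadth-first traversal; the function name two_group_balance and the example graphs are assumptions made for this illustration, and this is not the partitioning method the thesis is expected to develop. It covers neither the case of more than two groups nor graphs that are only approximately balanced.

```python
from collections import deque


def two_group_balance(vertices, signed_edges):
    """Return {vertex: 0 or 1} if the graph is balanced with two groups, else None.

    signed_edges is an iterable of (u, v, sign) triples with sign +1 or -1.
    Hypothetical helper written for illustration only.
    """
    adjacency = {v: [] for v in vertices}
    for u, v, sign in signed_edges:
        adjacency[u].append((v, sign))
        adjacency[v].append((u, sign))

    side = {}
    for start in vertices:
        if start in side:
            continue  # already reached from an earlier connected component
        side[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v, sign in adjacency[u]:
                # A positive edge forces the same group, a negative edge the other one.
                expected = side[u] if sign > 0 else 1 - side[u]
                if v not in side:
                    side[v] = expected
                    queue.append(v)
                elif side[v] != expected:
                    return None  # contradiction: no valid two-group split exists
    return side


# Made-up examples: two friendly pairs that are hostile to each other (balanced),
# and a triangle with exactly one negative edge (not balanced).
balanced = two_group_balance(
    ["a", "b", "c", "d"],
    [("a", "b", 1), ("c", "d", 1), ("a", "c", -1), ("b", "d", -1)],
)
unbalanced = two_group_balance(
    ["a", "b", "c"],
    [("a", "b", 1), ("b", "c", 1), ("a", "c", -1)],
)
```

Here balanced maps a and b to one group and c and d to the other, while unbalanced is None. Real procurement graphs are unlikely to be perfectly balanced, which is one reason the partitioning problem still needs to be formalized and solved efficiently.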

Candidate profile:
The candidate should have a master's degree or equivalent in Computer Science.

Required education and skills:
The subject lies at the intersection of several domains: graph theory, statistics, data mining and machine learning, and big data, including databases (the networks considered can be very large). The candidate should therefore have a strong background in several of these topics.
Other required skills:
• Good abilities in algorithm design and programming.
• Good technical skills in data management (databases, retrieval from web APIs).
• A very good level of English (written and oral).
• Good communication skills (oral and written).
• The ability to work in a team with colleagues from other scientific disciplines.
• Autonomy and motivation for research.

Work address:
Laboratoire Informatique d’Avignon (LIA – EA 4128), France

Attached document: 202004101516_PHDposition-Decomap.pdf