Auditing the mutations of online AI models

When:
31/12/2022 – 01/01/2023 all-day

Offer related to the Action/Network: – — – / PhD students

Laboratory/Company: Inria Rennes / PEReN
Duration: 3 years
Contact: gtredan@laas.fr
Publication deadline: 2022-12-31

Context:
AI-based decision-making systems are now pervasive, mediating most of users' online interactions (e.g., curation tasks such as recommendation [3], pricing [1], or search algorithms [5]). These systems have demonstrated high performance in recent years [10], so it is no surprise that putting AI-based models in front of users has become common practice in the tech industry (such actors are called the platforms hereafter).

Yet, the massive use of AI-based models raises concerns, for instance regarding their potentially unfair and/or discriminatory decisions. It is therefore of societal interest to develop methods to audit the behavior of an online model, to verify its lack of bias [12], its proper use of user data [11], or its compliance with laws [7]. The growing list of known audit methods is slowly consolidating into the emerging field of algorithmic auditing of AI-based decision-making algorithms, and multiple directions remain to be explored to expand this nascent field.

*Contact*

Lucas Verney, PEReN, lucas.verney@finances.gouv.fr
Erwan Le Merrer, Inria, erwan.le-merrer@inria.fr
Gilles Tredan, LAAS-CNRS, gtredan@laas.fr

Subject:

*The notion of mutation and the distance to a landmark model*

While audits are by nature one-off operations, the audited models often evolve continuously, for instance because of reinforcement learning, retraining on new user inputs, or simply because of code updates pushed by the platform operators. An unexplored direction of interest, which might be crucial for regulators for instance, is to provide means to observe the mutation of an online model. Assume a platform model under scrutiny, and an auditor that has access to that model solely by means of queries/responses. This is coined black-box access to a model in the literature. Through these basic actions, an open research question is the proper definition of what a stable model is, i.e., a model whose decisions are consistent over time (and that consequently does not mutate). While a couple of approaches have defined tampering-detection techniques for a model [6, 4], this definition is bound to classifiers and to the sole capability of checking whether the model is the same or different.
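
To make the black-box setting concrete, here is a minimal sketch of such a binary tampering check, in the spirit of [6, 4]. The `model_t0`/`model_t1` functions are hypothetical stand-ins for black-box access to the platform's classifier at two points in time; the auditor fixes a probe set once and flags a mutation as soon as one probe receives a different decision.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for black-box access to the platform's classifier
# at two points in time: each takes a feature vector and returns a label.
def model_t0(x):
    return int(x.sum() > 0)

def model_t1(x):
    return int(x.sum() > 0.1)  # a slightly mutated decision boundary

# Fixed probe set, drawn once by the auditor and kept secret.
probes = rng.normal(size=(200, 10))

def has_mutated(query_a, query_b, probes):
    """Binary check: flag a mutation as soon as one probe answer differs."""
    return any(query_a(x) != query_b(x) for x in probes)

print(has_mutated(model_t0, model_t1, probes))
```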

*Objectives*

A more refined approach would be to quantify mutation, that is, to define a notion of distance between two instances: one being a model, possibly held locally by the auditor, the other being a variant of that model that has already mutated. How to define and design a practical and robust distance measure is the topic of this Ph.D. thesis. This opens up multiple questions:
•How should such a setup be modeled (statistical modeling, information theory, similarities from the data-mining field, etc.), so that we can provide a well-defined measure for the problem? Moreover, while standard approaches exist to evaluate the divergence between two models, they need to be adapted to this context. In particular, we seek practical approaches that estimate divergence using few requests (see the sketch after this list).
One example of such modeling relies on graphs: the data collected from the observed model can be structured into a graph of relations (see e.g. [8] in the context of the YouTube recommender), and that graph can then be compared to the structure of a desirable graph with respect to the properties expected from the platform.

•Such AI models are nowadays used in a large variety of tasks (such as classification, recommendation, or search). How does the nature of the task influence deviation estimation/detection?

•Considering that the auditor tracks deviation with regard to a reference point, is it possible to identify the direction of the mutation? This is particularly interesting in order to assess whether a model evolves towards compliance with legal requirements.

•Taking the opposite (platform) side: are there ways to make this distance measurement impossible, or at least noisy, so that the auditor cannot issue valuable observations? (We will relate this to impossibility proofs.) In other words, can we model adversarial platform behaviours that translate into increased auditing difficulty?
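
As a minimal illustration of the first question above, the sketch below estimates a simple disagreement-rate distance between a local reference model and a (possibly mutated) remote black-box, under an explicit query budget. All model and function names are hypothetical stand-ins; the rough confidence half-width only serves to show how precision scales with the number of requests.

```python
import numpy as np

rng = np.random.default_rng(42)

def reference_model(x):   # local copy held by the auditor (hypothetical)
    return int(x.sum() > 0)

def remote_model(x):      # black-box, possibly mutated platform model (hypothetical)
    return int(x.sum() > 0.2)

def disagreement_distance(ref, remote, budget, dim=10):
    """Estimate Pr[ref(x) != remote(x)] from `budget` black-box queries.

    Returns the point estimate and a rough 95% normal-approximation
    half-width, which shrinks as 1/sqrt(budget): the precision/cost trade-off."""
    probes = rng.normal(size=(budget, dim))
    disagree = np.mean([ref(x) != remote(x) for x in probes])
    half_width = 1.96 * np.sqrt(disagree * (1 - disagree) / budget)
    return disagree, half_width

for budget in (50, 500, 5000):
    d, hw = disagreement_distance(reference_model, remote_model, budget)
    print(f"budget={budget:5d}  distance ~ {d:.3f} +/- {hw:.3f}")
```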

Candidate profile:
*Work Plan*

•A state-of-the-art survey will review past approaches to observing algorithms in a black-box setting. This relates to the fields of security (reverse engineering), machine learning (e.g., adversarial examples), and computability [9].
•We plan to approach the problem by leveraging a large publicly available AI model (e.g., https://pytorch.org/torchrec/) and to mutate it, for instance by fine-tuning, so that we can build intuition about the problem and test the first distances we have identified.

•Provide a first consistent benchmark of these various distances. In particular, an important aspect will be their precision as a function of the query budget necessary to obtain them (precision/cost trade-off in the requests to the black-box).
•Once the optimal distance for our problem has been found, the follow-up work will be devoted to preventing its construction by designing countermeasures on the platform side. In short, design an adversary capable of injecting significant noise into the auditor's measurements. This relates for instance to the notion of randomized smoothing in the domain of classifiers [2] (see the sketch after this list).
•This cat-and-mouse game between the auditor and the platform will structure and help establish the impossibility proofs we seek to propose, in order to provide algorithmic landmarks for scientists and regulators.
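
As a loose illustration of such a platform-side countermeasure (inspired by, but not a faithful implementation of, randomized smoothing [2]), the sketch below wraps a hypothetical `base_model` so that each answer is a small majority vote over Gaussian-perturbed copies of the query; near a decision boundary the observed answers become stochastic, which degrades the auditor's distance estimates.

```python
import numpy as np
from collections import Counter

rng = np.random.default_rng(7)

def base_model(x):
    """Hypothetical platform classifier."""
    return int(x.sum() > 0)

def smoothed_model(x, sigma=1.0, n_samples=5):
    """Answer with a majority vote over Gaussian-perturbed copies of the
    query; a small sample count keeps the output noisy near the boundary."""
    votes = Counter(
        base_model(x + rng.normal(scale=sigma, size=x.shape))
        for _ in range(n_samples)
    )
    return votes.most_common(1)[0][0]

# Repeated identical queries near the boundary now yield varying answers.
x = np.full(10, 0.01)
print([smoothed_model(x, sigma=2.0) for _ in range(10)])
```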

*Ph.D. Thesis Supervision and Location*

The Ph.D. student will be welcomed in teams that are actively working on the topic of algorithmic auditing of AI models (both from the practical and theoretical sides), in Paris and/or in Rennes. The supervising team will be the WIDE team at Inria Rennes. In particular, the Ph.D. student will have the opportunity to spend extended periods at PEReN (https://www.peren.gouv.fr/en/), a French government service developing and implementing algorithmic audit methods, conjointly with Inria, in order to benchmark digital platforms' compliance with legislation.

Required education and skills:
*Desired skills for the Ph.D. candidate*

•Advanced skills in machine learning (classification, regression, adversarial examples)
•A strong formal and theoretical background. Interest in the design of algorithms is a plus.
•Good scripting skills (e.g., Python) and/or familiarity with statistical analysis tools (e.g., R)

Job location:
Inria Rennes