M2 Internship – Non-stationary and robust Reinforcement Learning methodologies for drone detection

When:

16/04/2024 – 17/04/2024 all-day

Offer related to the Action/Network: not specified

Laboratory/Company: Laboratoire des signaux et systèmes (L2S)
Duration: between 4 and 6 months
Contact: stefano.fortunati@centralesupelec.fr
Publication deadline: 2024-04-16

Context:
Reinforcement Learning (RL) methodologies are currently adopted in a variety of contexts that require sequential decision-making under uncertainty. The RL paradigm is based on the perception-action cycle: an agent senses and explores an unknown environment, tracks the evolution of the system state, and intelligently adapts its behavior in order to fulfill a specific mission. This is accomplished through a sequence of actions aimed at optimizing a pre-assigned performance metric (the reward). Despite their wide applicability, classical RL algorithms rest on a restrictive assumption: the environment is stationary, i.e. the statistical and physical characterization of the scenario is time-invariant. This assumption is clearly violated in surveillance applications, where the position and number of targets, along with the statistical characterization of the disturbance, may change over time. To overcome this limitation and incorporate non-stationarity into the RL framework, both theoretical and application-oriented non-stationary approaches have recently been proposed in the RL literature. The application of this non-stationary line of research to robust radar detection problems has recently been investigated.

Subject:
The aim of this internship is to support and complete the ongoing research activity by testing and validating non-stationary RL algorithms on several realistic scenarios. In these scenarios, the radar acts as an agent that continuously senses the unknown environment (i.e., targets and disturbance) and optimizes the transmitted waveforms accordingly, in order to maximize the probability of detection (PD) by focusing the energy in specific range-angle cells. Because of their crucial strategic interest, particular attention will be devoted to scenarios containing drones.

Candidate profile:
Master 2 or equivalent in machine learning / applied mathematics / statistical signal processing or any related field.

Required training and skills:
Machine learning / applied mathematics / statistical signal processing / MATLAB / Python

Work address:
Laboratoire des signaux et systèmes (L2S), Bât IBM, Rue Alfred Kastler, 91400 Orsay.

Attached document: 202311161045_Internship_proposal_IPSA.pdf

MaDICS

Masses de Données, Informations et Connaissances en Sciences

Big Data - Data Science