Specifying and Reasoning about Preferences over Inconsistent Knowledge Bases

When:
15/12/2020 – 16/12/2020 all-day

Offer related to the Action/Network: RoD

Laboratory/Company: LaBRI – Laboratoire Bordelais de Recherche en Info
Duration: 5 months
Contact: meghyn.bienvenu@labri.fr
Publication deadline: 2020-12-15

Context:
Accessing the relevant information contained in real-world data to support informed decision making is difficult, time-consuming, and error-prone due to the need to integrate data across multiple heterogeneous sources. Moreover, even if this first hurdle is overcome, a perhaps even more daunting challenge arises: how to obtain reliable insights from imperfect data? It is widely acknowledged that real-world data is plagued with quality issues, such as incompleteness (missing information) and errors (false or outdated information).

The ontology-mediated query answering (OMQA) paradigm facilitates access to (potentially heterogeneous) data sources through the use of ontologies. These ontologies specify a convenient, user-friendly vocabulary for query formulation (which abstracts from the way the data is stored) and capture domain knowledge that can be exploited at query time, via automated reasoning, to obtain more complete query results. For example, querying for patients with infectious heart disease is non-trivial due to the myriad ways such a generic condition can manifest, but by leveraging the knowledge formalized in medical ontologies (like SNOMED CT), it is possible to correctly return patients diagnosed with Chagas disease, toxoplasma myocarditis, etc. The OMQA approach is relevant to a wide range of data-intensive applications, and recent industrial projects have demonstrated its practical benefits.

While OMQA systems are growing in maturity, they too often fail to address the data quality issue, aside from issuing warnings when inconsistencies are discovered. To widen the applicability of the OMQA approach, it is essential to equip OMQA systems with appropriate mechanisms for handling imperfect data: how to obtain meaningful answers to queries posed over imperfect data, and how best to generate a high-quality version of the data?

The Master’s internship is part of the INTENDED Chair on Artificial Intelligence, whose aim is to develop intelligent, knowledge-based methods for handling imperfect data. A PhD position on a related topic is available.

Topic:
Several different inconsistency-tolerant semantics have been proposed with the aim of providing meaningful answers to queries posed over inconsistent knowledge bases. Recent work has focused on how to integrate preferences into such semantics in order to exploit information about the relative reliability of facts in the data.

The aim of this internship is to explore declarative languages for specifying preferences between facts in the OMQA context. Specifically, we envision rule-based preference languages, along the lines of “If the data contains Salary(EMP,s1) and Salary(EMP,s2) and s1>s2, then prefer Salary(EMP,s1) to Salary(EMP,s2)” or “If fact1 and fact2 are in contradiction, fact1 was inserted after fact2, and fact2 is not from source A, then prefer fact1 over fact2”.
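Purely as an informal illustration, the first example rule can be read operationally as follows. This is a minimal Python sketch, not a proposed syntax or semantics: the encoding of facts as tuples and the function name `salary_preferences` are assumptions made here for readability.

```python
from itertools import permutations

# Hypothetical encoding (an assumption of this sketch):
# a fact is a tuple (predicate, employee, value).


def salary_preferences(facts):
    """Return the set of pairs (preferred, dispreferred) induced by the rule
    'if Salary(EMP,s1) and Salary(EMP,s2) and s1 > s2,
     then prefer Salary(EMP,s1) to Salary(EMP,s2)'."""
    salaries = [f for f in facts if f[0] == "Salary"]
    return {
        (f1, f2)
        for f1, f2 in permutations(salaries, 2)
        if f1[1] == f2[1] and f1[2] > f2[2]  # same employee, s1 > s2
    }


facts = {
    ("Salary", "alice", 3000),
    ("Salary", "alice", 3500),
    ("Salary", "bob", 2800),
}
prefs = salary_preferences(facts)
```

Here `prefs` relates only the two conflicting facts about `alice`; the single fact about `bob` is not compared with anything.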

After defining a syntax and semantics for such preference rules, the student will investigate the associated reasoning tasks: Can we decide whether a given set of preference rules always yields an acyclic preference relation? Do the rules always define a total relation (i.e. precisely determine how to correct the data)? How does adopting preference rules rather than assuming an explicit preference relation affect the complexity of query answering under preference-based semantics?
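To make the first two questions concrete, the sketch below checks acyclicity and totality for one given preference relation over one given dataset. It is only an illustration under stated assumptions: the relation is encoded as a set of pairs (preferred, dispreferred), "total" is read here as "every conflicting pair of facts is ordered one way or the other", and the names `is_acyclic` and `decides_all_conflicts` are hypothetical. The questions above quantify over all possible datasets, which this per-instance check does not address.

```python
def is_acyclic(prefs):
    """Depth-first search for a cycle in the directed graph whose
    edges are the (preferred, dispreferred) pairs."""
    graph = {}
    for better, worse in prefs:
        graph.setdefault(better, set()).add(worse)
        graph.setdefault(worse, set())

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    colour = {node: WHITE for node in graph}

    def visit(node):
        colour[node] = GREY
        for nxt in graph[node]:
            if colour[nxt] == GREY:  # back edge: a cycle exists
                return False
            if colour[nxt] == WHITE and not visit(nxt):
                return False
        colour[node] = BLACK
        return True

    return all(visit(n) for n in graph if colour[n] == WHITE)


def decides_all_conflicts(prefs, conflicts):
    """True if every conflicting pair is ordered in one direction or the other."""
    return all((a, b) in prefs or (b, a) in prefs for a, b in conflicts)


prefs_ok = {("f1", "f2"), ("f2", "f3")}
prefs_cyclic = {("f1", "f2"), ("f2", "f1")}
conflicts = {("f1", "f2"), ("f2", "f3"), ("f1", "f3")}

acyclic_ok = is_acyclic(prefs_ok)          # no cycle among f1, f2, f3
acyclic_bad = is_acyclic(prefs_cyclic)     # f1 and f2 are each preferred to the other
total_ok = decides_all_conflicts(prefs_ok, conflicts)  # f1 vs f3 is left undecided
```

The recursive search is adequate for small examples; it is not meant to suggest anything about the complexity of the static decision problems the internship will study.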

Candidate profile:
This is a foundational research topic, and no programming or implementation will be done during the internship. Rather, the student will formally define a preference representation language and study its properties (with formal arguments and proofs).

Required education and skills:
Candidates should be currently enrolled in an M2 program in computer science (or possibly mathematics, if accompanied by relevant computer science background).

Candidates should have some prior experience with logic, and knowledge of one or more of the following topics would be helpful: knowledge representation and reasoning (in particular, description logics), Semantic Web (ontologies), database theory, logic in AI, theoretical computer science (computational complexity).

Knowledge of French is not required, while strong English skills are desired. The working language can be either French or English.

Work address:
LaBRI, Université de Bordeaux, Talence, France

Attached document: 202011111016_master1-intended.pdf