Towards an infrastructure for sourced, reproducible and verifiable knowledge graphs.

When:
26/12/2020 – 27/12/2020 all-day

Laboratory/Company: CNRS/LIRIS – INSA de Lyon
Duration: 36 months – starting
Contact: Sylvie.Cazalens@insa-lyon.fr
Publication deadline: 2020-12-26

Context:
The thesis will take place in the database team (BD) of the LIRIS laboratory, Campus de la Doua, Lyon-Villeurbanne (liris.cnrs.fr).

It is part of the ANR project DeKaloG (2021-2024), which aims to provide a general framework for building community-based, decentralized knowledge graphs according to principles of accessibility and transparency. The project brings together three teams: GDD (LS2N, Nantes), Wimmics (Inria, Sophia Antipolis and Université Côte d’Azur) and BD (LIRIS, Lyon).

Subject:
The general aim of this thesis is to contribute to the DeKaloG framework, with a focus on transparency, and more particularly on reproducibility: knowledge within the graph (or the graph itself) should be reproducible and verifiable. For facts deduced within the graph, one may rely on existing work on provenance. However, for facts obtained using external tools and directly introduced into the graph, questions about provenance, reproducibility and verifiability have to be addressed. The following objectives should be targeted:
– Defining requirements for an extensible model of transparency, up to reproducibility. A first step consists in drawing a complete picture of the needs in the context of knowledge graphs, leveraging results from related domains such as linked data and the semantic web, as well as achievements in other scientific domains (medicine, biology, etc.). A second step consists in designing an extensible model of different levels of transparency that can be queried and is consistent with current semantic web standards.

– Using the proposed model to enable more transparency in knowledge graphs. This requires injecting more metadata into knowledge graphs, which raises problems of data volume and thus of performance. This is a major hindrance to scalability. Recent approaches to this problem provide a starting point. Additionally, linked data and workflows can be intertwined to push transparency up to reproducibility.

– Estimating/verifying the transparency degree of a KG. Users should be able to obtain information qualifying and quantifying the transparency degree of a knowledge graph they want to use. This is also very important when building an index of knowledge graphs.

Hence, this thesis should result in an infrastructure that links a knowledge graph with external solutions, accessed through services, so that KGs and facts can be reproduced and verified by anyone.

Candidate profile:
Applicants should have both theoretical and applied skills in computer science, in particular a good knowledge of the foundations of the semantic web and knowledge graphs, and practical experience with the associated tools. A background in the domain of workflows would be appreciated.

Required education and skills:
Any diploma equivalent to a French “master en informatique” (master's degree in computer science).

Work address:
LIRIS – UMR 5205 CNRS
Bâtiment Blaise Pascal – INSA Lyon
7 avenue Jean Capelle, 69100 Villeurbanne
France

Attached document: 202010081715_ThesisReproducibleKG.pdf