[fg-arc] Postdoc position at INRAE Montpellier, France – Semantic Web, Data linking

Danai Symeonidou danai.symeonidou at inrae.fr
Mon Mar 28 16:31:38 CEST 2022




  *Postdoc position at MISTEA, INRAE Montpellier, France – Semantic Web,
  Data linking*

*Areas: Semantic Web, Linked Data, Data linking, Representation learning*

*Qualifications: PhD in Informatics, AI. Background in knowledge 
engineering. *

*Context: ANR DACE-DL 
<https://anr.fr/Projet-ANR-21-CE23-0019> (DAta-CEntric AI-driven Data 
Linking)*

*Contact & Collaboration:*

*Danai Symeonidou, danai.symeonidou at inrae.fr 
<mailto:danai.symeonidou at inrae.fr>*

*Clement Jonquet, clement.jonquet at inrae.fr 
<mailto:clement.jonquet at inrae.fr>*

*Dates: Position available for 2 years. The start date is flexible.*

*Location: INRAE, Centre Occitanie-Montpellier, MISTEA 
<https://www6.montpellier.inrae.fr/mistea/> research unit*

*Salary: Between 2200€ and 2700€ gross monthly, depending on 
qualifications and situation.*

*Institute: INRAE is the French research organization in agriculture, 
food and environmental sciences; it is a pioneer in France in terms of 
data sharing and Open Science commitment. The MathNum research 
department gathers around 200 scientists in mathematics and digital 
technologies in 13 research units in France. MISTEA is a joint research 
unit of INRAE and Montpellier Institut Agro engineering school with 
activities in the development of mathematical, statistical and 
informatics methods dedicated to analysis and decision support for 
agronomy and environment. The team is also recognized for its expertise 
in knowledge engineering and ontology-based scientific data management 
and information systems.*

*Project context: Data linking is the scientific challenge of 
automatically establishing typed links between the entities of two or 
more structured datasets. A variety of complex data linking systems 
exist and have been evaluated on public benchmarks [1,2,3]. While they 
have enabled the generation of vast amounts of linked data in the context 
of various dedicated projects, generic data linking systems often have 
limited applicability in many real-world scenarios, where data are highly 
heterogeneous and domain-specific. The ANR project DACE-DL (2022-2024) 
targets a paradigm shift in the data linking field with a data-centric 
bottom-up methodology relying on machine learning and representation 
learning models [4]. We hypothesize that there exists a finite number of 
identifiable and generalisable linking problem types (LPTs) that we 
need to categorize and analyze to provide better linking results.*

*Topic: The postdoc will work to identify and provide a 
categorisation/taxonomy of the different linking problem types based on 
an in-depth analysis of the linked datasets provided by the project and 
beyond. The first objective is to provide an in-depth analysis of the 
linked data available along with an exhaustive study of the 
state of the art in the field of data linking. A finite number of 
generalisable linking problem types will be classified, with the 
relations and inherent structure of the LPTs made explicit to both humans 
and machines. The goal is to answer questions such as: are certain LPTs 
or groups of LPTs (e.g. siblings at a given level of the taxonomy) 
specific to a domain, language or a community? Are certain LPTs inherent 
to specific types of data? Once a formal taxonomy of LPTs is produced, 
various datasets will be manually annotated. These annotations on 
existing pairs of datasets will be used to learn, using machine learning 
strategies, features for the automatic categorization of other datasets. 
The postdoc will co-supervise a PhD student working on the machine 
learning methods.*

*Application: Send your application to the contact emails, including:*

  * *a short description introducing yourself*
  * *a statement of your suitability for the position*
  * *a CV*
  * *one major publication*

*References*

*[1] Nentwig, M., (...), Rahm, E. A survey of current link discovery 
frameworks. Semantic Web, 2017.*

*[2] Euzenat, J., (...), Trojahn, C. Ontology matching benchmarks: 
generation, stability, and discriminability. Web Semantics, 2013.*

*[3] Zhou, L., (...), Trojahn, C., Zamazal, O. Towards evaluating complex 
ontology alignments. Knowl. Eng. Rev., 2020.*

*[4] Todorov, K. Datasets First! A Bottom-up Data Linking Paradigm. ISWC 
2019 Satellite Tracks, Auckland, New Zealand, October 26-30, 2019.*
