Andon Tchechmedjiev - Multilingual semantic interoperability of lexical resources as lexical linked open data

Friday, 14 Oct 2016, 07:00
Organized by: Andon Tchechmedjiev
Speaker: Andon Tchechmedjiev

 

Jury members:

  • Mr. Eric Gaussier, Université Grenoble Alpes, Professor, examiner and chair.
  • Mr. Roberto Navigli, Sapienza Università di Roma, Professor, reviewer.
  • Mr. Mathieu Lafourcade, Université de Montpellier, Associate Professor (HDR), reviewer.
  • Mr. Denis Maurel, Université François Rabelais, Tours, Professor, examiner.
  • Mr. Nabil Hathout, IRIT, Toulouse, CNRS Research Director, examiner.
  • Mr. Gilles Sérasset, Université Grenoble Alpes, Associate Professor, thesis supervisor.
  • Mr. Jérôme Goulian, Université Grenoble Alpes, Associate Professor, thesis co-supervisor.
  • Mr. Didier Schwab, Université Grenoble Alpes, Associate Professor, invited member.

 

When it comes to the construction of multilingual lexico-semantic resources, the first requirement that comes to mind is that the resources to be aligned should share the same data model and format (representational interoperability). With the emergence of standards such as LMF, and their implementation and widespread use for producing resources as lexical linked data (Ontolex), representational interoperability has ceased to be a major challenge for the production of large-scale multilingual resources. For the interoperability of sense-level multilingual alignments, however, a major challenge remains: the choice of a suitable interlingual pivot. Many resources use English senses as the pivot (e.g. BabelNet, EuroWordNet), although this choice leads to a loss of contrast between English senses that are lexicalized with different words in other languages. Acception-based interlingual representations, a solution proposed over 20 years ago, could be a viable alternative. However, the manual construction of such language-independent pivot representations is very difficult, because too few experts speak enough languages fluently, and algorithms for their automatic construction have never materialized, mainly because of the lack of a formal axiomatic characterization that ensures the preservation of their correctness properties. In this thesis, we address this issue by first formalizing acception-based interlingual pivot architectures through a set of axiomatic constraints and rules that guarantee their correctness. We then propose algorithms for the initial construction and the update of interlingual acception-based multilingual resources, exploiting the combinatorial properties of pairwise bilingual translation graphs. Finally, we study the practical considerations of applying our construction algorithms to a tangible resource, DBnary (a lexical linked data resource extracted from Wiktionary).
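
To make the idea of deriving interlingual groupings from pairwise bilingual translation graphs more concrete, here is a minimal sketch in Python. It is not the algorithm proposed in the thesis: it assumes, purely for illustration, that a candidate acception can be approximated by a maximal clique of senses that are all mutually linked by translation. The toy links, the `lang:word` sense labels, and the function names `build_adjacency`, `maximal_cliques` and `candidate_acceptions` are all hypothetical.

```python
# Illustrative sketch only: maximal cliques in a sense-level translation
# graph are treated as candidate acceptions. This is an assumption made
# for the example, not the construction algorithm of the thesis.

# Hypothetical toy data: undirected sense-level translation links.
LINKS = [
    ("fra:rivière", "eng:river"),
    ("fra:rivière", "spa:río"),
    ("fra:fleuve", "eng:river"),
    ("fra:fleuve", "spa:río"),
    ("eng:river", "spa:río"),
]


def build_adjacency(links):
    """Build an undirected adjacency map from pairwise translation links."""
    adj = {}
    for a, b in links:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def maximal_cliques(adj):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            cliques.append(frozenset(current))
            return
        for v in list(candidates):
            expand(current | {v}, candidates & adj[v], excluded & adj[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(adj), set())
    return cliques


def candidate_acceptions(links):
    """Return candidate acceptions as sorted tuples of sense labels."""
    cliques = maximal_cliques(build_adjacency(links))
    return sorted(tuple(sorted(c)) for c in cliques)


if __name__ == "__main__":
    for acception in candidate_acceptions(LINKS):
        print(acception)
```

In this toy graph, `fra:fleuve` and `fra:rivière` are not linked to each other, so they fall into two separate candidate groupings that both contain `eng:river`. An English-sense pivot would merge them into a single node, which is the loss of contrast described above.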