Maxime Portaz - Information Access in Mobile Environments for Museum Visits - Deep Neural Networks for Instance and Gesture Recognition

Organized by: 
Maxime Portaz


Jury :

  • Philippe Mulhem, Research Scientist, CNRS Alpes delegation, thesis supervisor
  • Véronique Eglin, Professor, INSA Lyon, reviewer
  • Linda Tamine-Lechani, Professor, Université Toulouse-III-Paul-Sabatier, reviewer
  • Jean-Pierre Chevallet, Associate Professor, Université Grenoble Alpes, thesis co-supervisor
  • Hervé Glotin, Professor, Université de Toulon et du Var, examiner
  • Denis Pellerin, Professor, Université Grenoble Alpes, examiner


This thesis is part of the GUIMUTEIC project, which aims to equip museum visitors with an audio guide enhanced by a camera. 
The thesis addresses the problem of information access in a mobile environment by automatically providing information about museum artefacts. 
To provide this information, we need to know when the visitor desires guidance and what they are looking at, so that the correct response can be given. 
This raises the issues of identifying points of interest, to determine the context, and identifying user gestures, to respond to their requests. 
As part of our project, the visitor is equipped with an embedded camera. 
The goal is to provide a solution that assists the visit, by developing vision methods for object identification and gesture detection in first-person videos. 
In this thesis we study the feasibility and value of visit assistance, as well as the relevance of gestures for interacting with an embedded system. 
We propose a new approach for objects identification thanks to siamese neural networks to learn images similarity and define regions of interest. 
We also explore the use of compact networks for gesture recognition in mobile settings. 
For this, we present an architecture based on new types of convolution blocks that reduce the number of network parameters and allow it to run on mobile processors. 
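The abstract does not specify which convolution blocks are used; a common way to cut parameter counts for mobile inference is to factor a standard convolution into a depthwise and a pointwise step (as in MobileNet-style blocks). The following sketch, purely illustrative, shows the parameter arithmetic behind that kind of reduction:

```python
def standard_conv_params(k, c_in, c_out):
    # A k x k kernel spans all input channels, one such
    # kernel per output channel.
    return k * k * c_in * c_out

def separable_conv_params(k, c_in, c_out):
    # Depthwise: one k x k filter per input channel.
    # Pointwise: a 1 x 1 convolution mixes channels.
    return k * k * c_in + c_in * c_out

# Hypothetical layer: 3x3 kernel, 64 -> 128 channels.
full = standard_conv_params(3, 64, 128)    # 73728 parameters
lite = separable_conv_params(3, 64, 128)   # 8768 parameters
# roughly an 8x reduction for this layer
```

Reductions of this order, repeated across layers, are what make running such networks on a mobile processor feasible, at a modest cost in accuracy.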
To evaluate our proposals, we rely on several image-retrieval and gesture corpora, designed specifically to match the constraints of the project.