Anuvabh Dutt - Continual learning for image classification

Tuesday 17 December 2019, 13:30
Organized by: Anuvabh Dutt
Speaker: Anuvabh Dutt
Jury:

  • Ms Jenny Benois-Pineau, Professor, Université Bordeaux, reviewer
  • Mr Nicolas Thome, Professor, CNAM, reviewer
  • Mr Hervé Le Borgne, Researcher, CEA LIST, examiner
  • Mr Massih-Reza Amini, Professor, Université Grenoble Alpes, examiner
  • Mr Denis Pellerin, Professor, Université Grenoble Alpes, thesis co-supervisor
  • Mr Georges Quénot, Research Director, CNRS, thesis supervisor

This thesis deals with deep learning applied to image classification tasks. The primary motivation is to make current deep learning techniques more efficient and to cope with changes in the data and label distributions. We work within the broad framework of continual learning, with the long-term aim of machine learning models that can improve continuously.

The first contribution addresses a change in the label space of a dataset while the data samples themselves remain the same. We consider a semantic hierarchy to which the labels belong and investigate how this hierarchy can be exploited to improve models that were trained on different levels of it.

The second contribution concerns continual learning with a generative model. We analyse how useful samples from a generative model are for training good discriminative classifiers, and we propose techniques to improve the selection and generation of such samples.

Following this, we observe that continual learning algorithms incur some loss in performance when trained on several tasks sequentially. For the third contribution, we analyse the training dynamics in this scenario and compare them with training on the same tasks simultaneously. Our observations point to potential difficulties in learning models in a continual learning scenario.

Finally, for the fourth contribution, we propose a new design template for convolutional networks. This architecture allows smaller models to be trained without compromising performance, and the design lends itself to easy parallelisation, leading to efficient distributed training.

In conclusion, we studied two different types of continual learning scenarios and proposed methods that lead to improvements. Our analysis also highlights underlying issues that arise when training in a continual learning scenario, and we provide pointers to the changes in the training scheme of neural networks required to overcome them.
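The second contribution mentioned above refers to rehearsing earlier tasks with samples drawn from a generative model. As a rough illustration only, not the method developed in the thesis, the sketch below shows one generative-replay training pass in PyTorch: real batches from the current task are mixed with generator samples that are pseudo-labelled by a frozen copy of the previous classifier. All names (train_task_with_replay, generator.latent_dim, etc.) are hypothetical placeholders.

```python
# Minimal generative-replay sketch (illustrative only, hypothetical names).
import torch
import torch.nn.functional as F


def train_task_with_replay(classifier, old_classifier, generator,
                           task_loader, optimizer,
                           replay_batch_size=64, device="cpu"):
    """One pass over the current task, mixing real and replayed samples."""
    classifier.train()
    for x, y in task_loader:
        x, y = x.to(device), y.to(device)

        # Replay: sample pseudo-data for earlier tasks from the generator
        # and label it with a frozen copy of the previous classifier.
        with torch.no_grad():
            z = torch.randn(replay_batch_size, generator.latent_dim, device=device)
            x_replay = generator(z)
            y_replay = old_classifier(x_replay).argmax(dim=1)

        optimizer.zero_grad()
        loss = (F.cross_entropy(classifier(x), y)
                + F.cross_entropy(classifier(x_replay), y_replay))
        loss.backward()
        optimizer.step()
```

In such a scheme, the quality of the replayed samples and how they are selected largely determine how well earlier tasks are retained, which is the aspect the thesis's proposed techniques target.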