LITIS Events



PhD thesis defense of Sovann EN, 16 November 2016, 2:00 pm, Amphitheater D, UFR ST Madrillet

11/16/2016 @ 14:00 - 17:00

Title (French): « Détection de patterns dans les documents anciens »
Title (English): « Historical document image retrieval and pattern spotting »

Reviewers
Ms. Muriel VISANI, Université de la Rochelle
Mr. Philippe-Henri GOSSELIN, ENSEA Cergy

Examiners
Ms. Véronique EGLIN, INSA de Lyon
Mr. Jean-Yves RAMEL, Université de Tours

Thesis supervisor
Mr. Laurent HEUTTE, Université de Rouen

Thesis co-supervisor
Mr. Frédéric JURIE, Université de Caen

Advisors
Ms. Caroline PETITJEAN, Université de Rouen
Mr. Stéphane NICOLAS, Université de Rouen

Abstract:
This thesis addresses the problem of retrieving and spotting patterns in historical document images. In particular, we are interested in searching for small graphical objects (20×20 pixels) in degraded, noisy document images with unconstrained layouts. In addition, the hand-drawn nature of the patterns in historical document images makes the problem even more challenging because of intra-class variability.

Searching for generic graphical patterns in document images with unconstrained layouts requires exhaustive matching at every possible size and location. This exhaustive search is not only computationally expensive but also tends to produce many false alarms. To overcome this problem, we propose an efficient indexing strategy based on a background removal component followed by region proposal, which estimates whether a given region contains an object. This reduces the number of sub-windows by a factor of 7 while maintaining a high level of recall. Then, based on an extensive experimental comparison of recent feature extraction techniques, VLAD is chosen as our image representation. Finally, observing that conventional distance measures (e.g. cosine) cannot cope well with image variability, we propose an adaptive distance function learned on the fly at almost no cost and without the need for labeled data. Our historical document image retrieval system is then extended by integrating a localization component, thus turning the retrieval system into a pattern spotting system and enhancing its ability to locate objects of interest more precisely. While our system produces meaningful results, we go further by addressing scalability issues. We show that we can efficiently retrieve/spot an object in less than a second among up to millions of sub-windows. We also developed two other systems from these ideas. First, we show on various public datasets that our adaptive distance is more powerful than conventional distance functions for natural scene image retrieval. Second, we show that our system can be turned into a word spotting system with only a few complementary components, thus demonstrating the robustness of the proposed method.
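
For readers unfamiliar with the VLAD representation and the cosine baseline named in the abstract, the following is a minimal pure-Python sketch of the general idea. It is not the thesis implementation: the function names `vlad_encode` and `cosine_similarity`, the toy 2-D descriptors and the two-centroid codebook are all assumptions made for illustration, and the adaptive distance proposed in the thesis is not shown because the abstract does not describe it.

```python
import math


def vlad_encode(descriptors, centroids):
    """Aggregate local descriptors into one L2-normalized VLAD vector.

    descriptors: list of d-dimensional local descriptors (hypothetical input).
    centroids:   list of k d-dimensional codebook centers, assumed to have
                 been learned beforehand (e.g. with k-means).
    Returns a flat list of length k * d.
    """
    k, d = len(centroids), len(centroids[0])
    residuals = [[0.0] * d for _ in range(k)]
    for x in descriptors:
        # Assign the descriptor to its nearest centroid (squared Euclidean).
        nearest = min(
            range(k),
            key=lambda i: sum((xj - cj) ** 2 for xj, cj in zip(x, centroids[i])),
        )
        # Accumulate the residual between the descriptor and that centroid.
        for j in range(d):
            residuals[nearest][j] += x[j] - centroids[nearest][j]
    flat = [v for row in residuals for v in row]
    norm = math.sqrt(sum(v * v for v in flat))
    # L2-normalize; an all-zero vector is returned unchanged.
    return [v / norm for v in flat] if norm > 0 else flat


def cosine_similarity(a, b):
    """Cosine similarity between two vectors; 0.0 if either one is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0


# Toy example: a two-word codebook and two small sets of local descriptors.
codebook = [[0.0, 0.0], [10.0, 10.0]]
query = vlad_encode([[1.0, 0.0], [0.0, 1.0], [11.0, 10.0]], codebook)
candidate = vlad_encode([[1.0, 1.0], [10.0, 11.0]], codebook)
score = cosine_similarity(query, candidate)
```

In a retrieval setting, each candidate sub-window would be encoded once and ranked by its score against the query; the thesis replaces this fixed cosine measure with a learned adaptive distance.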

All our experiments have been carried out on a new dataset called « DocExplore ». Experiments show the robustness of our systems in retrieving/spotting graphical objects in noisy and degraded document images. Finally, our last contribution is the public release of our DocExplore dataset, along with the experimental protocol and evaluation metrics, to encourage other researchers to continue tackling these problems.

 

 

Details

Date:
11/16/2016
Time:
14:00 - 17:00

Venue

UFR ST Madrillet – Rouen
Avenue de l'Université
Saint-Étienne-du-Rouvray, 76800 France
Website:
http://www.univ-rouen.fr