[LatinX ICCV'23] Harnessing Automated Hierarchies for Triplet Contrast-based Fine-grained Recognition

Abstract

Fine-grained classification is a challenging problem in which the objective is to distinguish between classes that are very similar to each other. The ability of triplet loss to model relations between samples makes it a good fit for fine-grained settings. In this work, we propose to build an adaptive three-level hierarchy of samples and to exploit it via multi-level triplet contrast. Negatives and positives are sampled from a queue, which gives finer control over the variety and computational cost of sampling. We take advantage of cross-modal information through a Universal Sentence Encoder to find similar categories and group them together. In addition, we use K-means to dynamically find subclasses within fine-grained categories. Experiments show that the proposed method significantly improves accuracy on two popular fine-grained classification benchmarks, including an accuracy improvement of +0.58 on CUB-200-2011.
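
To make the idea of multi-level triplet contrast over a queue more concrete, here is a minimal sketch in plain Python. It is not the implementation behind this poster. The function names, the margin values, the hardest-sample mining rule, and the `(group, category, subclass)` label layout are all assumptions chosen for illustration. In the method described above, the group level would come from Universal Sentence Encoder similarity and the subclass level from K-means; here those labels are simply given.

```python
from collections import deque
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet margin loss: max(0, d(a, p) - d(a, n) + margin)."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def multi_level_triplet_loss(anchor, anchor_labels, queue, margins=(0.1, 0.2, 0.3)):
    """Sum one triplet term per hierarchy level (illustrative assumption).

    anchor_labels: hypothetical (group, category, subclass) tuple.
    queue: iterable of (embedding, labels) pairs, e.g. a bounded deque.
    margins: one assumed margin per level, coarse to fine.
    """
    total = 0.0
    for level, margin in enumerate(margins):
        # Two samples match at a level only if they agree on every
        # label down to that level (prefix comparison on the tuple).
        prefix = anchor_labels[: level + 1]
        positives = [e for e, lab in queue if lab[: level + 1] == prefix]
        negatives = [e for e, lab in queue if lab[: level + 1] != prefix]
        if not positives or not negatives:
            continue  # no valid triplet at this level
        # Assumed mining rule: farthest positive, closest negative.
        hardest_pos = max(positives, key=lambda e: euclidean(anchor, e))
        hardest_neg = min(negatives, key=lambda e: euclidean(anchor, e))
        total += triplet_loss(anchor, hardest_pos, hardest_neg, margin)
    return total


# A bounded queue: once full, the oldest embeddings are dropped, which
# caps both the variety and the cost of sampling.
queue = deque(maxlen=4)
queue.append(((3.0, 0.0), (0, 0, 0)))  # same subclass as the anchor
queue.append(((1.0, 0.0), (0, 0, 1)))  # same category, other subclass
loss = multi_level_triplet_loss((0.0, 0.0), (0, 0, 0), queue)
```

In this toy queue only the subclass level yields a valid triplet, so the loss reduces to a single term. A real training setup would operate on batched tensors from a deep learning framework and refresh the queue every step.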

Date
Oct 3, 2023 3:55 PM — 5:10 PM
Location
Paris Convention Centre
1 Place de la Porte de Versailles, Paris, Île-de-France 75015
This poster corresponds to work in progress (extended abstract). No published paper is available yet.
Jesús M. Rodríguez-de-Vera
PhD Candidate in Computer Vision

My research interests include computer vision, deep learning and artificial intelligence.