I am a PhD student in Computer Vision at the University of Barcelona. My research interests include computer vision, deep learning and artificial intelligence. I work under the supervision of Dr. Petia Ivanova Radeva.
PhD in Computer Vision, TBD
University of Barcelona
MSc in Artificial Intelligence, 2023
Polytechnic University of Catalonia (UPC)
BSc in Computer Science, 2020
University of Murcia
BSc in Mathematics, 2020
University of Murcia
[20/01/2025] This week we are hosting the Winter School “Demystifying Artificial Intelligence” for college students from China. 🇨🇳
[05/04/2024] Our latest method, LOFI, has been accepted as an oral presentation at the CVPR'24 Workshop MTF. See you in Seattle! 🇺🇸
[29/10/2023] Very excited to present our work Dining on Details at MADiMa'23 in ACM Multimedia! 🇨🇦 🚀
[01/10/2023] We go to Paris to attend ICCV'23 and present a bunch of interesting projects! 🇫🇷 🚀
[19/09/2023] We presented a poster of our work Dining on Details at the 10th ACMCV in the Computer Vision Center.
2025: CVPR (Conference on Computer Vision and Pattern Recognition), IJCNN, CVPRW (Conference on Computer Vision and Pattern Recognition Workshops), MTF CVPRW Challenge Organizer
2024: WACV (Winter Conference on Applications of Computer Vision)
2023: IEEE Transactions on Multimedia
In the realm of self-supervised learning (SSL), conventional wisdom has gravitated towards the utility of massive, general-domain datasets for pretraining robust backbones. In this paper, we challenge this idea by exploring whether it is possible to bridge the gap in scale between general-domain datasets and (traditionally smaller) domain-specific datasets to reduce the current performance gap. More specifically, we propose Precision at Scale (PaS), a novel method for the autonomous creation of domain-specific datasets on demand. The modularity of the PaS pipeline enables leveraging state-of-the-art foundational and generative models to create a collection of images of any given size belonging to any given domain with minimal human intervention. Extensive analysis in two complex domains proves the superiority of PaS datasets over existing traditional domain-specific datasets in terms of diversity, scale, and effectiveness in training visual transformers and convolutional neural networks. Most notably, we prove that automatically generated domain-specific datasets lead to better pretraining than large-scale supervised datasets such as ImageNet-1k and ImageNet-21k. Concretely, models trained on domain-specific datasets constructed by the PaS pipeline beat ImageNet-1k pretrained backbones by at least 12% in all the considered domains and classification tasks, and lead to better food-domain performance than supervised ImageNet-21k pretraining while being 12 times smaller.
Dining on Details (DoD) is an innovative fine-grained food classification approach that uses large language models to sort dataset classes into subsets. Powered by the robust ImageBind embedding space, DoD excels in distinguishing similar classes. Universally compatible, DoD integrates seamlessly with any existing classification architecture. Extensive testing on various food datasets and backbones shows performance boosts of 0.5% to 1.61%, and DoD even achieves SoTA results on the Food-101 dataset.
Generative AI Lecture, Winter School “Demystifying Artificial Intelligence”, 2025 - University of Barcelona: Invited lecturer.
Computer Vision, Bachelor’s Degree in Computer Science, 2024-2025 - University of Barcelona: Lab teacher.