arxiv-cs-cv.bsky.social
Computer Science -- Computer Vision and Pattern Recognition (cs.CV)
source: export.arxiv.org/rss/cs.CV
maintainer: @tmaehara.bsky.social
34,351 posts 895 followers 0 following

Mira Adra, Simone Melcarne, Nelida Mirabet-Herranz, Jean-Luc Dugelay Event-based Solutions for Human-centered Applications: A Comprehensive Review https://arxiv.org/abs/2502.18490

Hongpu Huang, Wei Zhou, Chen Wang Physical Depth-aware Early Accident Anticipation: A Multi-dimensional Visual Feature Fusion Framework https://arxiv.org/abs/2502.18496

Chuanguang Yang, Xinqiang Yu, Han Yang, Zhulin An, Chengqing Yu, Libo Huang, Yongjun Xu Multi-Teacher Knowledge Distillation with Reinforcement Learning for Visual Recognition https://arxiv.org/abs/2502.18510

Jianjian Li, Junquan Fan, Feng Tang, Gang Huang, Shitao Zhu, Songlin Liu, Nian Xie, Wulong Liu, Yong Liao FCoT-VL: Advancing Text-oriented Large Vision-Language Models with Efficient Visual Token Compression https://arxiv.org/abs/2502.18512

Mangsura Kabir Oni, Tabia Tanzin Prama Optimized Custom CNN for Real-Time Tomato Leaf Disease Detection https://arxiv.org/abs/2502.18521

Eric Xue, Zeyi Huang, Yuyang Ji, Haohan Wang IMPROVE: Iterative Model Pipeline Refinement and Optimization Leveraging LLM Agents https://arxiv.org/abs/2502.18530

Ehsan Farahbakhsh, Dakshi Goel, Dhiraj Pimparkar, R. Dietmar Muller, Rohitash Chandra Convolutional neural networks for mineral prospecting through alteration mapping with remote sensing data https://arxiv.org/abs/2502.18533

S M Sarwar FilterRAG: Zero-Shot Informed Retrieval-Augmented Generation to Mitigate Hallucinations in VQA https://arxiv.org/abs/2502.18536

Xuechun Li, Susu Xu Multi-class Seismic Building Damage Assessment from InSAR Imagery using Quadratic Variational Causal Bayesian Inference https://arxiv.org/abs/2502.18546

Erick da Silva Farias, Eduardo Palhares Junior Application of Attention Mechanism with Bidirectional Long Short-Term Memory (BiLSTM) and CNN for Human Conflict Detection using Computer Vision https://arxiv.org/abs/2502.18555

Akash Vartak, Khondoker Murad Hossain, Tim Oates DeBUGCN -- Detecting Backdoors in CNNs Using Graph Convolutional Networks https://arxiv.org/abs/2502.18592

Miguel Herencia García del Castillo, Ricardo Moya Garcia, Manuel Jesús Cerezo Mazón, Ekaitz Arriola Garcia, Pablo Menéndez Fernández-Miranda Diffusion Models for conditional MRI generation https://arxiv.org/abs/2502.18620

Saorj Kumar, Prince Asiamah, Oluwatoyin Jolaoso, Ugochukwu Esiowu Enhancing Image Classification with Augmentation: Data Augmentation Techniques for Improved Image Classification https://arxiv.org/abs/2502.18691

Anthony Etim, Jakub Szefer Adversarial Universal Stickers: Universal Perturbation Attacks on Traffic Sign using Stickers https://arxiv.org/abs/2502.18724

Hemanth Teja Yanambakkam, Rahul Chinthala Beyond RNNs: Benchmarking Attention-Based Image Captioning Models https://arxiv.org/abs/2502.18734

Shaheer Mohamed, Tharindu Fernando, Sridha Sridharan, Peyman Moghadam, Clinton Fookes Spectral-Enhanced Transformers: Leveraging Large-Scale Pretrained Models for Hyperspectral Object Tracking https://arxiv.org/abs/2502.18748

Chenyang Zhao, Kun Wang, Janet H. Hsiao, Antoni B. Chan Grad-ECLIP: Gradient-based Visual and Textual Explanations for CLIP https://arxiv.org/abs/2502.18816

Yunmei Huang, Songlin Hou, Zachary Nelson Horve, Songlin Fei BarkXAI: A Lightweight Post-Hoc Explainable Method for Tree Species Classification with Quantifiable Concepts https://arxiv.org/abs/2502.18844

Junxiao Ma, Jingjing Wang, Jiamin Luo, Peiying Yu, Guodong Zhou Sherlock: Towards Multi-scene Video Abnormal Event Extraction and Localization via a Global-local Spatial-sensitive LLM https://arxiv.org/abs/2502.18863

Akhil Penta, Vaibhav Adwani, Ankush Chopra Enhanced Transformer-Based Tracking for Skiing Events: Overcoming Multi-Camera Challenges, Scale Variations and Rapid Motion -- SkiTB Visual Tracking Challenge 2025 https://arxiv.org/abs/2502.18867

Youngtae Kim, Soonju Jeong, Sardar Arslan, Dhananjay Agnihotri, Yahya Ahmed, Ali Nawaz, Jinhee Song, Hyewon Kim Inscanner: Dual-Phase Detection and Classification of Auxiliary Insulation Using YOLOv8 Models https://arxiv.org/abs/2502.18871

Wanyi Li, Wei Wei, Yongkang Luo, Peng Wang Brain-inspired analogical mixture prototypes for few-shot class-incremental learning https://arxiv.org/abs/2502.18923

D. Hareb, J. Martinet, B. Miramond Enhanced Neuromorphic Semantic Segmentation Latency through Stream Event https://arxiv.org/abs/2502.18982

Anju Rani, Daniel O. Arroyo, Petar Durdevic FungalZSL: Zero-Shot Fungal Classification with Image Captioning Using a Synthetic Data Approach https://arxiv.org/abs/2502.19038

Vu Tuan Truong Long, Bao Le A Dual-Purpose Framework for Backdoor Defense and Backdoor Amplification in Diffusion Models https://arxiv.org/abs/2502.19047

Tresor Y. Koffi, Youssef Mourchid, Mohammed Hindawi, Yohan Dupuis An Improved 3D Skeletons UP-Fall Dataset: Enhancing Data Quality for Efficient Impact Fall Detection https://arxiv.org/abs/2502.19048

Huiqiang Wang, Mingchen Song, Guoqiang Zhong Dynamic Degradation Decomposition Network for All-in-One Image Restoration https://arxiv.org/abs/2502.19068

Qingyao Tian, Huai Liao, Xinyan Huang, Bingyu Yang, Dongdong Lei, Sebastien Ourselin, Hongbin Liu EndoMamba: An Efficient Foundation Model for Endoscopic Videos https://arxiv.org/abs/2502.19090

Edward G. A. Henderson, Marcel van Herk, Andrew F. Green, Eliana M. Vasquez Osorio An anatomically-informed correspondence initialisation method to improve learning-based registration for radiotherapy https://arxiv.org/abs/2502.19101

Tianle Yang, Luyao Chang, Jiadong Yan, Juntao Li, Zhi Wang, Ke Zhang A Survey on Foundation-Model-Based Industrial Defect Detection https://arxiv.org/abs/2502.19106

Ziyuan Luo, Anderson Rocha, Boxin Shi, Qing Guo, Haoliang Li, Renjie Wan The NeRF Signature: Codebook-Aided Watermarking for Neural Radiance Fields https://arxiv.org/abs/2502.19125

Junlong Ren, Hao Wu, Hui Xiong, Hao Wang SCA3D: Enhancing Cross-modal 3D Retrieval via 3D Shape and Caption Paired Data Augmentation https://arxiv.org/abs/2502.19128

Xuan Ding, Yao Zhu, Yunjian Zhang, Chuanlong Xie A Sliding Layer Merging Method for Efficient Depth-Wise Pruning in LLMs https://arxiv.org/abs/2502.19159

Anton Backhaus, Thorsten Luettel, Mirko Maehlisch Knowledge Distillation for Semantic Segmentation: A Label Space Unification Approach https://arxiv.org/abs/2502.19177

Bernardin Tamo Amougou, Marcelo Pereyra, Barbara Pascal Self-supervised conformal prediction for uncertainty quantification in Poisson imaging problems https://arxiv.org/abs/2502.19194

Linshan Jia EGR-Net: A Novel Embedding Gramian Representation CNN for Intelligent Fault Diagnosis https://arxiv.org/abs/2502.19199

Zekang Weng, Jinjin Shi, Jinwei Wang, Zeming Han HDM: Hybrid Diffusion Model for Unified Image Anomaly Detection https://arxiv.org/abs/2502.19200

Xiankang He, Dongyan Guo, Hongji Li, Ruibo Li, Ying Cui, Chi Zhang Distill Any Depth: Distillation Creates a Stronger Monocular Depth Estimator https://arxiv.org/abs/2502.19204

Nikita Shvetsov, Thomas K. Kilvaer, Masoud Tafavvoghi, Anders Sildnes, Kajsa Møllersen, Lill-Tove Rasmussen Busund, Lars Ailo Bongo A Lightweight and Extensible Cell Segmentation and Classification Model for Whole Slide Images https://arxiv.org/abs/2502.19217

Tharindu Samarakoon, Kalana Abeywardena, Chamira U. S. Edussooriya Arbitrary Volumetric Refocusing of Dense and Sparse Light Fields https://arxiv.org/abs/2502.19238

Qihang Peng, Henry Zheng, Gao Huang ProxyTransformation: Preshaping Point Cloud Manifold With Proxy Attention For 3D Visual Grounding https://arxiv.org/abs/2502.19247

Nadya Abdel Madjid, Murad Mebrahtu, Abdelmoamen Nasser, Bilal Hassan, Naoufel Werghi, Jorge Dias, Majid Khonji EMT: A Visual Multi-Task Benchmark Dataset for Autonomous Driving in the Arab Gulf Region https://arxiv.org/abs/2502.19260

Jiawei Kong, Hao Fang, Sihang Guo, Chenxi Qing, Bin Chen, Bin Wang, Shu-Tao Xia Neural Antidote: Class-Wise Prompt Tuning for Purifying Backdoors in Pre-trained Vision-Language Models https://arxiv.org/abs/2502.19269

Ruben T. Lucassen, Tijn van de Luijtgaarden, Sander P. J. Moonemans, Gerben E. Breimer, Willeke A. M. Blokx, Mitko Veta On the Importance of Text Preprocessing for Multimodal Representation Learning and Pathology Report Generation https://arxiv.org/abs/2502.19285

Ruben T. Lucassen, Sander P. J. Moonemans, Tijn van de Luijtgaarden, Gerben E. Breimer, Willeke A. M. Blokx, Mitko Veta Pathology Report Generation and Multimodal Representation Learning for Cutaneous Melanocytic Lesions https://arxiv.org/abs/2502.19293

Zhe Wang, Shaocong Xu, Xucai Zhuang, Tongda Xu, Yan Wang, Jingjing Liu, Yilun Chen, Ya-Qin Zhang CoopDETR: A Unified Cooperative Perception Framework for 3D Detection via Object Query https://arxiv.org/abs/2502.19313

Rui Li, Qianfen Jiao, Wenming Cao, Hau-San Wong, Si Wu Model Adaptation: Unsupervised Domain Adaptation without Source Data https://arxiv.org/abs/2502.19316

Danae Sánchez Villegas, Ingo Ziegler, Desmond Elliott ImageChain: Advancing Sequential Image-to-Text Reasoning in Multimodal Large Language Models https://arxiv.org/abs/2502.19409

Rohit Saxena, Pasquale Minervini, Frank Keller PosterSum: A Multimodal Benchmark for Scientific Poster Summarization https://arxiv.org/abs/2502.17540

Cito Balsells, Beatrice Riviere, David Fuentes A Priori Generalizability Estimate for a CNN https://arxiv.org/abs/2502.17622