Day Number ID Title Authors Topic
Tuesday 1 194 mmPoint: Dense Human Point Cloud Generation from mmWave Qian Xie (University of Oxford)*; Qianyi Deng (University of Oxford); Ta-Ying Cheng (University of Oxford); Peijun Zhao (Massachusetts Institute of Technology); Amir Patel (University of Cape Town); Niki Trigoni (University of Oxford); Andrew Markham (University of Oxford) 3D from a single image and shape-from-x
2 356 Lightweight Self-Supervised Depth Estimation with few-beams LiDAR Data Rizhao Fan (University of Bologna)*; Fabio Tosi (University of Bologna); Matteo Poggi (University of Bologna); Stefano Mattoccia (University of Bologna) 3D from a single image and shape-from-x
3 174 Sparse Multi-Object Render-and-Compare Florian Maximilian Langer (Department of Engineering, University of Cambridge)*; Ignas Budvytis (Department of Engineering, University of Cambridge); Roberto Cipolla (University of Cambridge) 3D from a single image and shape-from-x
4 90 Floorplan Restoration by Structure Hallucinating Transformer Cascades Sepidehsadat Hosseini (Simon Fraser University)*; Yasutaka Furukawa (Simon Fraser University) 3D from multi-view and sensors
5 89 Strong Stereo Features for Self-Supervised Practical Stereo Matching Pierre-André Brousseau (Université de Montréal)*; Sebastien Roy (Université de Montréal) 3D from multi-view and sensors
6 501 Temporal Lidar Depth Completion Pietari Kaskela (NVIDIA)*; Philipp Fischer (NVIDIA); Timo Roman (NVIDIA) 3D from multi-view and sensors
7 15 The Interstate-24 3D Dataset: a new benchmark for 3D multi-camera vehicle tracking Derek Gloudemans (Vanderbilt University)*; Daniel Work (Vanderbilt University); Yanbing Wang (Vanderbilt University); Gracie E Gumm (Vanderbilt University); William Barbour (Vanderbilt University) 3D from multi-view and sensors
8 448 Optimal Camera Configuration for Large-Scale Motion Capture Systems Xiongming Dai (Louisiana State University)*; Gerald Baumgartner (Louisiana State University) 3D from multi-view and sensors
9 682 ManifoldNeRF: View-dependent Image Feature Supervision for Few-shot Neural Radiance Fields Daiju Kanaoka (Kyushu Institute of Technology)*; Motoharu Sonogashira (RIKEN); Hakaru Tamukoh (Kyushu Institute of Technology); Yasutomo Kawanishi (RIKEN) 3D from multi-view and sensors
10 741 Motion-Bias-Free Feature-Based SLAM Alejandro Fontan (Queensland University of Technology)*; Michael Milford (ACRV and QUT, Australia); Javier Civera (Universidad de Zaragoza) 3D from multi-view and sensors
11 825 RoomNeRF: Representing Empty Room as Neural Radiance Fields for View Synthesis Mangyu Kong (Yonsei University)*; Seongwon Lee (Yonsei University); Euntai Kim (Yonsei University) 3D from multi-view and sensors
12 304 Learning Part Motion of Articulated Objects Using Spatially Continuous Neural Implicit Representations Yushi Du (Peking University)*; Ruihai Wu (Peking University); Yan Shen (Peking University); Hao Dong (Peking University) 3D Shape modeling and processing
13 231 Propose-and-Complete: Auto-regressive Semantic Group Generation for Personalized Scene Synthesis Shoulong Zhang (Beihang University); Shuai Li (Beihang University); Xinwei Huang (Beihang University); Wenchong Xu (Beihang University); Aimin Hao (Beihang University); Hong Qin (Stony Brook University)* 3D Shape modeling and processing
14 306 Point Cloud Sampling Preserving Local Geometry for Surface Reconstruction Kohei Matsuzaki (KDDI Research, Inc.)*; Keisuke Nonaka (KDDI Research, Inc.) 3D Shape modeling and processing
15 417 Deformation-Guided Unsupervised Non-Rigid Shape Matching Aymen Merrouche (INRIA)*; Joao Pedro Cova Regateiro (Interdigital); Stefanie Wuhrer (Inria); Edmond Boyer (Inria) 3D Shape modeling and processing
16 820 Proposal-based Temporal Action Localization with Point-level Supervision Yuan Yin (Institute of Industrial Science, The University of Tokyo)*; Yifei Huang (The University of Tokyo); Ryosuke Furuta (The University of Tokyo); Yoichi Sato (University of Tokyo) Action and event understanding
17 335 Supervised Contrastive Learning with Identity-Label Embeddings for Facial Action Unit Recognition Tangzheng Lian (Nottingham Trent University); David A Adama (Nottingham Trent University); Pedro Machado (Nottingham Trent University); Doratha E Vinkemeier (NTU)* Action and event understanding
18 332 Learning Temporal Sentence Grounding From Narrated EgoVideos Kevin Flanagan (University of Bristol)*; Dima Damen (University of Bristol); Michael Wray (University of Bristol) Action and event understanding
19 739 Robust Principles: Architectural Design Principles for Adversarially Robust CNNs ShengYun Peng (Georgia Institute of Technology)*; Weilin Xu (Intel); Cory Cornelius (Intel Corporation); Matthew Hull (Georgia Institute of Technology); Kevin Li (Georgia Institute of Technology); Rahul Duggal (Georgia Tech); Mansi Phute (Georgia Institute of Technology); Jason Martin (Intel Corporation); Duen Horng Chau (Georgia Institute of Technology) Adversarial attack and defense
20 172 Backdoor Attack on Hash-based Image Retrieval via Clean-label Data Poisoning Kuofeng Gao (Tsinghua University)*; Jiawang Bai (Tsinghua University); Bin Chen (Harbin Institute of Technology, Shenzhen); Dongxian Wu (the University of Tokyo); Shu-Tao Xia (Tsinghua University) Adversarial attack and defense
21 406 Exploring Non-additive Randomness on ViT against Query-Based Black-Box Attacks Jindong Gu (University of Oxford)*; Fangyun Wei (Microsoft Research Asia); Philip Torr (University of Oxford); Han Hu (Microsoft Research Asia) Adversarial attack and defense
22 271 Semantic Adversarial Attacks via Diffusion Models Chenan Wang (Drexel University)*; Jinhao Duan (Drexel University); Chaowei Xiao (ASU); Edward Kim (Drexel University); Matthew C Stamm (Drexel University); Kaidi Xu (Drexel University) Adversarial attack and defense
23 296 RBFormer: Robust Bias Can Improve the Adversarial Robust of Transformer-based Structure Hao Cheng (The Hong Kong University of Science and Technology (Guangzhou))*; Jinhao Duan (Drexel University); Hui Li (Samsung Research and Development Institute China Xi'an); Lyutianyang Zhang (University of Washington); Jiahang Cao (The Hong Kong University of Science and Technology (Guangzhou)); Ping Wang (Xi'an Jiaotong University); Jize Zhang (HKUST); Kaidi Xu (Drexel University); Renjing Xu (The Hong Kong University of Science and Technology (Guangzhou)) Adversarial attack and defense
24 486 ADoPT: LiDAR Spoofing Attack Detection based on Point-Level Temporal Consistency Minkyoung Cho (University of Michigan)*; Yulong Cao (Nvidia); Zixiang Zhou (University of Michigan); Zhuoqing Morley Mao (University of Michigan) Adversarial attack and defense
25 620 Unifying the Harmonic Analysis of Adversarial Attacks and Robustness Shishira R R Maiya (University of Maryland)*; Max Ehrlich (NVIDIA); Vatsal Agarwal (University of Maryland); Ser-Nam Lim (Meta AI); Tom Goldstein (University of Maryland); Abhinav Shrivastava (University of Maryland) Adversarial attack and defense
26 781 Adaptive Adversarial Norm Space for Efficient Adversarial Training Hui Kuurila-Zhang (University of Oulu)*; Haoyu Chen (University of Oulu); Guoying Zhao (University of Oulu) Adversarial attack and defense
27 382 Fully Quantum Auto-Encoding of 3D Shapes Lakshika Rathi (Indian Institute of Technology Delhi); Edith Tretschk (Max-Planck-Institut für Informatik)*; Christian Theobalt (MPI Informatik); Rishabh Dabral (IIT Bombay); Vladislav Golyanik (MPI for Informatics) Brave new ideas
28 677 MFSC: Matching by Few-Shot Classification Daniel Shalam (University of Haifa); Elie Abboud (University of Haifa); Roee Litman (-); Simon Korman (University of Haifa)* Brave new ideas
29 822 Differentiable SLAM Helps Deep Learning-based LiDAR Perception Tasks Prashant Kumar (Indian Institute of Technology, Delhi)*; Dheeraj Vattikonda (McGill University); Vedang Bhupesh Shenvi Nadkarni (Birla Institute of Technology and Science, Pilani); Erqun Dong (McGill University); Sabyasachi Sahoo (Université Laval, Mila) Brave new ideas
30 643 Color Constancy: How to Deal with Camera Bias? Yi-Tun Lin (University of East Anglia)*; Bianjiang Yang (Purdue University); Hao Xie (Meta Platforms, Inc.); Wenbin Wang (Meta); Honghong Peng (Meta); JUN HU (Apple Inc) Computational Photography
31 348 RGB and LUT based Cross Attention Network for Image Enhancement Tengfei Shi (Beihang University); Chenglizhao Chen (China University of Petroleum (East China))*; Yuanbo He (State Key Laboratory of Virtual Reality Technology and Systems, Beihang University); Wenfeng Song (Beijing Information Science and Technology University); Aimin Hao (Beihang University) Computational Photography
32 315 Generalized Imaging Augmentation via Linear Optimization of Neurons Daoyu Li (Beijing Institute of Technology); Lu Li (Beijing Institute of Technology); Bin Li (Beijing University of Posts and Telecommunications); Liheng Bian (Beijing Institute of Technology)* Computational Photography
33 765 Reconstructing Synthetic Lensless Images in the Low-Data Regime Abeer Banerjee (CSIR-CEERI)*; Himanshu Kumar (CSIR-CEERI); Sumeet Saurav (CSIR-CEERI); Sanjay Singh (CSIR-CEERI, Pilani ) Computational Photography
34 286 Lightweight Image Super-Resolution with Scale-wise Network Xiaole Zhao (School of Computing and Artificial Intelligence, Southwest Jiaotong University); Xinkun Wu (School of Computing and Artificial Intelligence, Southwest Jiaotong University)* Computer vision theory
35 540 Sketch-based Video Object Segmentation: Benchmark and Analysis Ruolin Yang (Beijing University of Posts and Telecommunications)*; Da Li (Samsung); Conghui Hu (National University of Singapore); Timothy Hospedales (Edinburgh University); Honggang Zhang (Beijing University of Posts and Telecommunications); Yi-Zhe Song (University of Surrey) Datasets and Evaluation
36 870 Data exploitation: multi-task learning of object detection and semantic segmentation on partially annotated data Hoàng-Ân Lê (IRISA, University of South Brittany)*; Minh-Tan Pham (IRISA-UBS) Datasets and Evaluation
37 235 What Should be Balanced in a “Balanced” Dataset? Haiyu Wu (University of Notre Dame)*; Kevin Bowyer (University of Notre Dame) Datasets and Evaluation
38 127 SynthBlink and BlinkFormer: A Synthetic Dataset and Transformer-Based Method for Video Blink Detection Bo Liu (Beihang University); Yang Xu (Beihang University); Feng Lu (Beihang University)* Datasets and Evaluation
39 743 A Comprehensive Crossroad Camera Dataset to Improve Traffic Safety of Mobility Aid Users Ludwig Mohr (Institute of Computer Graphics and Vision, Graz University of Technology)*; Nadezda Kirillova (Graz University of Technology); Horst Possegger (Graz University of Technology); Horst Bischof (Graz University of Technology) Datasets and Evaluation
40 752 Learnable Data Augmentation for One-Shot Unsupervised Domain Adaptation Julio Ivan Davila Carrazco (Istituto Italiano di Tecnologia)*; Pietro Morerio (Istituto Italiano di Tecnologia); Alessio Del Bue (Istituto Italiano di Tecnologia (IIT)); Vittorio Murino (Istituto Italiano di Tecnologia) Deep learning architectures and techniques
41 660 G2N2: Lightweight Event Stream Classification with GRU Graph Neural Networks Thomas Mesquida (CEA LIST)*; Manon Dampfhoffer (SPINTEC University Grenoble Alpes); Thomas Dalgaty (CEA List); Pascal Vivet (CEA-LIST); Amos Sironi (PROPHESEE); Christoph Posch (PROPHESEE) Deep learning architectures and techniques
42 709 Momentum Adapt: Robust Unsupervised Adaptation for Improving Temporal Consistency in Video Semantic Segmentation During Test-Time Amirhossein Hassankhani (Tampere University)*; Hamed Rezazadegan Tavakoli (Nokia Technologies); Esa Rahtu (Tampere University) Deep learning architectures and techniques
43 123 Knowledge Distillation Layer that Lets the Student Decide Ada Gorgun (Middle East Technical University)*; Yeti Z. Gurbuz (Technische Universität Berlin); Aydin Alatan (Middle East Technical University, Turkey) Deep learning architectures and techniques
44 237 Spatio-Temporal MLP-Graph Network for 3D Human Pose Estimation Md. Tanvir Hassan (Concordia University); Abdessamad Ben Hamza (Concordia University)* Deep learning architectures and techniques
45 360 FLRKD: Relational Knowledge Distillation Based on Channel-wise Feature Quality Assessment Zeyu An (University of Electronic Science and Technology of China)*; Changjian Deng (University of Electronic Science and Technology of China); Wanli Dang (University of Electronic Science and Technology of China; The Second Research Institute of the Civil Aviation Administration of China); Zhicheng Dong (Tibet University); Qian Luo (The Second Research Institute of the Civil Aviation Administration of China); Jian Cheng (University of Electronic Science and Technology of China) Deep learning architectures and techniques
46 792 Budding Ensemble Architecture: Revisiting anchor-based object detection DNN Qutub Syed (INTEL LABS)*; Neslihan Kose Cihangir (Intel Deutschland GmbH); Rafael Rosales (Intel); Michael Paulitsch (Intel); Korbinian Hagn (Intel); Florian R Geissler (Intel); Yang Peng (Intel); Gereon Hinz (STTech GmbH); Alois C. Knoll (Robotics and Embedded Systems) Deep learning architectures and techniques
47 893 VADOR: Real World Video Anomaly Detection with Object Relations and Action Halil İbrahim Öztürk (Togg)*; Ahmet Burak Can (Hacettepe University) Deep learning architectures and techniques
48 911 Masked Attention ConvNeXt Unet with Multi-Synthetic Dynamic Weighting for Anomaly Detection and Localization SHIH CHIH LIN (National Tsing Hua University)*; Ho Weng Lee (National Tsing Hua University); Yu-Shuan Hsieh (National Tsing Hua University); Cheng Yu Ho (National Tsing Hua University); Shang-Hong Lai (National Tsing Hua University) Deep learning architectures and techniques
49 204 Cardiac Landmark Detection using Generative Adversarial Networks from Cardiac MR Images Aparna Kanakatte (TCS)*; DIVYA M BHATIA (TCS); Pavan Kumar Reddy K (TCS Research); Jayavardhana Gubbi (TCS Research); Avik Ghose (TCS) Deep learning architectures and techniques
50 514 DFFG: Fast Gradient Iteration for Data-free Quantization Huixing Leng (Beihang University); Shuangkang Fang (Megvii, BUAA); Yufeng Wang (Beihang University)*; Zehao Zhang (Beihang University); Qi Dacheng (Beijing Jiaotong University); Wenrui Ding (Beihang University) Deep learning architectures and techniques
51 522 Train ViT on Small Dataset With Translation Perceptibility Chen Huan (Institute of Computing Technology)*; Ping Yao (Institute of Computing Technology, Chinese Academy of Sciences); Wentao Wei (Southeast University) Deep learning architectures and techniques
52 665 Distillation for High-Quality Knowledge Extraction via Explainable Oracle Approach MyungHak Lee (Kookmin University)*; Wooseong Syz Cho (Kookmin University); Sungsik Kim (Kookmin University); Jinkyu Kim (Korea University); Jaekoo Lee (Kookmin University) Deep learning architectures and techniques
53 846 Topology-Preserving Hard Pixel Mining for Tubular Structure Segmentation Guoqing Zhang (Tsinghua-Berkeley Shenzhen Institute, Tsinghua University)*; Caixia Dong (The Second Affiliated Hospital of Xi'an Jiaotong University); Yang Li (Tsinghua-Berkeley Shenzhen Institute, Tsinghua University) Deep learning architectures and techniques
54 854 A Forward-backward Learning strategy for CNNs via Separation Index Maximizing at the First Convolutional Layer Ali Karimi (University of Tehran); Ahmad Kalhor (University of Tehran)*; Mona Ahmadian (University of Surrey) Deep learning architectures and techniques
55 214 Understanding Gaussian Attention Bias of Vision Transformers Using Effective Receptive Fields Bum Jun Kim (POSTECH); Hyeyeon Choi (POSTECH); Hyeonah Jang (POSTECH); Sang Woo Kim (POSTECH)* Deep learning architectures and techniques
56 295 LOCATE: Self-supervised Object Discovery via Flow-guided Graph-cut and Bootstrapped Self-training Silky Singh (Adobe Systems)*; Shripad V Deshmukh (Adobe); Mausoom Sarkar (Adobe); Balaji Krishnamurthy () Deep learning architectures and techniques
57 562 Fiducial Focus Augmentation for Facial Landmark Detection Purbayan Kar (Sony Research India); Vishal M Chudasama (Sony Research India); Naoyuki Onoe (Sony); Pankaj Wasnik (Sony Research India)*; Vineeth Balasubramanian (Indian Institute of Technology Hyderabad) Deep learning architectures and techniques
58 268 Lips-SpecFormer: Non-Linear Interpolable Transformer for Spectral Reconstruction using Adjacent Channel Coupling Abhishek Kumar Sinha (Indian Space Research Organization)*; Manthira Moorthi S (ISRO) Deep learning architectures and techniques
59 521 Selective Scene Text Removal Hayato Mitani (Kyushu University)*; Akisato Kimura (NTT Corporation); Seiichi Uchida (Kyushu University) Document analysis and understanding
60 7 HWD: A Novel Evaluation Score for Styled Handwritten Text Generation Vittorio Pippi (University of Modena and Reggio Emilia)*; Fabio Quattrini (University of Modena and Reggio Emilia); Silvia Cascianelli (Università di Modena e Reggio Emilia); Rita Cucchiara (Università di Modena e Reggio Emilia) Document analysis and understanding
61 511 McQueen: Mixed Precision Quantization of Early Exit Networks Utkarsh Saxena (Purdue University)*; Kaushik Roy (Purdue University) Efficient and scalable vision
62 345 Region-aware Knowledge Distillation for Efficient Image-to-Image Translation Linfeng Zhang (Tsinghua University)*; Xin Chen (Intel Corp.); Runpei Dong (Xi'an Jiaotong University); Kaisheng Ma (Tsinghua University) Efficient and scalable vision
63 744 CoordGate: Efficiently Computing Spatially-Varying Convolutions in Convolutional Neural Networks Sunny Howard (University of Oxford)*; Peter Norreys (University of Oxford); Andreas Döpp (LMU Munich) Efficient and scalable vision
64 307 RUPQ: Improving low-bit quantization by equalizing relative updates of quantization parameters Valentin Buchnev (Huawei Technologies Co. Ltd.)*; Jiao He (huawei company); Fengyu Sun (Huawei); Ivan Koryakovskiy (Huawei Technologies Co., Ltd.) Efficient and scalable vision
65 889 DeepliteRT: Computer Vision at the Edge Saad Ashfaq (Deeplite)*; Alexander Hoffman (McGill University); SAPTARSHI MITRA (Deeplite Inc.); Ehsan Saboori (Deeplite Inc.); Sudhakar Sah (Deeplite Inc); MohammadHossein AskariHemmat (Polytechnique Montreal) Efficient and scalable vision
Wednesday 66 622 Teaching AI to Teach: Leveraging Limited Human Salience Data Into Unlimited Saliency-Based Training Colton R Crum (University of Notre Dame)*; Aidan Boyd (University of Notre Dame); Kevin Bowyer (University of Notre Dame); Adam Czajka (University of Notre Dame) Biometrics
67 498 CERiL: Continuous Event-based Reinforcement Learning Celyn Walters (University of Surrey); Simon Hadfield (University of Surrey)* Embodied vision: Active agents; simulation
68 703 Foveation in the Era of Deep Learning Gerardo Aragon-Camarasa (University of Glasgow); George W Killick (University of Glasgow)*; Paul Henderson (University of Glasgow); Jan Paul Siebert (University of Glasgow) Embodied vision: Active agents; simulation
69 828 SCAAT: Improving Neural Network Interpretability via Saliency Constrained Adaptive Adversarial Training Rui Xu (Peking University); Wenkang Qin (Peking University); Peixiang Huang (Peking University); Hao Wang (National Institutes for Food and Drug Control); Lin Luo (Peking University)* Explainable AI
70 188 Diverse Explanations for Object Detectors with Nesterov-Accelerated iGOS++ Mingqi Jiang (Oregon State University)*; Saeed Khorram (Oregon State University); Li Fuxin (Oregon State University) Explainable AI
71 40 Learning a Pedestrian Social Behavior Dictionary Faith M Johnson (Rutgers University)*; Kristin Dana (Rutgers University) Explainable AI
72 207 Embedding Human Knowledge into Spatio-Temporal Attention Branch Network in Video Recognition via Temporal attention Saki Noguchi (Chubu University)*; Yuzhi Shi (Chubu University); Tsubasa Hirakawa (Chubu University); Takayoshi Yamashita (Chubu University); Hironobu Fujiyoshi (Chubu University) Explainable AI
73 761 Laughing Matters: Introducing Audio-Driven Laughing-Face Generation with Diffusion Models Antoni Bigata Casademunt (Imperial College London)*; Rodrigo Mira (Imperial College London); Nikita Drobyshev (Imperial College London); Konstantinos Vougioukas (Imperial College London); Stavros Petridis (Imperial College London); Maja Pantic (Facebook / Imperial College London ) Faces and gestures
74 146 Learning Separable Hidden Unit Contributions for Speaker-Adaptive Visual Speech Recognition Songtao Luo (Institute of Computing Technology, Chinese Academy of Sciences)*; Shuang Yang (ICT, CAS); Shiguang Shan (Institute of Computing Technology, Chinese Academy of Sciences); Xilin Chen (Institute of Computing Technology, Chinese Academy of Sciences) Faces and gestures
75 190 UniLip: Learning Visual-Textual Mapping with Uni-Modal Data for Lip Reading Bingquan Xia (Institute of Computing Technology, Chinese Academy of Sciences)*; Shuang Yang (ICT, CAS); Shiguang Shan (Institute of Computing Technology, Chinese Academy of Sciences); Xilin Chen (Institute of Computing Technology, Chinese Academy of Sciences) Faces and gestures
76 470 KFC: Kinship Verification with Fair Contrastive loss and Multi-Task Learning Jia Luo Peng (National Tsing Hua University)*; Keng Wei Chang (National Tsing Hua University); Shang-Hong Lai (National Tsing Hua University) Faces and gestures
77 98 Prompting Visual-Language Models for Dynamic Facial Expression Recognition Zengqun Zhao (Queen Mary University of London)*; Ioannis Patras (Queen Mary University of London) Faces and gestures
78 287 EventFormer: AU Event Transformer for Facial Action Unit Event Detection Yingjie Chen (Peking University)*; Jiarui Zhang (Peking University); Tao Wang (Peking University); Yun Liang (Peking University) Faces and gestures
79 230 De-identification of facial videos while preserving remote physiological utility Marko Radisa Savic (University of Oulu)*; Guoying Zhao (University of Oulu) Fairness, privacy, ethics, social-good, transparency, accountability in vision
80 629 Biased Attention: Do Vision Transformers Amplify Gender Bias More than Convolutional Neural Networks? Abhishek Mandal (Dublin City University)*; Susan Leavy (University College Dublin); Suzanne Little (Dublin City University, Ireland) Fairness, privacy, ethics, social-good, transparency, accountability in vision
81 799 Discriminative Adversarial Privacy: Balancing Accuracy and Membership Privacy in Neural Networks Eugenio Lomurno (Politecnico di Milano)*; Alberto Archetti (Politecnico di Milano); Francesca Ausonio (Politecnico di Milano); Matteo Matteucci (Politecnico di Milano) Fairness, privacy, ethics, social-good, transparency, accountability in vision
82 143 Sparse and Privacy-enhanced Representation for Human Pose Estimation Ting-Ying Lin (National Tsing Hua University)*; Lin-Yung Hsieh (National Tsing Hua University); Fu-En Wang (National Tsing Hua University); Wen-Shen Wuen (Novatek Microelectronics Corp.); Min Sun (NTHU) Human pose/shape estimation
83 763 BoIR: Box-Supervised Instance Representation for Multi Person Pose Estimation Uyoung Jeong (Ulsan National Institute of Science and Technology)*; Seungryul Baek (UNIST); Hyung Jin Chang (University of Birmingham); Kwang In Kim (POSTECH) Human pose/shape estimation
84 664 Stream-based Active Learning by Exploiting Temporal Properties in Perception with Temporal Predicted Loss Sebastian Schmidt (BMW)*; Stephan Günnemann (Technical University of Munich) Human-in-the-loop computer vision
85 798 Cascade Sparse Feature Propagation Network for Interactive Segmentation Chuyu Zhang (PLUS Lab, Shanghaitech University)*; Hui Ren (ShanghaiTech); ChuanYang Hu (PLUS Lab, Shanghaitech University); Yongfei Liu (ShanghaiTech); Xuming He (ShanghaiTech University) Human-in-the-loop computer vision
86 337 Active Learning for Fine-Grained Sketch-Based Image Retrieval Himanshu Thakur (Carnegie Mellon University); Soumitri Chattopadhyay (Jadavpur University)* Human-in-the-loop computer vision
87 718 Self-supervised Adversarial Training for Robust Face Forgery Detection Yueying Gao (Communication University of China)*; Weiguo Lin (Communication University of China); Junfeng Xu (Communication University of China); Wanshan Xu (Communication University of China); Peibin Chen (Communication University of China) Image and video forensics
88 379 Test-Time Adaptation for Robust Face Anti-Spoofing Pei-Kai Huang (National Tsing Hua University)*; Chen-Yu Lu (National Tsing Hua University); Shu-Jung Chang (National Tsing Hua University); Jun-Xiong Chong (National Tsing Hua University); Chiou-Ting Hsu (National Tsing Hua University) Image and video forensics
89 659 Open Set Synthetic Image Source Attribution Shengbang Fang (Drexel University)*; Tai D Nguyen (Drexel University); Matthew C Stamm (Drexel University) Image and video forensics
90 595 Face Aging via Diffusion-based Editing Xiangyi Chen (Télécom Paris, Shanghai Jiao Tong University)*; Stéphane Lathuilière (Telecom-Paris) Image and Video Synthesis
91 236 Temporal-controlled Frame Swap for Generating High-Fidelity Stereo Driving Data for Autonomy Analysis Yedi Luo (Northeastern University); Xiangyu Bai (Northeastern University); Jiang Le (Northeastern University ); Aniket Gupta (Northeastern University); Eric C Mortin (US Army DEVCOM Analysis Center); Hanumant Singh (Northeastern University); Sarah Ostadabbas (Northeastern University)* Image and Video Synthesis
92 16 VETIM: Expanding the Vocabulary of Text-to-Image Models only with Text Martin Nicolas Everaert (EPFL)*; Radhakrishna Achanta (EPFL); Marco Bocchio (Largo.ai); Sami Arpa (Largo.ai); Sabine Süsstrunk (EPFL) Image and Video Synthesis
93 258 A Structure-Guided Diffusion Model for Large-Hole Image Completion Daichi Horita (The University of Tokyo)*; Jiaolong Yang (Microsoft Research); Dong Chen (Microsoft Research Asia); Yuki Koyama (National Institute of Advanced Industrial Science and Technology (AIST)); Kiyoharu Aizawa (The University of Tokyo); Nicu Sebe (University of Trento) Image and Video Synthesis
94 103 Video Infilling with Rich Motion Prior Xinyu Hou (Nanyang Technological University)*; Liming Jiang (Nanyang Technological University); Rui Shao (Harbin Institute of Technology (Shenzhen)); Chen Change Loy (Nanyang Technological University) Image and Video Synthesis
95 274 Frequency-consistent Optimization for Image Enhancement Networks Bing Li (University of Science and Technology of China)*; Naishan Zheng (University of Science and Technology of China); Qi Zhu (University of Science and Technology of China); Jie Huang (University of Science and Technology of China); Feng Zhao (University of Science and Technology of China) Low-level and Physics-based Vision
96 46 Joint Low-light Enhancement and Super Resolution with Image Underexposure Level Guidance Mingjie Xu (Beihang University); Chaoqun Zhuang (Beihang University); Feifan Lv (Beihang University); Feng Lu (Beihang University)* Low-level and Physics-based Vision
97 674 Estimating Absorption Coefficient from a Single Image via Entropy Minimization Junya Katahira (Kyushu Institute of Technology); Ryo Kawahara (Kyushu Institute of Technology); Takahiro Okabe (Kyushu Institute of Technology)* Low-level and Physics-based Vision
98 149 Five A+ Network: You Only Need 9K Parameters for Underwater Image Enhancement JingXia Jiang (Jimei University); Tian Ye (The Hong Kong University of Science and Technology (Guangzhou))*; Sixiang Chen (The Hong Kong University of Science and Technology (Guangzhou)); Erkang Chen (Jimei University); Yun Liu (Southwest University); Shi Jun (XinJiang University); Jinbin Bai (Nanjing University); Wenhao Chai (University of Washington) Low-level and Physics-based Vision
99 775 Towards Clip-Free Quantized Super-Resolution Networks: How to Tame Representative Images Alperen Kalay (Aselsan Research)*; Bahri Batuhan Bilecen (Aselsan Research); Mustafa Ayazoglu (Aselsan Research) Low-level and Physics-based Vision
100 358 RawSeg: Grid Spatial and Spectral Attended Semantic Segmentation Based on Raw Bayer Images Guoyu Lu (University of Georgia)* Low-level and Physics-based Vision
101 635 Log RGB Images Provide Invariance to Intensity and Color Balance Variation for Convolutional Networks Bruce A Maxwell (Northeastern University)*; Sumegha Singhania (Northeastern University); Heather Fryling (Northeastern University); Haonan Sun (Northeastern University) Low-level and Physics-based Vision
102 283 MG-MLP: Multi-gated MLP for Restoring Images from Spatially Variant Degradations Jaihyun Koh (Samsung Display)*; Jaihyun Lew (Seoul National University); Jangho Lee (Incheon National University); Sungroh Yoon (Seoul National University) Low-level and Physics-based Vision
103 538 Learning Disentangled Representations for Environment Inference in Out-of-distribution Generalization Dongqi Li (Beijing Jiaotong University); Zhu Teng (Beijing Jiaotong University); Li Qirui (AFCtech); Wang Ziyin (AFCtech); Baopeng Zhang (BJTU)*; Jianping Fan (Lenovo) Machine learning (other than deep learning)
104 750 A2V: A Semi-Supervised Domain Adaptation Framework for Brain Vessel Segmentation via Two-Phase Training Angiography-to-Venography Translation Francesco Galati (EURECOM)*; Daniele Falcetta (EURECOM); Rosa Cortese (University of Siena); Barbara Casolla (CHU Nice); Ferran Prados (University College London); Ninon Burgos (CNRS - Paris Brain Institute); Maria A. Zuluaga (EURECOM) Medical and biological vision; cell microscopy
105 409 AGMDT: Virtual Staining of Renal Histology Images with Adjacency-Guided Multi-Domain Transfer Tao Ma (Peking University)*; Chao Zhang (Peking University); MIN LU (Peking University Health Science Center); Lin Luo (Peking University) Medical and biological vision; cell microscopy
106 84 Spatial and Planar Consistency for Semi-Supervised Volumetric Medical Image Segmentation Yanfeng Zhou (Institute of Automation, Chinese Academy of Sciences); Yiming Huang (Institute of Automation, Chinese Academy of Sciences); Ge Yang (National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences)* Medical and biological vision; cell microscopy
107 699 Variational Autoencoders for Feature Exploration and Malignancy Prediction of Lung Lesions Ben Keel (University of Leeds)*; Samuel D. Relton (University of Leeds); Aaron Quyn (University of Leeds); David Jayne (University of Leeds) Medical and biological vision; cell microscopy
108 99 ReSynthDetect: A Fundus Anomaly Detection Network with Reconstruction and Synthetic Features Jingqi Niu (Shanghai Jiao Tong University)*; Qinji Yu (Shanghai Jiao Tong University); Shiwen Dong (Shanghai Jiao Tong University); Zilong Wang (Voxelcloud); Kang Dang (Voxelcloud Inc); Xiaowei Ding (Shanghai Jiao Tong University) Medical and biological vision; cell microscopy
109 411 LACFormer: Toward accurate and efficient polyp segmentation Quan Van Nguyen (R&D Lab, Sun Asterisk Inc Vietnam)*; Mai Nguyen (R&D Lab, Sun Asterisk Inc Vietnam); Thanh Tung Nguyen (Sun Asterisk Vietnam); Huy Trịnh Quang (Sun-asterisk); Toan Pham Van (R&D Lab, Sun Asterisk Inc Vietnam); Linh Bao Doan (Sun* Inc.) Medical and biological vision; cell microscopy
110 789 Multi-Stain Self-Attention Graph Multiple Instance Learning Pipeline for Histopathology Whole Slide Images Amaya Gallagher-Syed (Queen Mary University of London)*; Luca Rossi (The Hong Kong Polytechnic University); Felice Rivellese (Queen Mary University of London); Costantino Pitzalis (Queen Mary University of London); Myles Lewis (Queen Mary University of London); Michael Barnes (Queen Mary University of London); Gregory Slabaugh (Queen Mary University of London) Medical and biological vision; cell microscopy
111 462 Enhance Regional Wall Segmentation by Style Transfer for Regional Wall Motion Assessment Kaikai Liu (Northwest A&F University); Yiyu Shi (University of Notre Dame); Jian Zhuang (Guangdong Provincial People's Hospital); Meiping Huang (Guangdong Provincial People's Hospital); Hongwen Fei (Guangdong Provincial People's Hospital); Boyang Li (Meta); Jin Hong (Guangdong Provincial People's Hospital); Qing Lu (University of Notre Dame); Erlei Zhang (Northwest A&F University); Xiaowei Xu (Guangdong Provincial People's Hospital)* Medical and biological vision; cell microscopy
112 467 Cross-Modal Attention for Accurate Pedestrian Trajectory Prediction Mayssa ZAIER (IMT NORD EUROPE)*; Hazem Wannous (University of Lille); Hassen Drira (University of Strasbourg); Jacques boonaert (imt lille douai) Motion estimation and tracking
113 603 READMem: Robust Embedding Association for a Diverse Memory in Unconstrained Video Object Segmentation Stephane Vujasinovic (Fraunhofer IOSB)*; Sebastian W Bullinger (Fraunhofer IOSB); Stefan Becker (Fraunhofer IOSB); Norbert Scherer-Negenborn (Fraunhofer IOSB); Michael Arens (Fraunhofer IOSB); Rainer Stiefelhagen (Karlsruhe Institute of Technology) Motion estimation and tracking
114 441 EgoFlowNet: Non-Rigid Scene Flow from Point Clouds with Ego-Motion Support Ramy Battrawy (DFKI)*; René Schuster (DFKI); Didier Stricker (DFKI) Motion estimation and tracking
115 813 Learning Tri-modal Embeddings for Zero-Shot Soundscape Mapping Subash Khanal (Washington University in Saint Louis)*; Srikumar Sastry (Washington University in St. Louis); Aayush Dhakal (Washington University in St Louis); Nathan Jacobs (Washington University in St. Louis) Multimodal learning
116 624 Text-to-Motion Synthesis using Discrete Diffusion Model Ankur Chemburkar (USC Institute for Creative Technologies)*; Shuhong Lu (USC Institute for Creative Technologies); Andrew Feng (USC Institute for Creative Technologies) Multimodal learning
117 460 AMA: Adaptive Memory Augmentation for Enhancing Image Captioning Shuang Cheng (Institute of Computing Technology, Chinese Academy of Sciences)*; Jian Ye (Institute of Computing Technology, CAS) Multimodal learning
118 326 X-PDNet: Accurate Joint Plane Instance Segmentation and Monocular Depth Estimation with Cross-Task Attention and Boundary Correction Duc Cao Dinh (Computer Vision Lab, Hanyang University)*; Jongwoo Lim (Hanyang University) Multimodal learning
119 381 Zero-shot Composed Text-Image Retrieval Yikun Liu (Beijing University of Posts and Telecommunications)*; Jiangchao Yao (Cooperative Medianet Innovation Center, Shanghai Jiao Tong University); Yan-Feng Wang (Cooperative Medianet Innovation Center, Shanghai Jiao Tong University); Ya Zhang (Cooperative Medianet Innovation Center, Shanghai Jiao Tong University); Weidi Xie (Shanghai Jiao Tong University) Multimodal learning
120 666 E2SAM: A Pipeline for Efficiently Extending SAM's Capability on Cross-Modality Data via Knowledge Inheritance Su sundingkai (Beijing University of Posts and Telecommunications); Mengqiu Xu (Beijing University of Posts and Telecommunications); Kaixin Chen (Beijing University of Posts and Telecommunications); Ming Wu (Beijing University of Posts and Telecommunications)*; Chuang Zhang (Beijing University of Posts and Telecommunications) Multimodal learning
121 478 Conditional Generation from Pre-Trained Diffusion Models using Denoiser Representations Alexandros Graikos (Stony Brook University)*; Srikar Yellapragada (Stony Brook University); Dimitris Samaras (Stony Brook University) Neural generative models
122 715 Bridging the Gap: Enhancing the Utility of Synthetic Data via Post-Processing Techniques Eugenio Lomurno (Politecnico di Milano)*; Andrea Lampis (Politecnico di Milano); Matteo Matteucci (Politecnico di Milano) Neural generative models
123 787 TD-GEM: Text-Driven Garment Editing Mapper Reza Dadfar (KTH); Sanaz Sabzevari (KTH University)*; Marten Bjorkman (KTH); Danica Kragic (KTH Royal Institute of Technology) Neural generative models
124 243 Class-Continuous Conditional Generative Neural Radiance Field Jiwook Kim (Chung-Ang University)*; Minhyeok Lee (Chung-Ang University) Neural generative models
125 22 Locality-Aware Hyperspectral Classification Fangqin Zhou (Technology University of Eindhoven); Mert Kilickaya (Eindhoven University of Technology)*; Joaquin Vanschoren (Eindhoven University of Technology) Photogrammetry and remote sensing
126 4 Instance Mask Growing on Leaf Chuang Yang (Northwestern Polytechnical University); Haozhao Ma (Northwestern Polytechnical University); Qi Wang (Northwestern Polytechnical University)* Recognition: Categorization and Instance recognition
127 135 Infinite Class Mixup Thomas Mensink (Google Research); Pascal Mettes (University of Amsterdam)* Recognition: Categorization and Instance recognition
128 375 Building A Mobile Text Recognizer via Truncated SVD-based Knowledge Distillation-Guided NAS Weifeng Lin (South China University of Technology); Canyu Xie (South China University of Technology); Dezhi Peng (South China University of Technology); Jiapeng Wang (South China University of Technology); Lianwen Jin (South China University of Technology)*; Wei Ding (Alibaba Group); Cong Yao (Alibaba DAMO Academy); Mengchao He (DAMO Academy, Alibaba Group) Recognition: Categorization and Instance recognition
129 119 Integrating Transient and Long-term Physical States for Depression Intelligent Diagnosis Ke Wu (Beihang University); Han Jiang (Beihang University)*; Li Kuang (Beihang University); Yixuan Wang (Beihang University); Huaiqian Ye (Beihang University); Yuanbo He (State Key Laboratory of Virtual Reality Technology and Systems, Beihang University) Recognition: Categorization and Instance recognition
130 320 Learning Unified Representations for Multi-Resolution Face Recognition Hulingxiao He (School of Automation,Beijing Institute of Technology); Wu Yuan (School of Computer Science,Beijing Institute of Technology)*; Yidian Huang (Beijing Institute of Technology); Shilong Zhao (Beijing Institute of Technology); Wen Yuan (State Key Laboratory of Resources and Environmental Information System, Institute of Geographic Sciences and Natural Resources Research, CAS); Hanqing Li (University of the Chinese Academy of Sciences) Recognition: Categorization and Instance recognition
131 102 ReCoT: Regularized Co-Training for Facial Action Unit Recognition with Noisy Labels Yifan Li (Michigan State University); Hu Han (Institute of Computing Technology, Chinese Academy of Sciences)*; Shiguang Shan (Institute of Computing Technology, Chinese Academy of Sciences); zhilong ji (Tomorrow Advancing Life); Jinfeng Bai (Tomorrow Advance Life); Xilin Chen (Institute of Computing Technology, Chinese Academy of Sciences) Recognition: Categorization and Instance recognition
Thursday 132 272 SMPLitex: A Generative Model and Dataset for 3D Human Texture Estimation from Single Image Dan Casas (Universidad Rey Juan Carlos)*; Marc Comino Trinidad (Universidad Rey Juan Carlos) 3D from a single image and shape-from-x
133 800 Mobile Vision Transformer-based Visual Object Tracking Goutam Yelluru Gopal (Concordia University)*; Maria Amer (Concordia University) Object pose estimation and tracking
134 444 Semi-Supervised Domain Generalization for Detection via Language-Guided Feature Alignment Sina Malakouti (University of Pittsburgh)*; Adriana Kovashka (University of Pittsburgh) Recognition: Detection
135 94 Likelihood-based Out-of-Distribution Detection with Denoising Diffusion Probabilistic Models Joseph S Goodier (University of Bath)*; Neill Campbell (University of Bath) Recognition: Detection
136 323 Point-to-RBox Network for Oriented Object Detection via Single Point Supervision Yucheng Wang (WuHan University)*; Chu He (Wuhan University); Xi Chen (Wuhan university) Recognition: Detection
137 310 Widely Applicable Strong Baseline for Sports Ball Detection and Tracking Shuhei Tarashima (NTT Communications Corporation)*; Norio Tagawa (Tokyo Metropolitan University); Muhammad Abdul Haq (Tokyo Metropolitan University); Wang Yushan (Tokyo Metropolitan University) Recognition: Detection
138 93 Open-Vocabulary Object Detection with Meta Prompt Representation and Instance Contrastive Optimization Zhao Wang (The Chinese University of Hong Kong)*; Aoxue Li (Noah's Ark Lab); Fengwei Zhou (Huawei Noah's Ark Lab); Zhenguo Li (Huawei Noah's Ark Lab); DOU QI (The Chinese University of Hong Kong) Recognition: Detection
139 707 SWIN-RIND: Edge Detection for Reflectance, Illumination, Normal and Depth Discontinuity with Swin Transformer LUN MIAO (The University of Tokyo)*; Takeshi Oishi (The University of Tokyo); Ryoichi Ishikawa (The university of Tokyo) Recognition: Detection
140 33 Scale Adaptive Network for Partial Person Re-identification: Counteracting Scale Variance HongYu Chen (Northwestern Polytechnical University)*; BingLiang Jiao (Northwestern Polytechnical University); Liying Gao (Northwestern Polytechnical University); Peng Wang (Northwestern Polytechnical University) Recognition: Retrieval
141 608 Object-Centric Open-Vocabulary Image-Retrieval with Sparse Features Hila Levi (General Motors)*; Guy Heller (General Motors); Dan Levi (General Motors); Ethan Fetaya (Bar Ilan University) Recognition: Retrieval
142 353 Adapting Self-Supervised Representations to Multi-Domain Setups Neha Kalibhat (University of Maryland - College Park)*; Sam Sharpe (Capital One); Jeremy Goodsitt (Capital One); C. Bayan Bruss (Capital One); Soheil Feizi (University of Maryland) Representation Learning
143 111 SeqCo-DETR: Sequence Consistency Training for Self-Supervised Object Detection with Transformers Guoqiang Jin (SenseTime Research)*; Fan Yang (Institute of Automation, Chinese Academy of Sciences); Mingshan Sun (SenseTime Research); Ruyi Zhao (Tongji University); Yakun Liu (SenseTime Research); Wei Li (SenseTime Research); Tianpeng Bao (SenseTime Research); Liwei Wu (SenseTime Research); Xingyu ZENG (SenseTime Group Limited); Rui Zhao (SenseTime Group Limited) Representation Learning
144 433 Variational Autoencoders with Decremental Information Bottleneck for Disentanglement Jiantao Wu (University of Surrey)*; Shentong Mo (Carnegie Mellon University); Xingshen Zhang (University of Jinan); Muhammad Awais (University of Surrey); Sara Ahmed (University of surrey); Zhenhua Feng (University of Surrey); Lin Wang (University of Jinan); Xiang Yang (Zhejiang Mingyi Technology Co., Ltd.) Representation Learning
145 351 Cross-domain Semantic Decoupling for Weakly-Supervised Semantic Segmentation Zaiquan Yang (Beihang University)*; Zhanghan Ke (City University of Hong Kong); Gerhard P. Hancke (City University of Hong Kong); Rynson W.H. Lau (City University of Hong Kong) Representation Learning
146 394 Unifying Synergies between Self-supervised Learning and Dynamic Computation Tarun Krishna (DCU)*; Ayush K. Rai (Dublin City University); Eric Arazo (Insight Centre for Data Analytics (DCU)); Paul Albert (Insight Centre for Data Analytics (DCU)); Alexandru F Drimbarean (Xperi); Alan Smeaton (Insight Centre for Data Analytics, Dublin City University); Kevin McGuinness (DCU); Noel O Connor (Home) Representation Learning
147 226 PanoMixSwap – Panorama Mixing via Structural Swapping for Indoor Scene Understanding Yu-Cheng Hsieh (National Tsing Hua University)*; Cheng Sun (National Tsing Hua University); Suraj Dengale (National Tsing Hua University); Min Sun (NTHU) Scene Analysis and Understanding
148 499 Clustered Saliency Prediction Rezvan Sherkati (McGill University)*; James J. Clark (McGill University) Scene Analysis and Understanding
149 77 One-stage Progressive Dichotomous Segmentation Jing Zhu (Samsung Research America)*; Karim Ahmed (Samsung Research America); Wenbo Li (Samsung Research America); Yilin Shen (Samsung Research America); Hongxia Jin (Samsung Research America) Segmentation, grouping and shape analysis
150 81 Towards Robust Few-shot Point Cloud Semantic Segmentation Yating Xu (National University of Singapore)*; Na Zhao (SUTD); Gim Hee Lee (National University of Singapore) Segmentation, grouping and shape analysis
151 815 Text and Click inputs for unambiguous open vocabulary instance segmentation Vighnesh N Birodkar (Google)*; Jonathan Huang (Google); Meera Hahn (Google); Irfan Essa (Georgia Institute of Technology); Nikolai Warner (Georgia Tech) Segmentation, grouping and shape analysis
152 868 Multi-Scale Cross Contrastive Learning for Semi-Supervised Medical Image Segmentation Qianying Liu (University of Glasgow)*; Xiao Gu (Imperial College London); Paul Henderson (University of Glasgow); Fani Deligianni (University of Glasgow) Segmentation, grouping and shape analysis
153 623 Superpixel Positional Encoding to Improve ViT-based Semantic Segmentation Models Roberto Amoroso (University of Modena and Reggio Emilia)*; Matteo Tomei (Prometeia); Lorenzo Baraldi (University of Modena and Reggio Emilia); Rita Cucchiara (Università di Modena e Reggio Emilia) Segmentation, grouping and shape analysis
154 767 Label-guided Real-time Fusion Network for RGB-T Semantic Segmentation Zengrong Lin (Sun Yat-sen University); Baihong Lin (University of Electronic Science and Technology of China)*; Yulan Guo (Sun Yat-sen University) Segmentation, grouping and shape analysis
155 523 SHLS: Superfeatures learned from still images for self-supervised VOS Marcelo M Santos (UFBA)*; Jefferson Fontinele da Silva (University Federal of Maranhão); Luciano Oliveira (UFBA) Segmentation, grouping and shape analysis
156 530 AutoSAM: Adapting SAM to Medical Images by Overloading the Prompt Encoder Tal Shaharbany (Tel Aviv University)*; Aviad Dahan (Tel Aviv University); Raja Giryes (Tel Aviv University); Lior Wolf (Tel Aviv University, Israel) Segmentation, grouping and shape analysis
157 719 EyeGuide - From Gaze Data to Instance Segmentation Jacqueline Kockwelp (University of Münster); Joerg Gromoll (CeRA); Joachim Wistuba (Centre of Reproductive Medicine and Andrology); Benjamin Risse (University of Münster)* Segmentation, grouping and shape analysis
158 908 Class-Imbalanced Semi-Supervised Learning with Inverse Auxiliary Classifier Tiansong Jiang (Nanjing University of Science and Technology)*; Sheng Wan (Nanjing university of science and technology); Chen Gong (Nanjing University of Science and Technology) Self-, semi-, meta-, unsupervised learning
159 899 C3: Cross-instance guided Contrastive Clustering Mohammadreza Sadeghi (McGill University); Hadi Hojjati (McGill University); Narges Armanfard (McGill University; Mila - Quebec AI Institute)* Self-, semi-, meta-, unsupervised learning
160 676 BFC-BL: Few-Shot Classification and Segmentation combining Bi-directional Feature Correlation and Boundary constraint Haibiao Yang (Guangdong University of Technology)*; Zeng Bi (Guangdong University of Technology); Pengfei Wei (Guangdong University of Technology); Jianqi Liu (Guangdong University of Technology) Self-, semi-, meta-, unsupervised learning
161 259 Prototype-Aware Contrastive Knowledge Distillation for Few-Shot Anomaly Detection Zhihao Gu (Shanghai Jiao Tong University)*; Taihai Yang (East China Normal University); Lizhuang Ma (Shanghai Jiao Tong University) Self-, semi-, meta-, unsupervised learning
162 837 Domain-Adaptive Semantic Segmentation with Memory-Efficient Cross-Domain Transformers Ruben Mascaro (ETH Zurich)*; Lucas Teixeira (ETH Zurich); Margarita Chli (ETH Zurich) Self-, semi-, meta-, unsupervised learning
163 117 Detect, Augment, Compose, and Adapt: Four Steps for Unsupervised Domain Adaptation in Object Detection Mohamed Lamine Mekhalfi (Fondazione Bruno Kessler)*; Davide Boscaini (Fondazione Bruno Kessler); Fabio Poiesi (Fondazione Bruno Kessler) Self-, semi-, meta-, unsupervised learning
164 215 Hierarchical Quantization Consistency for Fully Unsupervised Image Retrieval Guile Wu (Noah’s Ark Lab); Chao Zhang (Toshiba Europe Limited)*; Stephan Liwicki (Toshiba Europe Limited) Self-, semi-, meta-, unsupervised learning
165 297 Exploring the Limits of Deep Image Clustering using Pretrained Models Nikolas Adaloglou (HHU)*; Felix Michels (HHU); Hamza Kalisch (HHU); Markus Kollmann (HHU) Self-, semi-, meta-, unsupervised learning
166 471 Enhancing Interpretable Object Abstraction via Clustering-based Slot Initialization Ning Gao (Bosch Center for Artificial Intelligence (BCAI))*; Bernard Hohmann (Karlsruhe Institute of Technology); Gerhard Neumann (Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany) Self-, semi-, meta-, unsupervised learning
167 240 StereoFlowGAN: Co-training for Stereo and Flow with Unsupervised Domain Adaptation Zhexiao Xiong (Washington University in St. Louis)*; Feng Qiao (RWTH Aachen University); Yu Zhang (Bastian Solutions); Nathan Jacobs (Washington University in St. Louis) Self-, semi-, meta-, unsupervised learning
168 633 Multi-Target Domain Adaptation with Class-Wise Attribute Transfer in Semantic Segmentation Changjae Kim (DGIST); Seunghun Lee (DGIST)*; Sunghoon Im (DGIST) Transfer, low-shot, continual, long-tail learning
169 858 Weakly-supervised Spatially Grounded Concept Learner for Few-Shot Learning Gaurav Bhatt (The University of British Columbia)*; Deepayan Das (IIT-H); Leonid Sigal (University of British Columbia); Vineeth N Balasubramanian (Indian Institute of Technology, Hyderabad) Transfer, low-shot, continual, long-tail learning
170 12 RestNet: Boosting Cross-Domain Few-Shot Segmentation with Residual Transformation Network Xinyang Huang (Beijing University of Posts and Telecommunications)*; Chuang Zhu (Beijing University of Posts and Telecommunications); Wenkai Chen (Beijing University of Posts and Telecommunications) Transfer, low-shot, continual, long-tail learning
171 18 Random Word Data Augmentation with CLIP for Zero-Shot Anomaly Detection Masato Tamura (Hitachi America, Ltd.)* Transfer, low-shot, continual, long-tail learning
172 202 Few-Shot Anomaly Detection with Adversarial Loss for Robust Feature Representations Jae Young Lee (KAIST)*; Wonjun Lee (University of Science and Technology); Jaehyun Choi (KAIST); Yongkwi LEE (ETRI); Young Seog Yoon (Electronics and Telecommunications Research Institute) Transfer, low-shot, continual, long-tail learning
173 330 Fine-grained Few-shot Recognition by Deep Object Parsing Ruizhao Zhu (Boston University)*; Pengkai Zhu (Amazon Web Services); Samarth Mishra (Boston University); Venkatesh Saligrama (Boston University) Transfer, low-shot, continual, long-tail learning
174 762 Novel Regularization via Logit Weight Repulsion for Long-Tailed Classification Taegil Ha (Seoul National University)*; Seulki Park (Seoul National University); Jin Young Choi (Seoul National University) Transfer, low-shot, continual, long-tail learning
175 292 Generating Pseudo-labels Adaptively for Few-shot Model-Agnostic Meta-Learning Guodong Liu (Huazhong University of Science and Technology); Tongling Wang (Huazhong University of Science and Technology); Shuoxi Zhang (Huazhong University of Science and Technology); Kun He (Huazhong University of Science and Technology)* Transfer, low-shot, continual, long-tail learning
176 452 Domain-Aware Augmentations for Unsupervised Online General Continual Learning Nicolas Michel (LIGM)* Transfer, low-shot, continual, long-tail learning
177 534 Dual Feature Augmentation Network for Generalized Zero-shot Learning Lei Xiang (Nanjing University of Information Science and Technology)*; Yuan Zhou (Nanjing University of Information Science and Technology); Haoran Duan (Durham University); Yang Long (Durham University) Transfer, low-shot, continual, long-tail learning
178 264 Predictive Consistency Learning for Long-Tailed Recognition Nan Kang (Key Laboratory of Intelligent Information Processing of Chinese Academy of Sciences (CAS))*; Hong Chang (Chinese Academy of Sciences); Bingpeng MA (University of Chinese Academy of Sciences); Shutao Bai (Institute of Computing Technology, Chinese Academy of Sciences); Shiguang Shan (Institute of Computing Technology, Chinese Academy of Sciences); Xilin Chen (Institute of Computing Technology, Chinese Academy of Sciences) Transfer, low-shot, continual, long-tail learning
179 542 Temporal-aware Hierarchical Mask Classification for Video Semantic Segmentation Zhaochong An (ETH Zurich); Guolei Sun (ETH Zurich)*; Zongwei WU (Univ. Bourgogne Franche-Comte, France); Hao Tang (ETH Zurich); Luc Van Gool (ETH Zurich) Video analysis and Understanding
180 25 Motion and Context-Aware Audio-Visual Conditioned Video Prediction Yating Xu (National University of Singapore)*; Conghui Hu (National University of Singapore); Gim Hee Lee (National University of Singapore) Vision and audio
181 144 Dual Attention for Audio-Visual Speech Enhancement with Facial Cues Fexiang Wang (ICT, UCAS)*; Shuang Yang (ICT, CAS); Shiguang Shan (Institute of Computing Technology, Chinese Academy of Sciences); Xilin Chen (Institute of Computing Technology, Chinese Academy of Sciences) Vision and audio
182 367 How Can Contrastive Pre-training Benefit Audio-Visual Segmentation? A Study from Supervised and Zero-shot Perspectives Jiarui Yu (USTC)*; Haoran Li (University of Science and Technology of China); Yanbin Hao (University of Science and Technology of China); Wu Jinmeng (Wuhan Institute of Technology); Tong Xu (University of Science and Technology of China); Shuo Wang (University of Science and Technology of China); Xiangnan He (University of Science and Technology of China) Vision and audio
183 139 Continuous Levels of Detail for Light Field Networks David Li (University of Maryland College Park)*; Brandon Yushan Feng (University of Maryland, College Park); Amitabh Varshney (University of Maryland) Vision and graphics
184 347 SRNet: Striped Pyramid Pooling and Relational Transformer for Retinal Vessel Segmentation Wei Yan (College of Computer Science and Engineering, Northwest Normal University)*; Yun Jiang (College of Computer Science and Engineering, Northwest Normal University); Zequn Zhang (Northwest Normal University); Yao Yan (College of Computer Science and Engineering, Northwest Normal University); Bingxi Liu (Northwest Normal University) Vision and graphics
185 451 Complex Scene Image Editing by Scene Graph Comprehension Zhongping Zhang (Boston University)*; Huiwen He (Boston University); Bryan Plummer (Boston University); Zhenyu Liao (Kwai Inc); Huayan Wang (Kuaishou Technology) Vision and language
186 314 GOPro: Generate and Optimize Prompts in CLIP using Self-Supervised Learning Mainak Singha (Indian Institute of Technology Bombay)*; Ankit Jha (Indian Institute of Technology Bombay); Biplab Banerjee (Indian Institute of Technology, Bombay) Vision and language
187 182 BDC-Adapter: Brownian Distance Covariance for Better Vision-Language Reasoning Yi Zhang (Southern University of Science and Technology); Ce Zhang (Carnegie Mellon University); Zihan Liao (Southern University of Science and Technology); Yushun Tang (Southern University of Science and Technology); Zhihai He (Southern University of Science and Technology)* Vision and language
188 510 Open-world Text-specified Object Counting Niki Amini-Naieni (University of Oxford)*; Kiana Amini-Naieni (University of California, Davis); Tengda Han (University of Oxford); Andrew Zisserman (University of Oxford) Vision and language
189 650 Towards Debiasing Frame Length Bias in Text-Video Retrieval via Causal Intervention Burak Satar (Nanyang Technological University)*; Hongyuan Zhu (Institute for Infocomm, Research Agency for Science, Technology and Research (A*STAR) Singapore); Hanwang Zhang (Nanyang Technological University); Joo-Hwee Lim (Institute for Infocomm Research) Vision and language
190 229 Weakly-Supervised Visual-Textual Grounding with Semantic Prior Refinement Davide Rigoni (University of Padua); Luca Parolari (University of Padova); Luciano Serafini (Fondazione Bruno Kessler); Alessandro Sperduti (Università di Padova (IT)); Lamberto Ballan (University of Padova)* Vision and language
191 596 Generating Context-Aware Natural Answers for Questions in 3D Scenes Mohammed Munzer Dwedari (Technical University of Munich)*; Matthias Niessner (Technical University of Munich); Zhenyu Chen (Technical University of Munich) Vision and language
192 378 Neural Feature Filtering for Faster Structure-from-Motion Localisation Alexandros Rotsidis (University of Bath)*; Yuxin Wang (École polytechnique fédérale de Lausanne); Yiorgos Chrysanthou (CYENS Centre of Excellence); Christian Richardt (Meta) Vision and robotics
193 647 Dictionary-Guided Text Recognition for Smart Street Parking Deyang Zhong (University of Washington Tacoma); Jiayu Li (University of Washington); Wei Cheng (University of Washington); Juhua Hu (University of Washington)* Vision applications and systems
194 300 Contrastive Consistent Representation Distillation Shipeng Fu (Sichuan University)*; Haoran Yang (Sichuan University); Xiaomin Yang (Sichuan University) Vision applications and systems
195 322 3D Structure-guided Network for Tooth Alignment in 2D Photograph Yulong Dou (Shanghaitech)*; Lanzhuju Mei (ShanghaiTech University); Zhiming Cui (HKU); Dinggang Shen (United Imaging Intelligence) Vision applications and systems
196 376 Adapting Generic Features to A Specific Task: A Large Discrepancy Knowledge Distillation for Image Anomaly Detection Chenkai Zhang (Zhejiang University)*; Tianqi Du (Zhejiang University); Yueming Wang (Zhejiang University) Vision applications and systems
197 385 Personalized Fashion Recommendation via Deep Personality Learning Dongmei Mo (The Hong Kong Polytechnic University)*; Xingxing Zou (Laboratory for Artificial Intelligence in Design, The Hong Kong Polytechnic University); Waikeung Wong (Institute of Textiles and Clothing, The Hong Kong Polytechnic University) Vision applications and systems
198 480 Comprehensive Quantitative Quality Assessment of Thermal Cut Sheet Edges using Convolutional Neural Networks Janek Stahl (Fraunhofer IPA)*; Marco Huber (University of Stuttgart); Andreas Frommknecht (Fraunhofer IPA) Vision applications and systems
199 614 FRE: A Fast Method For Anomaly Detection And Segmentation Ibrahima Ndiour (Intel)*; Ergin U Genc (Intel); Nilesh A Ahuja (Intel); Omesh Tickoo (Intel) Vision applications and systems
200 161 Long Story Short: a Summarize-then-Search Method for Prompt-Based Long Video Question Answering Jiwan Chung (Yonsei University)*; Youngjae Yu (Yonsei University) Visual reasoning and logical representation