Day |
Number |
ID |
Title |
Authors |
Topic |
Tuesday |
1 |
194 |
mmPoint: Dense Human
Point Cloud Generation from mmWave |
Qian Xie (University
of Oxford)*; Qianyi Deng (University of Oxford); Ta-Ying Cheng (University of
Oxford); Peijun Zhao (Massachusetts Institute of Technology); Amir Patel
(University of Cape Town); Niki Trigoni (University of Oxford); Andrew
Markham (University of Oxford) |
3D from a single
image and shape-from-x |
2 |
356 |
Lightweight
Self-Supervised Depth Estimation with few-beams LiDAR Data |
Rizhao Fan
(University of Bologna)*; Fabio Tosi (University of Bologna); Matteo Poggi
(University of Bologna); Stefano Mattoccia (University of Bologna) |
3D from a single
image and shape-from-x |
3 |
174 |
Sparse Multi-Object
Render-and-Compare |
Florian Maximilian
Langer (Department of Engineering, University of Cambridge)*; Ignas Budvytis
(Department of Engineering, University of Cambridge); Roberto Cipolla
(University of Cambridge) |
3D from a single
image and shape-from-x |
4 |
90 |
Floorplan Restoration
by Structure Hallucinating Transformer Cascades |
Sepidehsadat Hosseini
(Simon Fraser University)*; Yasutaka Furukawa (Simon Fraser University) |
3D from multi-view
and sensors |
5 |
89 |
Strong Stereo
Features for Self-Supervised Practical Stereo Matching |
Pierre-André
Brousseau (Université de Montréal)*; Sebastien Roy (Université de Montréal) |
3D from multi-view
and sensors |
6 |
501 |
Temporal Lidar Depth
Completion |
Pietari Kaskela
(NVIDIA)*; Philipp Fischer (NVIDIA); Timo Roman (NVIDIA) |
3D from multi-view
and sensors |
7 |
15 |
The Interstate-24 3D
Dataset: a new benchmark for 3D multi-camera vehicle tracking |
Derek Gloudemans
(Vanderbilt University)*; Daniel Work (Vanderbilt University); Yanbing Wang
(Vanderbilt University); Gracie E Gumm (Vanderbilt University); William
Barbour (Vanderbilt University) |
3D from multi-view
and sensors |
8 |
448 |
Optimal Camera
Configuration for Large-Scale Motion Capture Systems |
Xiongming Dai
(Louisiana State University)*; Gerald Baumgartner (Louisiana State
University) |
3D from multi-view
and sensors |
9 |
682 |
ManifoldNeRF:
View-dependent Image Feature Supervision for Few-shot Neural Radiance Fields |
Daiju Kanaoka (Kyushu
Institute of Technology)*; Motoharu Sonogashira (RIKEN); Hakaru Tamukoh
(Kyushu Institute of Technology); Yasutomo Kawanishi (RIKEN) |
3D from multi-view
and sensors |
10 |
741 |
Motion-Bias-Free
Feature-Based SLAM |
Alejandro Fontan
(Queensland University of Technology)*; Michael Milford (ACRV and QUT,
Australia); Javier Civera (Universidad de Zaragoza) |
3D from multi-view
and sensors |
11 |
825 |
RoomNeRF:
Representing Empty Room as Neural Radiance Fields for View Synthesis |
Mangyu Kong (Yonsei
University)*; Seongwon Lee (Yonsei University); Euntai Kim (Yonsei
University) |
3D from multi-view
and sensors |
12 |
304 |
Learning Part Motion
of Articulated Objects Using Spatially Continuous Neural Implicit
Representations |
Yushi Du (Peking
University)*; Ruihai Wu (Peking University); Yan Shen (Peking University);
Hao Dong (Peking University) |
3D Shape modeling and
processing |
13 |
231 |
Propose-and-Complete:
Auto-regressive Semantic Group Generation for Personalized Scene Synthesis |
Shoulong Zhang
(Beihang University); Shuai Li (Beihang University); Xinwei Huang (Beihang
University); Wenchong Xu (Beihang University); Aimin Hao (Beihang University);
Hong Qin (Stony Brook University)* |
3D Shape modeling and
processing |
14 |
306 |
Point Cloud Sampling
Preserving Local Geometry for Surface Reconstruction |
Kohei Matsuzaki (KDDI
Research, Inc.)*; Keisuke Nonaka (KDDI Research, Inc.) |
3D Shape modeling and
processing |
15 |
417 |
Deformation-Guided
Unsupervised Non-Rigid Shape Matching |
Aymen Merrouche
(Inria)*; Joao Pedro Cova Regateiro (Interdigital); Stefanie Wuhrer (Inria);
Edmond Boyer (Inria) |
3D Shape modeling and
processing |
16 |
820 |
Proposal-based
Temporal Action Localization with Point-level Supervision |
Yuan Yin (Institute
of Industrial Science, The University of Tokyo)*; Yifei Huang (The University
of Tokyo); Ryosuke Furuta (The University of Tokyo); Yoichi Sato (University
of Tokyo) |
Action and event
understanding |
17 |
335 |
Supervised
Contrastive Learning with Identity-Label Embeddings for Facial Action Unit
Recognition |
Tangzheng Lian
(Nottingham Trent University); David A Adama (Nottingham Trent University);
Pedro Machado (Nottingham Trent University); Doratha E Vinkemeier (NTU)* |
Action and event
understanding |
18 |
332 |
Learning Temporal
Sentence Grounding From Narrated EgoVideos |
Kevin Flanagan
(University of Bristol)*; Dima Damen (University of Bristol); Michael Wray
(University of Bristol) |
Action and event
understanding |
19 |
739 |
Robust Principles:
Architectural Design Principles for Adversarially Robust CNNs |
ShengYun Peng
(Georgia Institute of Technology)*; Weilin Xu (Intel); Cory Cornelius (Intel
Corporation); Matthew Hull (Georgia Institute of Technology); Kevin Li
(Georgia Institute of Technology); Rahul Duggal (Georgia Tech); Mansi Phute
(Georgia Institute of Technology); Jason Martin (Intel Corporation); Duen
Horng Chau (Georgia Institute of Technology) |
Adversarial attack
and defense |
20 |
172 |
Backdoor Attack on
Hash-based Image Retrieval via Clean-label Data Poisoning |
Kuofeng Gao (Tsinghua
University)*; Jiawang Bai (Tsinghua University); Bin Chen (Harbin Institute
of Technology, Shenzhen); Dongxian Wu (the University of Tokyo); Shu-Tao Xia
(Tsinghua University) |
Adversarial attack
and defense |
21 |
406 |
Exploring
Non-additive Randomness on ViT against Query-Based Black-Box Attacks |
Jindong Gu
(University of Oxford)*; Fangyun Wei (Microsoft Research Asia); Philip Torr
(University of Oxford); Han Hu (Microsoft Research Asia) |
Adversarial attack
and defense |
22 |
271 |
Semantic Adversarial
Attacks via Diffusion Models |
Chenan Wang (Drexel
University)*; Jinhao Duan (Drexel University); Chaowei Xiao (ASU); Edward Kim
(Drexel University); Matthew C Stamm (Drexel University); Kaidi Xu (Drexel
University) |
Adversarial attack
and defense |
23 |
296 |
RBFormer: Robust Bias
Can Improve the Adversarial Robust of Transformer-based Structure |
Hao Cheng (The Hong
Kong University of Science and Technology (Guangzhou))*; Jinhao Duan (Drexel
University); Hui Li (Samsung Research and Development Institute China Xi'an);
Lyutianyang Zhang (University of Washington); Jiahang Cao (The Hong Kong
University of Science and Technology (Guangzhou)); Ping Wang (Xi'an Jiaotong
University); Jize Zhang (HKUST); Kaidi Xu (Drexel University); Renjing Xu
(The Hong Kong University of Science and Technology (Guangzhou)) |
Adversarial attack
and defense |
24 |
486 |
ADoPT: LiDAR Spoofing
Attack Detection based on Point-Level Temporal Consistency |
Minkyoung Cho
(University of Michigan)*; Yulong Cao (Nvidia); Zixiang Zhou (University of
Michigan); Zhuoqing Morley Mao (University of Michigan) |
Adversarial attack
and defense |
25 |
620 |
Unifying the Harmonic
Analysis of Adversarial Attacks and Robustness |
Shishira R R Maiya
(University of Maryland)*; Max Ehrlich (NVIDIA); Vatsal Agarwal (University
of Maryland); Ser-Nam Lim (Meta AI); Tom Goldstein (University of Maryland);
Abhinav Shrivastava (University of Maryland) |
Adversarial attack
and defense |
26 |
781 |
Adaptive Adversarial
Norm Space for Efficient Adversarial Training |
Hui Kuurila-Zhang
(University of Oulu)*; Haoyu Chen (University of Oulu); Guoying Zhao
(University of Oulu) |
Adversarial attack
and defense |
27 |
382 |
Fully Quantum
Auto-Encoding of 3D Shapes |
Lakshika Rathi
(Indian Institute of Technology Delhi); Edith Tretschk (Max-Planck-Institut
für Informatik)*; Christian Theobalt (MPI Informatik); Rishabh Dabral (IIT
Bombay); Vladislav Golyanik (MPI for Informatics) |
Brave new ideas |
28 |
677 |
MFSC: Matching by
Few-Shot Classification |
Daniel Shalam
(University of Haifa); Elie Abboud (University of Haifa); Roee Litman (-);
Simon Korman (University of Haifa)* |
Brave new ideas |
29 |
822 |
Differentiable SLAM
Helps Deep Learning-based LiDAR Perception Tasks |
Prashant Kumar
(Indian Institute of Technology, Delhi)*; Dheeraj Vattikonda (McGill
University); Vedang Bhupesh Shenvi Nadkarni (Birla Institute of Technology
and Science, Pilani); Erqun Dong (McGill University); Sabyasachi Sahoo
(Université Laval, Mila) |
Brave new ideas |
30 |
643 |
Color Constancy: How
to Deal with Camera Bias? |
Yi-Tun Lin
(University of East Anglia)*; Bianjiang Yang (Purdue University); Hao Xie
(Meta Platforms, Inc.); Wenbin Wang (Meta); Honghong Peng (Meta); Jun Hu
(Apple Inc) |
Computational
Photography |
31 |
348 |
RGB and LUT based
Cross Attention Network for Image Enhancement |
Tengfei Shi (Beihang
University); Chenglizhao Chen (China University of Petroleum (East China))*;
Yuanbo He (State Key Laboratory of Virtual Reality Technology and Systems,
Beihang University); Wenfeng Song (Beijing Information Science and Technology
University); Aimin Hao (Beihang University) |
Computational
Photography |
32 |
315 |
Generalized Imaging
Augmentation via Linear Optimization of Neurons |
Daoyu Li (Beijing
Institute of Technology); Lu Li (Beijing Institute of Technology); Bin Li
(Beijing University of Posts and Telecommunications); Liheng Bian (Beijing
Institute of Technology)* |
Computational
Photography |
33 |
765 |
Reconstructing
Synthetic Lensless Images in the Low-Data Regime |
Abeer Banerjee
(CSIR-CEERI)*; Himanshu Kumar (CSIR-CEERI); Sumeet Saurav (CSIR-CEERI);
Sanjay Singh (CSIR-CEERI, Pilani) |
Computational
Photography |
34 |
286 |
Lightweight Image
Super-Resolution with Scale-wise Network |
Xiaole Zhao (School
of Computing and Artificial Intelligence, Southwest Jiaotong University);
Xinkun Wu (School of Computing and Artificial Intelligence, Southwest
Jiaotong University)* |
Computer vision
theory |
35 |
540 |
Sketch-based Video
Object Segmentation: Benchmark and Analysis |
Ruolin Yang (Beijing
University of Posts and Telecommunications)*; Da Li (Samsung); Conghui Hu
(National University of Singapore); Timothy Hospedales (Edinburgh
University); Honggang Zhang (Beijing University of Posts and
Telecommunications); Yi-Zhe Song (University of Surrey) |
Datasets and
Evaluation |
36 |
870 |
Data exploitation:
multi-task learning of object detection and semantic segmentation on
partially annotated data |
Hoàng-Ân Lê (IRISA,
University of South Brittany)*; Minh-Tan Pham (IRISA-UBS) |
Datasets and
Evaluation |
37 |
235 |
What Should be
Balanced in a “Balanced” Dataset? |
Haiyu Wu (University
of Notre Dame)*; Kevin Bowyer (University of Notre Dame) |
Datasets and
Evaluation |
38 |
127 |
SynthBlink and
BlinkFormer: A Synthetic Dataset and Transformer-Based Method for Video Blink
Detection |
Bo Liu (Beihang
University); Yang Xu (Beihang University); Feng Lu (Beihang University)* |
Datasets and
Evaluation |
39 |
743 |
A Comprehensive
Crossroad Camera Dataset to Improve Traffic Safety of Mobility Aid Users |
Ludwig Mohr
(Institute of Computer Graphics and Vision, Graz University of Technology)*;
Nadezda Kirillova (Graz University of Technology); Horst Possegger (Graz
University of Technology); Horst Bischof (Graz University of Technology) |
Datasets and
Evaluation |
40 |
752 |
Learnable Data
Augmentation for One-Shot Unsupervised Domain Adaptation |
Julio Ivan Davila
Carrazco (Istituto Italiano di Tecnologia)*; Pietro Morerio (Istituto
Italiano di Tecnologia); Alessio Del Bue (Istituto Italiano di Tecnologia
(IIT)); Vittorio Murino (Istituto Italiano di Tecnologia) |
Deep learning
architectures and techniques |
41 |
660 |
G2N2: Lightweight
Event Stream Classification with GRU Graph Neural Networks |
Thomas Mesquida (CEA
LIST)*; Manon Dampfhoffer (SPINTEC University Grenoble Alpes); Thomas Dalgaty
(CEA List); Pascal Vivet (CEA-LIST); Amos Sironi (PROPHESEE); Christoph Posch
(PROPHESEE) |
Deep learning
architectures and techniques |
42 |
709 |
Momentum Adapt:
Robust Unsupervised Adaptation for Improving Temporal Consistency in Video
Semantic Segmentation During Test-Time |
Amirhossein
Hassankhani (Tampere University)*; Hamed Rezazadegan Tavakoli (Nokia
Technologies); Esa Rahtu (Tampere University) |
Deep learning
architectures and techniques |
43 |
123 |
Knowledge
Distillation Layer that Lets the Student Decide |
Ada Gorgun (Middle
East Technical University)*; Yeti Z. Gurbuz (Technische Universität Berlin);
Aydin Alatan (Middle East Technical University, Turkey) |
Deep learning
architectures and techniques |
44 |
237 |
Spatio-Temporal
MLP-Graph Network for 3D Human Pose Estimation |
Md. Tanvir Hassan
(Concordia University); Abdessamad Ben Hamza (Concordia University)* |
Deep learning
architectures and techniques |
45 |
360 |
FLRKD: Relational
Knowledge Distillation Based on Channel-wise Feature Quality Assessment |
Zeyu An (University
of Electronic Science and Technology of China)*; Changjian Deng (University
of Electronic Science and Technology of China); Wanli Dang (University of
Electronic Science and Technology of China; The Second Research Institute of
the Civil Aviation Administration of China); Zhicheng Dong (Tibet
University); Qian Luo (The Second Research Institute of the Civil Aviation
Administration of China); Jian Cheng (University of Electronic
Science and Technology of China) |
Deep learning
architectures and techniques |
46 |
792 |
Budding Ensemble
Architecture: Revisiting anchor-based object detection DNN |
Qutub Syed (INTEL
LABS)*; Neslihan Kose Cihangir (Intel Deutschland GmbH); Rafael Rosales
(Intel); Michael Paulitsch (Intel); Korbinian Hagn (Intel); Florian R
Geissler (Intel); Yang Peng (Intel); Gereon Hinz (STTech GmbH); Alois C.
Knoll (Robotics and Embedded Systems) |
Deep learning
architectures and techniques |
47 |
893 |
VADOR: Real World
Video Anomaly Detection with Object Relations and Action |
Halil İbrahim Öztürk
(Togg)*; Ahmet Burak Can (Hacettepe University) |
Deep learning
architectures and techniques |
48 |
911 |
Masked Attention
ConvNeXt Unet with Multi-Synthetic Dynamic Weighting for Anomaly Detection
and Localization |
Shih Chih Lin
(National Tsing Hua University)*; Ho Weng Lee (National Tsing Hua
University); Yu-Shuan Hsieh (National Tsing Hua University); Cheng Yu Ho
(National Tsing Hua University); Shang-Hong Lai (National Tsing Hua
University) |
Deep learning
architectures and techniques |
49 |
204 |
Cardiac Landmark
Detection using Generative Adversarial Networks from Cardiac MR Images |
Aparna Kanakatte
(TCS)*; Divya M Bhatia (TCS); Pavan Kumar Reddy K (TCS Research);
Jayavardhana Gubbi (TCS Research); Avik Ghose (TCS) |
Deep learning
architectures and techniques |
50 |
514 |
DFFG: Fast Gradient
Iteration for Data-free Quantization |
Huixing Leng (Beihang
University); Shuangkang Fang (Megvii, BUAA); Yufeng Wang (Beihang
University)*; Zehao Zhang (Beihang University); Qi Dacheng (Beijing Jiaotong
University); Wenrui Ding (Beihang University) |
Deep learning
architectures and techniques |
51 |
522 |
Train ViT on Small
Dataset With Translation Perceptibility |
Chen Huan (Institute
of Computing Technology)*; Ping Yao (Institute of Computing Technology,
Chinese Academy of Sciences); Wentao Wei (Southeast University) |
Deep learning
architectures and techniques |
52 |
665 |
Distillation for
High-Quality Knowledge Extraction via Explainable Oracle Approach |
MyungHak Lee (Kookmin
University)*; Wooseong Syz Cho (Kookmin University); Sungsik Kim (Kookmin
University); Jinkyu Kim (Korea University); Jaekoo Lee (Kookmin University) |
Deep learning
architectures and techniques |
53 |
846 |
Topology-Preserving
Hard Pixel Mining for Tubular Structure Segmentation |
Guoqing Zhang
(Tsinghua-Berkeley Shenzhen Institute, Tsinghua University)*; Caixia Dong
(The Second Affiliated Hospital of Xi'an Jiaotong University); Yang Li
(Tsinghua-Berkeley Shenzhen Institute, Tsinghua University) |
Deep learning
architectures and techniques |
54 |
854 |
A Forward-backward
Learning strategy for CNNs via Separation Index Maximizing at the First
Convolutional Layer |
Ali Karimi
(University of Tehran); Ahmad Kalhor (University of Tehran)*; Mona Ahmadian
(University of Surrey) |
Deep learning
architectures and techniques |
55 |
214 |
Understanding
Gaussian Attention Bias of Vision Transformers Using Effective Receptive
Fields |
Bum Jun Kim
(POSTECH); Hyeyeon Choi (POSTECH); Hyeonah Jang (POSTECH); Sang Woo Kim
(POSTECH)* |
Deep learning
architectures and techniques |
56 |
295 |
LOCATE:
Self-supervised Object Discovery via Flow-guided Graph-cut and Bootstrapped
Self-training |
Silky Singh (Adobe
Systems)*; Shripad V Deshmukh (Adobe); Mausoom Sarkar (Adobe); Balaji
Krishnamurthy () |
Deep learning
architectures and techniques |
57 |
562 |
Fiducial Focus
Augmentation for Facial Landmark Detection |
Purbayan Kar (Sony
Research India); Vishal M Chudasama (Sony Research India); Naoyuki Onoe
(Sony); Pankaj Wasnik (Sony Research India)*; Vineeth Balasubramanian (Indian
Institute of Technology Hyderabad) |
Deep learning
architectures and techniques |
58 |
268 |
Lips-SpecFormer:
Non-Linear Interpolable Transformer for Spectral Reconstruction using
Adjacent Channel Coupling |
Abhishek Kumar Sinha
(Indian Space Research Organization)*; Manthira Moorthi S (ISRO) |
Deep learning
architectures and techniques |
59 |
521 |
Selective Scene Text
Removal |
Hayato Mitani (Kyushu
University)*; Akisato Kimura (NTT Corporation); Seiichi Uchida (Kyushu
University) |
Document analysis and
understanding |
60 |
7 |
HWD: A Novel
Evaluation Score for Styled Handwritten Text Generation |
Vittorio Pippi
(University of Modena and Reggio Emilia)*; Fabio Quattrini (University of
Modena and Reggio Emilia); Silvia Cascianelli (Università di Modena e Reggio
Emilia); Rita Cucchiara (Università di Modena e Reggio Emilia) |
Document analysis and
understanding |
61 |
511 |
McQueen: Mixed
Precision Quantization of Early Exit Networks |
Utkarsh Saxena
(Purdue University)*; Kaushik Roy (Purdue University) |
Efficient and
scalable vision |
62 |
345 |
Region-aware
Knowledge Distillation for Efficient Image-to-Image Translation |
Linfeng Zhang
(Tsinghua University)*; Xin Chen (Intel Corp.); Runpei Dong (Xi'an Jiaotong
University); Kaisheng Ma (Tsinghua University) |
Efficient and
scalable vision |
63 |
744 |
CoordGate:
Efficiently Computing Spatially-Varying Convolutions in Convolutional Neural
Networks |
Sunny Howard
(University of Oxford)*; Peter Norreys (University of Oxford); Andreas Döpp
(LMU Munich) |
Efficient and
scalable vision |
64 |
307 |
RUPQ: Improving
low-bit quantization by equalizing relative updates of quantization
parameters |
Valentin Buchnev
(Huawei Technologies Co. Ltd.)*; Jiao He (Huawei); Fengyu Sun
(Huawei); Ivan Koryakovskiy (Huawei Technologies Co., Ltd.) |
Efficient and
scalable vision |
65 |
889 |
DeepliteRT: Computer
Vision at the Edge |
Saad Ashfaq
(Deeplite)*; Alexander Hoffman (McGill University); Saptarshi Mitra (Deeplite
Inc.); Ehsan Saboori (Deeplite Inc.); Sudhakar Sah (Deeplite Inc);
MohammadHossein AskariHemmat (Polytechnique Montreal) |
Efficient and
scalable vision |
Wednesday |
66 |
622 |
Teaching AI to Teach:
Leveraging Limited Human Salience Data Into Unlimited Saliency-Based Training |
Colton R Crum
(University of Notre Dame)*; Aidan Boyd (University of Notre Dame); Kevin
Bowyer (University of Notre Dame); Adam Czajka (University of Notre Dame) |
Biometrics |
67 |
498 |
CERiL: Continuous
Event-based Reinforcement Learning |
Celyn Walters
(University of Surrey); Simon Hadfield (University of Surrey)* |
Embodied vision:
Active agents; simulation |
68 |
703 |
Foveation in the Era
of Deep Learning |
Gerardo
Aragon-Camarasa (University of Glasgow); George W Killick (University of
Glasgow)*; Paul Henderson (University of Glasgow); Jan Paul Siebert
(University of Glasgow) |
Embodied vision:
Active agents; simulation |
69 |
828 |
SCAAT: Improving
Neural Network Interpretability via Saliency Constrained Adaptive Adversarial
Training |
Rui Xu (Peking
University); Wenkang Qin (Peking University); Peixiang Huang (Peking
University); Hao Wang (National Institutes for Food and Drug Control); Lin
Luo (Peking University)* |
Explainable AI |
70 |
188 |
Diverse Explanations
for Object Detectors with Nesterov-Accelerated iGOS++ |
Mingqi Jiang (Oregon
State University)*; Saeed Khorram (Oregon State University); Li Fuxin (Oregon
State University) |
Explainable AI |
71 |
40 |
Learning a Pedestrian
Social Behavior Dictionary |
Faith M Johnson
(Rutgers University)*; Kristin Dana (Rutgers University) |
Explainable AI |
72 |
207 |
Embedding Human
Knowledge into Spatio-Temporal Attention Branch Network in Video Recognition
via Temporal attention |
Saki Noguchi (Chubu
University)*; Yuzhi Shi (Chubu University); Tsubasa Hirakawa (Chubu
University); Takayoshi Yamashita (Chubu University); Hironobu Fujiyoshi
(Chubu University) |
Explainable AI |
73 |
761 |
Laughing Matters:
Introducing Audio-Driven Laughing-Face Generation with Diffusion Models |
Antoni Bigata
Casademunt (Imperial College London)*; Rodrigo Mira (Imperial College
London); Nikita Drobyshev (Imperial College London); Konstantinos Vougioukas
(Imperial College London); Stavros Petridis (Imperial College London); Maja
Pantic (Facebook / Imperial College London) |
Faces and gestures |
74 |
146 |
Learning Separable
Hidden Unit Contributions for Speaker-Adaptive Visual Speech Recognition |
Songtao Luo
(Institute of Computing Technology, Chinese Academy of Sciences)*; Shuang
Yang (ICT, CAS); Shiguang Shan (Institute of Computing Technology, Chinese
Academy of Sciences); Xilin Chen (Institute of Computing Technology, Chinese
Academy of Sciences) |
Faces and gestures |
75 |
190 |
UniLip: Learning
Visual-Textual Mapping with Uni-Modal Data for Lip Reading |
Bingquan Xia
(Institute of Computing Technology, Chinese Academy of Sciences)*; Shuang
Yang (ICT, CAS); Shiguang Shan (Institute of Computing Technology, Chinese
Academy of Sciences); Xilin Chen (Institute of Computing Technology, Chinese
Academy of Sciences) |
Faces and gestures |
76 |
470 |
KFC: Kinship
Verification with Fair Contrastive loss and Multi-Task Learning |
Jia Luo Peng
(National Tsing Hua University)*; Keng Wei Chang (National Tsing Hua
University); Shang-Hong Lai (National Tsing Hua University) |
Faces and gestures |
77 |
98 |
Prompting
Visual-Language Models for Dynamic Facial Expression Recognition |
Zengqun Zhao (Queen
Mary University of London)*; Ioannis Patras (Queen Mary University of London) |
Faces and gestures |
78 |
287 |
EventFormer: AU Event
Transformer for Facial Action Unit Event Detection |
Yingjie Chen (Peking
University)*; Jiarui Zhang (Peking University); Tao Wang (Peking University);
Yun Liang (Peking University) |
Faces and gestures |
79 |
230 |
De-identification of
facial videos while preserving remote physiological utility |
Marko Radisa Savic
(University of Oulu)*; Guoying Zhao (University of Oulu) |
Fairness, privacy,
ethics, social-good, transparency, accountability in vision |
80 |
629 |
Biased Attention: Do
Vision Transformers Amplify Gender Bias More than Convolutional Neural
Networks? |
Abhishek Mandal
(Dublin City University)*; Susan Leavy (University College Dublin); Suzanne
Little (Dublin City University, Ireland) |
Fairness, privacy,
ethics, social-good, transparency, accountability in vision |
81 |
799 |
Discriminative
Adversarial Privacy: Balancing Accuracy and Membership Privacy in Neural
Networks |
Eugenio Lomurno
(Politecnico di Milano)*; Alberto Archetti (Politecnico di Milano); Francesca
Ausonio (Politecnico di Milano); Matteo Matteucci (Politecnico di Milano) |
Fairness, privacy,
ethics, social-good, transparency, accountability in vision |
82 |
143 |
Sparse and
Privacy-enhanced Representation for Human Pose Estimation |
Ting-Ying Lin
(National Tsing Hua University)*; Lin-Yung Hsieh (National Tsing Hua
University); Fu-En Wang (National Tsing Hua University); Wen-Shen Wuen
(Novatek Microelectronics Corp.); Min Sun (NTHU) |
Human pose/shape
estimation |
83 |
763 |
BoIR: Box-Supervised
Instance Representation for Multi Person Pose Estimation |
Uyoung Jeong (Ulsan
National Institute of Science and Technology)*; Seungryul Baek (UNIST); Hyung
Jin Chang (University of Birmingham); Kwang In Kim (POSTECH) |
Human pose/shape
estimation |
84 |
664 |
Stream-based Active
Learning by Exploiting Temporal Properties in Perception with Temporal
Predicted Loss |
Sebastian Schmidt
(BMW)*; Stephan Günnemann (Technical University of Munich) |
Human-in-the-loop
computer vision |
85 |
798 |
Cascade Sparse
Feature Propagation Network for Interactive Segmentation |
Chuyu Zhang (PLUS
Lab, ShanghaiTech University)*; Hui Ren (ShanghaiTech); ChuanYang Hu (PLUS
Lab, ShanghaiTech University); Yongfei Liu (ShanghaiTech); Xuming He
(ShanghaiTech University) |
Human-in-the-loop
computer vision |
86 |
337 |
Active Learning for
Fine-Grained Sketch-Based Image Retrieval |
Himanshu Thakur
(Carnegie Mellon University); Soumitri Chattopadhyay (Jadavpur University)* |
Human-in-the-loop
computer vision |
87 |
718 |
Self-supervised
Adversarial Training for Robust Face Forgery Detection |
Yueying Gao
(Communication University of China)*; Weiguo Lin (Communication University of
China); Junfeng Xu (Communication University of China); Wanshan Xu
(Communication University of China); Peibin Chen (Communication University of
China) |
Image and video
forensics |
88 |
379 |
Test-Time Adaptation
for Robust Face Anti-Spoofing |
Pei-Kai Huang
(National Tsing Hua University)*; Chen-Yu Lu (National Tsing Hua University);
Shu-Jung Chang (National Tsing Hua University); Jun-Xiong Chong (National
Tsing Hua University); Chiou-Ting Hsu (National Tsing Hua University) |
Image and video
forensics |
89 |
659 |
Open Set Synthetic
Image Source Attribution |
Shengbang Fang
(Drexel University)*; Tai D Nguyen (Drexel University); Matthew C Stamm
(Drexel University) |
Image and video
forensics |
90 |
595 |
Face Aging via
Diffusion-based Editing |
Xiangyi Chen (Télécom
Paris, Shanghai Jiao Tong University)*; Stéphane Lathuilière (Telecom-Paris) |
Image and Video
Synthesis |
91 |
236 |
Temporal-controlled
Frame Swap for Generating High-Fidelity Stereo Driving Data for Autonomy
Analysis |
Yedi Luo
(Northeastern University); Xiangyu Bai (Northeastern University); Jiang Le
(Northeastern University); Aniket Gupta (Northeastern University); Eric C
Mortin (US Army DEVCOM Analysis Center); Hanumant Singh (Northeastern
University); Sarah Ostadabbas (Northeastern University)* |
Image and Video
Synthesis |
92 |
16 |
VETIM: Expanding the
Vocabulary of Text-to-Image Models only with Text |
Martin Nicolas
Everaert (EPFL)*; Radhakrishna Achanta (EPFL); Marco Bocchio (Largo.ai); Sami
Arpa (Largo.ai); Sabine Süsstrunk (EPFL) |
Image and Video
Synthesis |
93 |
258 |
A Structure-Guided
Diffusion Model for Large-Hole Image Completion |
Daichi Horita (The
University of Tokyo)*; Jiaolong Yang (Microsoft Research); Dong Chen
(Microsoft Research Asia); Yuki Koyama (National Institute of Advanced
Industrial Science and Technology (AIST)); Kiyoharu Aizawa (The University of
Tokyo); Nicu Sebe (University of Trento) |
Image and Video
Synthesis |
94 |
103 |
Video Infilling with
Rich Motion Prior |
Xinyu Hou (Nanyang
Technological University)*; Liming Jiang (Nanyang Technological University);
Rui Shao (Harbin Institute of Technology (Shenzhen)); Chen Change Loy
(Nanyang Technological University) |
Image and Video
Synthesis |
95 |
274 |
Frequency-consistent
Optimization for Image Enhancement Networks |
Bing Li (University
of Science and Technology of China)*; Naishan Zheng (University of Science
and Technology of China); Qi Zhu (University of Science and Technology of
China); Jie Huang (University of Science and Technology of China); Feng Zhao
(University of Science and Technology of China) |
Low-level and
Physics-based Vision |
96 |
46 |
Joint Low-light
Enhancement and Super Resolution with Image Underexposure Level Guidance |
Mingjie Xu (Beihang
University); Chaoqun Zhuang (Beihang University); Feifan Lv (Beihang
University); Feng Lu (Beihang University)* |
Low-level and
Physics-based Vision |
97 |
674 |
Estimating Absorption
Coefficient from a Single Image via Entropy Minimization |
Junya Katahira
(Kyushu Institute of Technology); Ryo Kawahara (Kyushu Institute of
Technology); Takahiro Okabe (Kyushu Institute of Technology)* |
Low-level and
Physics-based Vision |
98 |
149 |
Five A+ Network: You
Only Need 9K Parameters for Underwater Image Enhancement |
JingXia Jiang (Jimei
University); Tian Ye (The Hong Kong University of Science and Technology
(Guangzhou))*; Sixiang Chen (The Hong Kong University of Science and
Technology (Guangzhou)); Erkang Chen (Jimei University); Yun Liu (Southwest
University); Shi Jun (XinJiang University); Jinbin Bai (Nanjing University);
Wenhao Chai (University of Washington) |
Low-level and
Physics-based Vision |
99 |
775 |
Towards Clip-Free
Quantized Super-Resolution Networks: How to Tame Representative Images |
Alperen Kalay
(Aselsan Research)*; Bahri Batuhan Bilecen (Aselsan Research); Mustafa
Ayazoglu (Aselsan Research) |
Low-level and
Physics-based Vision |
100 |
358 |
RawSeg: Grid Spatial
and Spectral Attended Semantic Segmentation Based on Raw Bayer Images |
Guoyu Lu (University
of Georgia)* |
Low-level and
Physics-based Vision |
101 |
635 |
Log RGB Images
Provide Invariance to Intensity and Color Balance Variation for Convolutional
Networks |
Bruce A Maxwell
(Northeastern University)*; Sumegha Singhania (Northeastern University);
Heather Fryling (Northeastern University); Haonan Sun (Northeastern
University) |
Low-level and
Physics-based Vision |
102 |
283 |
MG-MLP: Multi-gated
MLP for Restoring Images from Spatially Variant Degradations |
Jaihyun Koh (Samsung
Display)*; Jaihyun Lew (Seoul National University); Jangho Lee (Incheon
National University); Sungroh Yoon (Seoul National University) |
Low-level and
Physics-based Vision |
103 |
538 |
Learning Disentangled
Representations for Environment Inference in Out-of-distribution
Generalization |
Dongqi Li (Beijing
Jiaotong University); Zhu Teng (Beijing Jiaotong University); Li Qirui
(AFCtech); Wang Ziyin (AFCtech); Baopeng Zhang (BJTU)*; Jianping Fan (Lenovo) |
Machine learning
(other than deep learning) |
104 |
750 |
A2V: A
Semi-Supervised Domain Adaptation Framework for Brain Vessel Segmentation via
Two-Phase Training Angiography-to-Venography Translation |
Francesco Galati
(EURECOM)*; Daniele Falcetta (EURECOM); Rosa Cortese (University of Siena);
Barbara Casolla (CHU Nice); Ferran Prados (University College London); Ninon
Burgos (CNRS - Paris Brain Institute); Maria A. Zuluaga (EURECOM) |
Medical and
biological vision; cell microscopy |
105 |
409 |
AGMDT: Virtual
Staining of Renal Histology Images with Adjacency-Guided Multi-Domain
Transfer |
Tao Ma (Peking
University)*; Chao Zhang (Peking University); Min Lu (Peking University
Health Science Center); Lin Luo (Peking University) |
Medical and
biological vision; cell microscopy |
106 |
84 |
Spatial and Planar
Consistency for Semi-Supervised Volumetric Medical Image Segmentation |
Yanfeng Zhou
(Institute of Automation, Chinese Academy of Sciences); Yiming Huang
(Institute of Automation, Chinese Academy of Sciences); Ge Yang (National
Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy
of Sciences)* |
Medical and
biological vision; cell microscopy |
107 |
699 |
Variational
Autoencoders for Feature Exploration and Malignancy Prediction of Lung
Lesions |
Ben Keel (University
of Leeds)*; Samuel D. Relton (University of Leeds); Aaron Quyn (University of
Leeds); David Jayne (University of Leeds) |
Medical and
biological vision; cell microscopy |
108 |
99 |
ReSynthDetect: A
Fundus Anomaly Detection Network with Reconstruction and Synthetic Features |
Jingqi Niu (Shanghai
Jiaotong University)*; Qinji Yu (Shanghai Jiao Tong University); Shiwen Dong
(Shanghai Jiao Tong University); Zilong Wang (Voxelcloud); Kang Dang
(Voxelcloud Inc); Xiaowei Ding (Shanghai Jiao Tong University) |
Medical and
biological vision; cell microscopy |
109 |
411 |
LACFormer: Toward
accurate and efficient polyp segmentation |
Quan Van Nguyen
(R&D Lab, Sun Asterisk Inc Vietnam)*; Mai Nguyen (R&D Lab, Sun
Asterisk Inc Vietnam); Thanh Tung Nguyen (Sun Asterisk Vietnam); Huy Trịnh
Quang (Sun-asterisk); Toan Pham Van (R&D Lab, Sun Asterisk Inc Vietnam);
Linh Bao Doan (Sun* Inc.) |
Medical and
biological vision; cell microscopy |
110 |
789 |
Multi-Stain
Self-Attention Graph Multiple Instance Learning Pipeline for Histopathology
Whole Slide Images |
Amaya Gallagher-Syed
(Queen Mary University of London)*; Luca Rossi (The Hong Kong Polytechnic
University); Felice Rivellese (Queen Mary University of London); Costantino
Pitzalis (Queen Mary University of London); Myles Lewis (Queen Mary
University of London); Michael Barnes (Queen Mary University of London);
Gregory Slabaugh (Queen Mary University of London) |
Medical and
biological vision; cell microscopy |
111 |
462 |
Enhance Regional Wall
Segmentation by Style Transfer for Regional Wall Motion Assessment |
Kaikai Liu (Northwest
A&F University); Yiyu Shi (University of Notre Dame); Jian Zhuang
(Guangdong Provincial People's Hospital); Meiping Huang (Guangdong Provincial
People's Hospital); Hongwen Fei (Guangdong Provincial People's Hospital);
Boyang Li (Meta); Jin Hong (Guangdong Provincial People's Hospital); Qing Lu
(University of Notre Dame); Erlei Zhang (Northwest A&F University);
Xiaowei Xu (Guangdong Provincial People's Hospital)* |
Medical and
biological vision; cell microscopy |
112 |
467 |
Cross-Modal Attention
for Accurate Pedestrian Trajectory Prediction |
Mayssa ZAIER (IMT
NORD EUROPE)*; Hazem Wannous (University of Lille); Hassen Drira (University
of Strasbourg); Jacques Boonaert (IMT Lille Douai) |
Motion estimation and
tracking |
113 |
603 |
READMem: Robust
Embedding Association for a Diverse Memory in Unconstrained Video Object
Segmentation |
Stephane Vujasinovic
(Fraunhofer IOSB)*; Sebastian W Bullinger (Fraunhofer IOSB); Stefan Becker
(Fraunhofer IOSB); Norbert Scherer-Negenborn (Fraunhofer IOSB); Michael Arens
(Fraunhofer IOSB); Rainer Stiefelhagen (Karlsruhe Institute of Technology) |
Motion estimation and
tracking |
114 |
441 |
EgoFlowNet: Non-Rigid
Scene Flow from Point Clouds with Ego-Motion Support |
Ramy Battrawy
(DFKI)*; René Schuster (DFKI); Didier Stricker (DFKI) |
Motion estimation and
tracking |
115 |
813 |
Learning Tri-modal
Embeddings for Zero-Shot Soundscape Mapping |
Subash Khanal
(Washington University in Saint Louis)*; Srikumar Sastry (Washington
University in St. Louis); Aayush Dhakal (Washington University in St Louis);
Nathan Jacobs (Washington University in St. Louis) |
Multimodal learning |
116 |
624 |
Text-to-Motion
Synthesis using Discrete Diffusion Model |
Ankur Chemburkar (USC
Institute for Creative Technologies)*; Shuhong Lu (USC Institute for Creative
Technologies); Andrew Feng (USC Institute for Creative Technologies) |
Multimodal learning |
117 |
460 |
AMA: Adaptive Memory
Augmentation for Enhancing Image Captioning |
Shuang Cheng
(Institute of Computing Technology, Chinese Academy of Sciences)*; Jian Ye
(Institute of Computing Technology, CAS) |
Multimodal learning |
118 |
326 |
X-PDNet: Accurate
Joint Plane Instance Segmentation and Monocular Depth Estimation with
Cross-Task Attention and Boundary Correction |
Duc Cao Dinh
(Computer Vision Lab, Hanyang University)*; Jongwoo Lim (Hanyang University) |
Multimodal learning |
119 |
381 |
Zero-shot Composed
Text-Image Retrieval |
Yikun Liu (Beijing
University of Posts and Telecommunications)*; Jiangchao Yao (Cooperative
Medianet Innovation Center, Shanghai Jiao Tong University); Yan-Feng Wang
(Cooperative Medianet Innovation Center of Shanghai Jiao Tong University); Ya
Zhang (Cooperative Medianet Innovation Center, Shanghai Jiao Tong
University); Weidi Xie (Shanghai Jiao Tong University) |
Multimodal learning |
120 |
666 |
E2SAM: A Pipeline for
Efficiently Extending SAM's Capability on Cross-Modality Data via Knowledge
Inheritance |
Su sundingkai
(Beijing University of Posts and Telecommunications); Mengqiu Xu (Beijing
University of Posts and Telecommunications); Kaixin Chen (Beijing University
of Posts and Telecommunications); Ming Wu (Beijing University of Posts and
Telecommunications)*; Chuang Zhang (Beijing University of Posts and
Telecommunications) |
Multimodal learning |
121 |
478 |
Conditional
Generation from Pre-Trained Diffusion Models using Denoiser Representations |
Alexandros Graikos
(Stony Brook University)*; Srikar Yellapragada (Stony Brook University);
Dimitris Samaras (Stony Brook University) |
Neural generative
models |
122 |
715 |
Bridging the Gap:
Enhancing the Utility of Synthetic Data via Post-Processing Techniques |
Eugenio Lomurno
(Politecnico di Milano)*; Andrea Lampis (Politecnico di Milano); Matteo
Matteucci (Politecnico di Milano) |
Neural generative
models |
123 |
787 |
TD-GEM: Text-Driven
Garment Editing Mapper |
Reza Dadfar (KTH);
Sanaz Sabzevari (KTH University)*; Marten Bjorkman (KTH); Danica Kragic (KTH
Royal Institute of Technology) |
Neural generative
models |
124 |
243 |
Class-Continuous
Conditional Generative Neural Radiance Field |
Jiwook Kim (Chung-Ang
University)*; Minhyeok Lee (Chung-Ang University) |
Neural generative
models |
125 |
22 |
Locality-Aware
Hyperspectral Classification |
Fangqin Zhou
(Technology University of Eindhoven); Mert Kilickaya (Eindhoven University of
Technology)*; Joaquin Vanschoren (Eindhoven University of Technology) |
Photogrammetry and
remote sensing |
126 |
4 |
Instance Mask Growing
on Leaf |
Chuang Yang
(Northwestern Polytechnical University); Haozhao Ma (Northwestern
Polytechnical University); Qi Wang (Northwestern Polytechnical University)* |
Recognition:
Categorization and Instance recognition |
127 |
135 |
Infinite Class Mixup |
Thomas Mensink
(Google Research); Pascal Mettes (University of Amsterdam)* |
Recognition:
Categorization and Instance recognition |
128 |
375 |
Building A Mobile
Text Recognizer via Truncated SVD-based Knowledge Distillation-Guided NAS |
Weifeng Lin (South
China University of Technology); Canyu Xie (South China University of
Technology); Dezhi Peng (South China University of Technology); Jiapeng Wang
(South China University of Technology); Lianwen Jin (South China University
of Technology)*; Wei Ding (Alibaba Group); Cong Yao (Alibaba DAMO Academy);
Mengchao He (DAMO Academy, Alibaba Group) |
Recognition:
Categorization and Instance recognition |
129 |
119 |
Integrating Transient
and Long-term Physical States for Depression Intelligent Diagnosis |
Ke Wu (Beihang
University); Han Jiang (Beihang University)*; Li Kuang (Beihang University);
Yixuan Wang (Beihang University); Huaiqian Ye (Beihang University); Yuanbo He
(State Key Laboratory of Virtual Reality Technology and Systems, Beihang
University) |
Recognition:
Categorization and Instance recognition |
130 |
320 |
Learning Unified
Representations for Multi-Resolution Face Recognition |
Hulingxiao He (School
of Automation, Beijing Institute of Technology); Wu Yuan (School of Computer
Science, Beijing Institute of Technology)*; Yidian Huang (Beijing Institute of
Technology); Shilong Zhao (Beijing Institute of Technology); Wen Yuan (State
Key Laboratory of Resources and Environmental Information System, Institute
of Geographic Sciences and Natural Resources Research, CAS); Hanqing Li
(University of the Chinese Academy of Sciences) |
Recognition:
Categorization and Instance recognition |
131 |
102 |
ReCoT: Regularized
Co-Training for Facial Action Unit Recognition with Noisy Labels |
Yifan Li (Michigan
State University); Hu Han (Institute of Computing Technology, Chinese Academy
of Sciences)*; Shiguang Shan (Institute of Computing Technology, Chinese
Academy of Sciences); zhilong ji (Tomorrow Advancing Life); Jinfeng Bai
(Tomorrow Advance Life); Xilin Chen (Institute of Computing Technology,
Chinese Academy of Sciences) |
Recognition:
Categorization and Instance recognition |
Thursday |
132 |
272 |
SMPLitex: A
Generative Model and Dataset for 3D Human Texture Estimation from Single
Image |
Dan Casas
(Universidad Rey Juan Carlos)*; Marc Comino Trinidad (Universidad Rey Juan
Carlos) |
3D from a single
image and shape-from-x |
133 |
800 |
Mobile Vision
Transformer-based Visual Object Tracking |
Goutam Yelluru Gopal
(Concordia University)*; Maria Amer (Concordia University) |
Object pose
estimation and tracking |
134 |
444 |
Semi-Supervised
Domain Generalization for Detection via Language-Guided Feature Alignment |
Sina Malakouti
(University of Pittsburgh)*; Adriana Kovashka (University of Pittsburgh) |
Recognition:
Detection |
135 |
94 |
Likelihood-based
Out-of-Distribution Detection with Denoising Diffusion Probabilistic Models |
Joseph S Goodier
(University of Bath)*; Neill Campbell (University of Bath) |
Recognition:
Detection |
136 |
323 |
Point-to-RBox Network
for Oriented Object Detection via Single Point Supervision |
Yucheng Wang (Wuhan
University)*; Chu He (Wuhan University); Xi Chen (Wuhan University) |
Recognition:
Detection |
137 |
310 |
Widely Applicable
Strong Baseline for Sports Ball Detection and Tracking |
Shuhei Tarashima (NTT
Communications Corporation)*; Norio Tagawa (Tokyo Metropolitan University);
Muhammad Abdul Haq (Tokyo Metropolitan University); Wang Yushan (Tokyo
Metropolitan University) |
Recognition:
Detection |
138 |
93 |
Open-Vocabulary
Object Detection with Meta Prompt Representation and Instance Contrastive
Optimization |
Zhao Wang (The
Chinese University of Hong Kong)*; Aoxue Li (Noah's Ark Lab); Fengwei Zhou
(Huawei Noah's Ark Lab); Zhenguo Li (Huawei Noah's Ark Lab); DOU QI (The
Chinese University of Hong Kong) |
Recognition:
Detection |
139 |
707 |
SWIN-RIND: Edge
Detection for Reflectance, Illumination, Normal and Depth Discontinuity with
Swin Transformer |
LUN MIAO (The
University of Tokyo)*; Takeshi Oishi (The University of Tokyo); Ryoichi
Ishikawa (The University of Tokyo) |
Recognition:
Detection |
140 |
33 |
Scale Adaptive
Network for Partial Person Re-identification: Counteracting Scale Variance |
HongYu Chen
(Northwestern Polytechnical University)*; BingLiang Jiao (Northwestern
Polytechnical University); Liying Gao (Northwestern Polytechnical
University); Peng Wang (Northwestern Polytechnical University) |
Recognition:
Retrieval |
141 |
608 |
Object-Centric
Open-Vocabulary Image-Retrieval with Sparse Features |
Hila Levi (General
Motors)*; Guy Heller (General Motors); Dan Levi (General Motors); Ethan
Fetaya (Bar Ilan University) |
Recognition:
Retrieval |
142 |
353 |
Adapting
Self-Supervised Representations to Multi-Domain Setups |
Neha Kalibhat
(University of Maryland - College Park)*; Sam Sharpe (Capital One); Jeremy
Goodsitt (Capital One); C. Bayan Bruss (Capital One); Soheil Feizi
(University of Maryland) |
Representation
Learning |
143 |
111 |
SeqCo-DETR: Sequence
Consistency Training for Self-Supervised Object Detection with Transformers |
Guoqiang Jin
(SenseTime Research)*; Fan Yang (Institute of Automation, Chinese Academy of Sciences); Mingshan Sun (SenseTime
Research); Ruyi Zhao (Tongji University); Yakun Liu (SenseTime Research);
Wei Li (SenseTime Research); Tianpeng Bao (SenseTime Research); Liwei Wu
(SenseTime Research); Xingyu ZENG (SenseTime Group Limited); Rui Zhao
(SenseTime Group Limited) |
Representation
Learning |
144 |
433 |
Variational
Autoencoders with Decremental Information Bottleneck for Disentanglement |
Jiantao Wu
(University of Surrey)*; Shentong Mo (Carnegie Mellon University); Xingshen
Zhang (University of Jinan); Muhammad Awais (University of Surrey); Sara
Ahmed (University of Surrey); Zhenhua Feng (University of Surrey); Lin Wang
(University of Jinan); Xiang Yang (Zhejiang Mingyi Technology Co., Ltd.) |
Representation
Learning |
145 |
351 |
Cross-domain Semantic
Decoupling for Weakly-Supervised Semantic Segmentation |
Zaiquan Yang (Beihang
University)*; Zhanghan Ke (City University of Hong Kong); Gerhard P. Hancke
(City University of Hong Kong); Rynson W.H. Lau (City University of Hong
Kong) |
Representation
Learning |
146 |
394 |
Unifying Synergies
between Self-supervised Learning and Dynamic Computation |
Tarun Krishna (DCU)*;
Ayush K. Rai (Dublin City University); Eric Arazo (Insight Centre for Data
Analytics (DCU)); Paul Albert (Insight Centre for Data Analytics (DCU));
Alexandru F Drimbarean (Xperi); Alan Smeaton (Insight Centre for Data
Analytics, Dublin City University); Kevin McGuinness (DCU); Noel O Connor
(Home) |
Representation
Learning |
147 |
226 |
PanoMixSwap –
Panorama Mixing via Structural Swapping for Indoor Scene Understanding |
Yu-Cheng Hsieh
(National Tsing Hua University)*; Cheng Sun (National Tsing Hua University);
Suraj Dengale (National Tsing Hua University); Min Sun (NTHU) |
Scene Analysis and
Understanding |
148 |
499 |
Clustered Saliency
Prediction |
Rezvan Sherkati
(McGill University)*; James J. Clark (McGill University) |
Scene Analysis and
Understanding |
149 |
77 |
One-stage Progressive
Dichotomous Segmentation |
Jing Zhu (Samsung
Research America)*; Karim Ahmed (Samsung Research America); Wenbo Li (Samsung
Research America); Yilin Shen (Samsung Research America); Hongxia Jin
(Samsung Research America) |
Segmentation,
grouping and shape analysis |
150 |
81 |
Towards Robust
Few-shot Point Cloud Semantic Segmentation |
Yating Xu (National
University of Singapore)*; Na Zhao (SUTD); Gim Hee Lee (National University
of Singapore) |
Segmentation,
grouping and shape analysis |
151 |
815 |
Text and Click inputs
for unambiguous open vocabulary instance segmentation |
Vighnesh N Birodkar
(Google)*; Jonathan Huang (Google); Meera Hahn (Google); Irfan Essa (Georgia
Institute of Technology); Nikolai Warner (Georgia Tech) |
Segmentation,
grouping and shape analysis |
152 |
868 |
Multi-Scale Cross
Contrastive Learning for Semi-Supervised Medical Image Segmentation |
Qianying Liu
(University of Glasgow)*; Xiao Gu (Imperial College London); Paul Henderson
(University of Glasgow); Fani Deligianni (University of Glasgow) |
Segmentation,
grouping and shape analysis |
153 |
623 |
Superpixel Positional
Encoding to Improve ViT-based Semantic Segmentation Models |
Roberto Amoroso
(University of Modena and Reggio Emilia)*; Matteo Tomei (Prometeia); Lorenzo
Baraldi (University of Modena and Reggio Emilia); Rita Cucchiara (Università
di Modena e Reggio Emilia) |
Segmentation,
grouping and shape analysis |
154 |
767 |
Label-guided
Real-time Fusion Network for RGB-T Semantic Segmentation |
Zengrong Lin (Sun
Yat-sen University); Baihong Lin (University of Electronic Science and
Technology of China)*; Yulan Guo (Sun Yat-sen University) |
Segmentation,
grouping and shape analysis |
155 |
523 |
SHLS: Superfeatures
learned from still images for self-supervised VOS |
Marcelo M Santos
(UFBA)*; Jefferson Fontinele da Silva (University Federal of Maranhão);
Luciano Oliveira (UFBA) |
Segmentation,
grouping and shape analysis |
156 |
530 |
AutoSAM: Adapting SAM
to Medical Images by Overloading the Prompt Encoder |
Tal Shaharbany (Tel
Aviv University)*; Aviad Dahan (Tel Aviv University); Raja Giryes (Tel
Aviv University); Lior Wolf (Tel Aviv University, Israel) |
Segmentation,
grouping and shape analysis |
157 |
719 |
EyeGuide - From Gaze
Data to Instance Segmentation |
Jacqueline Kockwelp
(University of Münster); Joerg Gromoll (CeRA); Joachim Wistuba (Centre of
Reproductive Medicine and Andrology); Benjamin Risse (University of Münster)* |
Segmentation,
grouping and shape analysis |
158 |
908 |
Class-Imbalanced
Semi-Supervised Learning with Inverse Auxiliary Classifier |
Tiansong Jiang
(Nanjing University of Science and Technology)*; Sheng Wan (Nanjing
University of Science and Technology); Chen Gong (Nanjing University of
Science and Technology) |
Self-, semi-, meta-,
unsupervised learning |
159 |
899 |
C3: Cross-instance
guided Contrastive Clustering |
Mohammadreza Sadeghi
(McGill University); Hadi Hojjati (McGill University); Narges Armanfard
(McGill University; Mila - Quebec AI Institute)* |
Self-, semi-, meta-,
unsupervised learning |
160 |
676 |
BFC-BL: Few-Shot
Classification and Segmentation combining Bi-directional Feature Correlation
and Boundary constraint |
Haibiao Yang
(Guangdong University of Technology)*; Zeng Bi (Guangdong University of
Technology); Pengfei Wei (Guangdong University of Technology); Jianqi Liu
(Guangdong University of Technology) |
Self-, semi-, meta-,
unsupervised learning |
161 |
259 |
Prototype-Aware
Contrastive Knowledge Distillation for Few-Shot Anomaly Detection |
Zhihao Gu (Shanghai
Jiao Tong University)*; Taihai Yang (East China Normal University); Lizhuang
Ma (Shanghai Jiao Tong University) |
Self-, semi-, meta-,
unsupervised learning |
162 |
837 |
Domain-Adaptive
Semantic Segmentation with Memory-Efficient Cross-Domain Transformers |
Ruben Mascaro (ETH
Zurich)*; Lucas Teixeira (ETH Zurich); Margarita Chli (ETH Zurich) |
Self-, semi-, meta-,
unsupervised learning |
163 |
117 |
Detect, Augment,
Compose, and Adapt: Four Steps for Unsupervised Domain Adaptation in Object
Detection |
Mohamed Lamine
Mekhalfi (Fondazione Bruno Kessler)*; Davide Boscaini (Fondazione Bruno
Kessler); Fabio Poiesi (Fondazione Bruno Kessler) |
Self-, semi-, meta-,
unsupervised learning |
164 |
215 |
Hierarchical
Quantization Consistency for Fully Unsupervised Image Retrieval |
Guile Wu (Noah’s Ark
Lab); Chao Zhang (Toshiba Europe Limited)*; Stephan Liwicki (Toshiba Europe
Limited) |
Self-, semi-, meta-,
unsupervised learning |
165 |
297 |
Exploring the Limits
of Deep Image Clustering using Pretrained Models |
Nikolas Adaloglou
(HHU)*; Felix Michels (HHU); Hamza Kalisch (HHU); Markus Kollmann (HHU) |
Self-, semi-, meta-,
unsupervised learning |
166 |
471 |
Enhancing
Interpretable Object Abstraction via Clustering-based Slot Initialization |
Ning Gao (Bosch
Center for Artificial Intelligence (BCAI))*; Bernard Hohmann (Karlsruhe
Institute of Technology); Gerhard Neumann (Karlsruhe Institute of Technology
(KIT), Karlsruhe, Germany) |
Self-, semi-, meta-,
unsupervised learning |
167 |
240 |
StereoFlowGAN:
Co-training for Stereo and Flow with Unsupervised Domain Adaptation |
Zhexiao Xiong
(Washington University in St. Louis)*; Feng Qiao (RWTH Aachen University); Yu
Zhang (Bastian Solutions); Nathan Jacobs (Washington University in St. Louis) |
Self-, semi-, meta-,
unsupervised learning |
168 |
633 |
Multi-Target Domain
Adaptation with Class-Wise Attribute Transfer in Semantic Segmentation |
Changjae Kim (DGIST);
Seunghun Lee (DGIST)*; Sunghoon Im (DGIST) |
Transfer, low-shot,
continual, long-tail learning |
169 |
858 |
Weakly-supervised
Spatially Grounded Concept Learner for Few-Shot Learning |
Gaurav Bhatt (The
University of British Columbia)*; Deepayan Das (IIT-H); Leonid Sigal
(University of British Columbia); Vineeth N Balasubramanian (Indian Institute
of Technology, Hyderabad) |
Transfer, low-shot,
continual, long-tail learning |
170 |
12 |
RestNet: Boosting
Cross-Domain Few-Shot Segmentation with Residual Transformation Network |
Xinyang Huang
(Beijing University of Posts and Telecommunications)*; Chuang Zhu (Beijing
University of Posts and Telecommunications); Wenkai Chen (Beijing University
of Posts and Telecommunications) |
Transfer, low-shot,
continual, long-tail learning |
171 |
18 |
Random Word Data
Augmentation with CLIP for Zero-Shot Anomaly Detection |
Masato Tamura
(Hitachi America, Ltd.)* |
Transfer, low-shot,
continual, long-tail learning |
172 |
202 |
Few-Shot Anomaly
Detection with Adversarial Loss for Robust Feature Representations |
Jae Young Lee
(KAIST)*; Wonjun Lee (University of Science and Technology); Jaehyun Choi
(KAIST); Yongkwi LEE (ETRI); Young Seog Yoon (Electronics and
Telecommunications Research Institute) |
Transfer, low-shot,
continual, long-tail learning |
173 |
330 |
Fine-grained Few-shot
Recognition by Deep Object Parsing |
Ruizhao Zhu (Boston
University)*; Pengkai Zhu (Amazon Web Services); Samarth Mishra (Boston
University); Venkatesh Saligrama (Boston University) |
Transfer, low-shot,
continual, long-tail learning |
174 |
762 |
Novel Regularization
via Logit Weight Repulsion for Long-Tailed Classification |
Taegil Ha (Seoul
National University)*; Seulki Park (Seoul National University); Jin Young
Choi (Seoul National University) |
Transfer, low-shot,
continual, long-tail learning |
175 |
292 |
Generating
Pseudo-labels Adaptively for Few-shot Model-Agnostic Meta-Learning |
Guodong Liu (Huazhong
University of Science and Technology); Tongling Wang (Huazhong University of
Science and Technology); Shuoxi Zhang (Huazhong University of Science and
Technology); Kun He (Huazhong University of Science and Technology)* |
Transfer, low-shot,
continual, long-tail learning |
176 |
452 |
Domain-Aware
Augmentations for Unsupervised Online General Continual Learning |
Nicolas Michel
(LIGM)* |
Transfer, low-shot,
continual, long-tail learning |
177 |
534 |
Dual Feature
Augmentation Network for Generalization Zero-shot Learning |
Lei Xiang (Nanjing
University of Information Science and Technology)*; Yuan Zhou (Nanjing
University of Information Science and Technology); Haoran Duan (Durham
University); Yang Long (Durham University) |
Transfer, low-shot,
continual, long-tail learning |
178 |
264 |
Predictive
Consistency Learning for Long-Tailed Recognition |
Nan Kang (Key
Laboratory of Intelligent Information Processing of Chinese Academy of
Sciences (CAS))*; Hong Chang (Chinese Academy of Sciences); Bingpeng MA
(University of Chinese Academy of Sciences); Shutao Bai (Institute of
Computing Technology, Chinese Academy of Sciences); Shiguang Shan (Institute
of Computing Technology, Chinese Academy of Sciences); Xilin Chen (Institute
of Computing Technology, Chinese Academy of Sciences) |
Transfer, low-shot,
continual, long-tail learning |
179 |
542 |
Temporal-aware
Hierarchical Mask Classification for Video Semantic Segmentation |
Zhaochong An (ETH
Zurich); Guolei Sun (ETH Zurich)*; Zongwei WU (Univ. Bourgogne Franche-Comte,
France); Hao Tang (ETH Zurich); Luc Van Gool (ETH Zurich) |
Video analysis and
Understanding |
180 |
25 |
Motion and
Context-Aware Audio-Visual Conditioned Video Prediction |
Yating Xu (National
University of Singapore)*; Conghui Hu (National University of Singapore); Gim
Hee Lee (National University of Singapore) |
Vision and audio |
181 |
144 |
Dual Attention for
Audio-Visual Speech Enhancement with Facial Cues |
Fexiang Wang (ICT,
UCAS)*; Shuang Yang (ICT, CAS); Shiguang Shan (Institute of Computing
Technology, Chinese Academy of Sciences); Xilin Chen (Institute of Computing
Technology, Chinese Academy of Sciences) |
Vision and audio |
182 |
367 |
How Can Contrastive
Pre-training Benefit Audio-Visual Segmentation? A Study from Supervised and
Zero-shot Perspectives |
Jiarui Yu (USTC)*;
Haoran Li (University of Science and Technology of China); Yanbin Hao
(University of Science and Technology of China); Wu Jinmeng (Wuhan Institute
of Technology); Tong Xu (University of Science and Technology of China); Shuo
Wang (University of Science and Technology of China); Xiangnan He (University
of Science and Technology of China) |
Vision and audio |
183 |
139 |
Continuous Levels of
Detail for Light Field Networks |
David Li (University
of Maryland College Park)*; Brandon Yushan Feng (University of Maryland,
College Park); Amitabh Varshney (University of Maryland) |
Vision and graphics |
184 |
347 |
SRNet: Striped
Pyramid Pooling and Relational Transformer for Retinal Vessel Segmentation |
Wei Yan (College of
Computer Science and Engineering, Northwest Normal University)*; Yun Jiang
(College of Computer Science and Engineering, Northwest Normal University);
Zequn Zhang (Northwest Normal University); Yao Yan (College of Computer
Science and Engineering, Northwest Normal University); Bingxi Liu (Northwest
Normal University) |
Vision and graphics |
185 |
451 |
Complex Scene Image
Editing by Scene Graph Comprehension |
Zhongping Zhang
(Boston University)*; Huiwen He (Boston University); Bryan Plummer (Boston
University); Zhenyu Liao (Kwai Inc); Huayan Wang (Kuaishou Technology) |
Vision and language |
186 |
314 |
GOPro: Generate and
Optimize Prompts in CLIP using Self-Supervised Learning |
Mainak Singha (Indian
Institute of Technology Bombay)*; Ankit Jha (Indian Institute of Technology
Bombay); Biplab Banerjee (Indian Institute of Technology, Bombay) |
Vision and language |
187 |
182 |
BDC-Adapter: Brownian
Distance Covariance for Better Vision-Language Reasoning |
Yi Zhang (Southern
University of Science and Technology); Ce Zhang (Carnegie Mellon University);
Zihan Liao (Southern University of Science and Technology); Yushun Tang
(Southern University of Science and Technology); Zhihai He (Southern
University of Science and Technology)* |
Vision and language |
188 |
510 |
Open-world
Text-specified Object Counting |
Niki Amini-Naieni
(University of Oxford)*; Kiana Amini-Naieni (University of California,
Davis); Tengda Han (University of Oxford); Andrew Zisserman (University of
Oxford) |
Vision and language |
189 |
650 |
Towards Debiasing
Frame Length Bias in Text-Video Retrieval via Causal Intervention |
Burak Satar (Nanyang
Technological University)*; Hongyuan Zhu (Institute for Infocomm, Research
Agency for Science, Technology and Research (A*STAR) Singapore); Hanwang
Zhang (Nanyang Technological University); Joo-Hwee Lim (Institute for
Infocomm Research) |
Vision and language |
190 |
229 |
Weakly-Supervised
Visual-Textual Grounding with Semantic Prior Refinement |
Davide Rigoni
(University of Padua); Luca Parolari (University of Padova); Luciano Serafini
(Fondazione Bruno Kessler); Alessandro Sperduti (Università di Padova (IT));
Lamberto Ballan (University of Padova)* |
Vision and language |
191 |
596 |
Generating
Context-Aware Natural Answers for Questions in 3D Scenes |
Mohammed Munzer
Dwedari (Technical University of Munich)*; Matthias Niessner (Technical
University of Munich); Zhenyu Chen (Technical University of Munich) |
Vision and language |
192 |
378 |
Neural Feature
Filtering for Faster Structure-from-Motion Localisation |
Alexandros Rotsidis
(University of Bath)*; Yuxin Wang (École polytechnique fédérale de Lausanne);
Yiorgos Chrysanthou (CYENS Centre of Excellence); Christian Richardt (Meta) |
Vision and robotics |
193 |
647 |
Dictionary-Guided
Text Recognition for Smart Street Parking |
Deyang Zhong
(University of Washington Tacoma); Jiayu Li (University of Washington); Wei
Cheng (University of Washington); Juhua Hu (University of Washington)* |
Vision applications
and systems |
194 |
300 |
Contrastive
Consistent Representation Distillation |
Shipeng Fu (Sichuan
University)*; Haoran Yang (Sichuan University); Xiaomin Yang (Sichuan
University) |
Vision applications
and systems |
195 |
322 |
3D Structure-guided
Network for Tooth Alignment in 2D Photograph |
Yulong Dou
(Shanghaitech)*; Lanzhuju Mei (ShanghaiTech University); Zhiming Cui (HKU);
Dinggang Shen (United Imaging Intelligence) |
Vision applications
and systems |
196 |
376 |
Adapting Generic
Features to A Specific Task: A Large Discrepancy Knowledge Distillation for
Image Anomaly Detection |
Chenkai Zhang
(Zhejiang University)*; Tianqi Du (Zhejiang University); Yueming Wang
(Zhejiang University) |
Vision applications
and systems |
197 |
385 |
Personalized Fashion
Recommendation via Deep Personality Learning |
Dongmei Mo (The Hong
Kong Polytechnic University)*; Xingxing Zou (Laboratory for Artificial
Intelligence in Design, The Hong Kong Polytechnic University); Waikeung Wong
(Institute of Textiles and Clothing, The Hong Kong Polytechnic University) |
Vision applications
and systems |
198 |
480 |
Comprehensive
Quantitative Quality Assessment of Thermal Cut Sheet Edges using
Convolutional Neural Networks |
Janek Stahl
(Fraunhofer IPA)*; Marco Huber (University of Stuttgart); Andreas Frommknecht
(Fraunhofer IPA) |
Vision applications
and systems |
199 |
614 |
FRE: A Fast Method
For Anomaly Detection And Segmentation |
Ibrahima Ndiour
(Intel)*; Ergin U Genc (Intel); Nilesh A Ahuja (Intel); Omesh Tickoo (Intel) |
Vision applications
and systems |
200 |
161 |
Long Story Short: a
Summarize-then-Search Method for Prompt-Based Long Video Question Answering |
Jiwan Chung (Yonsei
University)*; Youngjae Yu (Yonsei University) |
Visual reasoning and
logical representation |