-
CVPR 2023 – Self-Supervised Video Forensics by Audio-Visual Anomaly Detection
In this episode we discuss Self-Supervised Video Forensics by Audio-Visual Anomaly Detection by Chao Feng, Ziyang Chen, Andrew Owens. The paper proposes a method for detecting inconsistencies between the visual and audio signals in manipulated videos using anomaly detection. The method trains an autoregressive model on real, unlabeled data to generate audio-visual feature sequences capturing…
-
CVPR 2023 – Foundation Model Drives Weakly Incremental Learning for Semantic Segmentation
In this episode we discuss Foundation Model Drives Weakly Incremental Learning for Semantic Segmentation by Chaohui Yu, Qiang Zhou, Jingliang Li, Jianlong Yuan, Zhibin Wang, Fan Wang. The paper proposes a novel and data-efficient framework for weakly incremental learning for semantic segmentation (WILSS) called FMWISS. WILSS aims to learn to segment new classes from cheap…
-
CVPR 2023 – Delving into Discrete Normalizing Flows on SO(3) Manifold for Probabilistic Rotation Modeling
In this episode we discuss Delving into Discrete Normalizing Flows on SO(3) Manifold for Probabilistic Rotation Modeling by Yulin Liu, Haoran Liu, Yingda Yin, Yang Wang, Baoquan Chen, He Wang. The paper proposes a new normalizing flow method for the SO(3) manifold, the space of 3D rotations, which is important in computer vision, graphics, and robotics but has…
-
CVPR 2023 – StepFormer: Self-supervised Step Discovery and Localization in Instructional Videos
In this episode we discuss StepFormer: Self-supervised Step Discovery and Localization in Instructional Videos by Nikita Dvornik, Isma Hadji, Ran Zhang, Konstantinos G. Derpanis, Animesh Garg, Richard P. Wildes, Allan D. Jepson. The paper introduces StepFormer, a self-supervised model that locates key-steps in instructional videos with no human supervision. Traditional methods require video-level human annotations,…
-
CVPR 2023 – SketchXAI: A First Look at Explainability for Human Sketches
In this episode we discuss SketchXAI: A First Look at Explainability for Human Sketches by Zhiyu Qu, Yulia Gryaditskaya, Ke Li, Kaiyue Pang, Tao Xiang, Yi-Zhe Song. The paper introduces human sketches to the landscape of Explainable Artificial Intelligence (XAI). The authors argue that sketch is a “human-centered” data form that represents a natural interface to…
-
CVPR 2023 – Filtering, Distillation, and Hard Negatives for Vision-Language Pre-Training
In this episode we discuss Filtering, Distillation, and Hard Negatives for Vision-Language Pre-Training by Filip Radenovic, Abhimanyu Dubey, Abhishek Kadian, Todor Mihaylov, Simon Vandenhende, Yash Patel, Yi Wen, Vignesh Ramanathan, Dhruv Mahajan. The paper discusses improvements to the contrastive pre-training pipeline for vision-language models used in zero-shot recognition problems. The authors propose a filtering strategy…
-
CVPR 2023 – Progressive Random Convolutions for Single Domain Generalization
In this episode we discuss Progressive Random Convolutions for Single Domain Generalization by Seokeon Choi, Debasmit Das, Sungha Choi, Seunghan Yang, Hyunsin Park, Sungrack Yun. The paper proposes a method called Progressive Random Convolution (Pro-RandConv) for single domain generalization, which aims to train a model with only one source domain to perform well on arbitrary…
-
CVPR 2023 – ACR: Attention Collaboration-based Regressor for Arbitrary Two-Hand Reconstruction
In this episode we discuss ACR: Attention Collaboration-based Regressor for Arbitrary Two-Hand Reconstruction by Zhengdi Yu, Shaoli Huang, Chen Fang, Toby P. Breckon, Jue Wang. The paper presents ACR, a new method for reconstructing two hands from monocular RGB images in arbitrary scenarios, addressing the challenges posed by occlusions and mutual confusion. Unlike existing methods,…
-
CVPR 2023 – Instant Domain Augmentation for LiDAR Semantic Segmentation
In this episode we discuss Instant Domain Augmentation for LiDAR Semantic Segmentation by Kwonyoung Ryu, Soonmin Hwang, Jaesik Park.
-
CVPR 2023 – MOTRv2: Bootstrapping End-to-End Multi-Object Tracking by Pretrained Object Detectors
In this episode we discuss MOTRv2: Bootstrapping End-to-End Multi-Object Tracking by Pretrained Object Detectors by Yuang Zhang, Tiancai Wang, Xiangyu Zhang. The paper proposes a new pipeline, called MOTRv2, that improves end-to-end multi-object tracking by incorporating an extra object detector. The pipeline first adopts an anchor formulation of queries and then uses the detector to…
-
CVPR 2023 – Divide and Conquer: Answering Questions with Object Factorization and Compositional Reasoning
In this episode we discuss Divide and Conquer: Answering Questions with Object Factorization and Compositional Reasoning by Shi Chen, Qi Zhao. The paper proposes a new framework for visual reasoning inspired by human reasoning, which addresses the limitations of current methods. Existing methods rely on statistical priors and struggle with novel objects or biased question-answer…
-
CVPR 2023 – 3D-Aware Object Goal Navigation via Simultaneous Exploration and Identification
In this episode we discuss 3D-Aware Object Goal Navigation via Simultaneous Exploration and Identification by Jiazhao Zhang, Liu Dai, Fanpeng Meng, Qingnan Fan, Xuelin Chen, Kai Xu, He Wang. The paper proposes a framework for object goal navigation in 3D environments using two sub-policies: a corner-guided exploration policy and a category-aware identification policy. Unlike other approaches…
-
CVPR 2023 – GP-VTON: Towards General Purpose Virtual Try-on via Collaborative Local-Flow Global-Parsing Learning
In this episode we discuss GP-VTON: Towards General Purpose Virtual Try-on via Collaborative Local-Flow Global-Parsing Learning by Zhenyu Xie, Zaiyu Huang, Xin Dong, Fuwei Zhao, Haoye Dong, Xijin Zhang, Feida Zhu, Xiaodan Liang. The paper proposes a General-Purpose Virtual Try-ON framework, named GP-VTON, for transferring a garment onto a specific person. The proposed framework addresses…
-
CVPR 2023 – StyleAdv: Meta Style Adversarial Training for Cross-Domain Few-Shot Learning
In this episode we discuss StyleAdv: Meta Style Adversarial Training for Cross-Domain Few-Shot Learning by Yuqian Fu, Yu Xie, Yanwei Fu, Yu-Gang Jiang. The paper proposes a novel model-agnostic meta Style Adversarial training (StyleAdv) method for Cross-Domain Few-Shot Learning (CD-FSL), a task that aims to transfer prior knowledge learned on a source dataset to novel…
-
CVPR 2023 – Learning Anchor Transformations for 3D Garment Animation
In this episode we discuss Learning Anchor Transformations for 3D Garment Animation by Fang Zhao, Zekun Li, Shaoli Huang, Junwu Weng, Tianfei Zhou, Guo-Sen Xie, Jue Wang, Ying Shan. The paper presents a new anchor-based deformation model called AnchorDEF, which predicts 3D garment animation from a body motion sequence. The model deforms a garment mesh…
-
CVPR 2023 – OrienterNet: Visual Localization in 2D Public Maps with Neural Matching
In this episode we discuss OrienterNet: Visual Localization in 2D Public Maps with Neural Matching by Paul-Edouard Sarlin, Daniel DeTone, Tsun-Yi Yang, Armen Avetisyan, Julian Straub, Tomasz Malisiewicz, Samuel Rota Bulo, Richard Newcombe, Peter Kontschieder, Vasileios Balntas. The paper introduces OrienterNet, a deep neural network that can localize an image with sub-meter accuracy using 2D…
-
CVPR 2023 – NAR-Former: Neural Architecture Representation Learning towards Holistic Attributes Prediction
In this episode we discuss NAR-Former: Neural Architecture Representation Learning towards Holistic Attributes Prediction by Yun Yi, Haokui Zhang, Wenze Hu, Nannan Wang, Xiaoyu Wang. The paper proposes a neural architecture representation model that can be used to estimate attributes of different neural network architectures such as accuracy and latency without running actual training or…
-
CVPR 2023 – Boundary Unlearning
In this episode we discuss Boundary Unlearning by Min Chen, Weizhuo Gao, Gaoyang Liu, Kai Peng, Chen Wang. The paper proposes “Boundary Unlearning” as an efficient machine unlearning technique to enable deep neural networks (DNNs) to unlearn, or forget, a fraction of training data and its lineage. The proposed method focuses on the decision space…
-
CVPR 2023 – FreeSeg: Unified, Universal and Open-Vocabulary Image Segmentation
In this episode we discuss FreeSeg: Unified, Universal and Open-Vocabulary Image Segmentation by Jie Qin, Jie Wu, Pengxiang Yan, Ming Li, Ren Yuxi, Xuefeng Xiao, Yitong Wang, Rui Wang, Shilei Wen, Xin Pan, Xingang Wang. The paper proposes FreeSeg, a generic framework for unified, universal, and open-vocabulary image segmentation. Existing methods use specialized architectures or…
-
CVPR 2023 – Equiangular Basis Vectors
In this episode we discuss Equiangular Basis Vectors by Yang Shen, Xuhao Sun, Xiu-Shen Wei. This paper proposes a new approach for classification tasks, called Equiangular Basis Vectors (EBVs), which generate normalized vector embeddings as “predefined classifiers”. These vectors are required to be equal in status and as orthogonal as possible. By minimizing the spherical…
-
CVPR 2023 – Learning to Retain while Acquiring: Combating Distribution-Shift in Adversarial Data-Free Knowledge Distillation
In this episode we discuss Learning to Retain while Acquiring: Combating Distribution-Shift in Adversarial Data-Free Knowledge Distillation by Gaurav Patel, Konda Reddy Mopuri, Qiang Qiu. The paper introduces a framework called Learning to Retain while Acquiring, which addresses the non-stationary distribution of pseudo-samples in the Adversarial Data-Free Knowledge Distillation (DFKD) framework. The proposed…
-
CVPR 2023 – Dynamic Graph Enhanced Contrastive Learning for Chest X-ray Report Generation
In this episode we discuss Dynamic Graph Enhanced Contrastive Learning for Chest X-ray Report Generation by Mingjie Li, Bingqian Lin, Zicong Chen, Haokun Lin, Xiaodan Liang, Xiaojun Chang. The paper proposes a knowledge graph with dynamic structure and nodes to enhance automatic radiology reporting. Existing models that use medical knowledge graphs have limited effectiveness because…
-
CVPR 2023 – NeuDA: Neural Deformable Anchor for High-Fidelity Implicit Surface Reconstruction
In this episode we discuss NeuDA: Neural Deformable Anchor for High-Fidelity Implicit Surface Reconstruction by Bowen Cai, Jinchi Huang, Rongfei Jia, Chengfei Lv, Huan Fu. The paper proposes a new approach called Neural Deformable Anchor (NeuDA) for implicit surface reconstruction using differentiable ray casting. Unlike previous methods, NeuDA leverages hierarchical voxel grids to capture sharp…
-
CVPR 2023 – NeuFace: Realistic 3D Neural Face Rendering from Multi-view Images
In this episode we discuss NeuFace: Realistic 3D Neural Face Rendering from Multi-view Images by Mingwu Zheng, Haiyu Zhang, Hongyu Yang, Di Huang. The paper presents a new 3D face rendering model, called NeuFace, that uses neural rendering techniques to learn accurate and physically meaningful underlying 3D representations. It incorporates neural BRDFs (bidirectional reflectance distribution…
-
CVPR 2023 – SAP-DETR: Bridging the Gap Between Salient Points and Queries-Based Transformer Detector for Fast Model Convergency
In this episode we discuss SAP-DETR: Bridging the Gap Between Salient Points and Queries-Based Transformer Detector for Fast Model Convergency by Yang Liu, Yao Zhang, Yixin Wang, Yang Zhang, Jiang Tian, Zhongchao Shi, Jianping Fan, Zhiqiang He. The paper proposes SAlient Point-based DETR (SAP-DETR), a new approach to object detection that treats it as a…