AI Breakdown

CVPR 2023, highlight paper – F2-NeRF: Fast Neural Radiance Field Training with Free Camera Trajectories

In this episode we discuss F2-NeRF: Fast Neural Radiance Field Training with Free Camera Trajectories by Peng Wang, Yuan Liu, Zhaoxi Chen, Lingjie Liu, Ziwei Liu, Taku Komura, Christian Theobalt, Wenping Wang. The paper presents a new grid-based NeRF called F2-NeRF which allows arbitrary input camera trajectories and is faster to train. Existing fast grid-based…

May 12, 2023
CVPR 2023, highlight paper – Learning Human-to-Robot Handovers from Point Clouds

In this episode we discuss Learning Human-to-Robot Handovers from Point Clouds by Sammy Christen, Wei Yang, Claudia Pérez-D’Arpino, Otmar Hilliges, Dieter Fox, Yu-Wei Chao. The paper proposes the first framework to teach robots how to perform vision-based human-to-robot handovers, a crucial task for human-robot interaction. The authors leverage recent advances in realistic simulations for handovers…

May 12, 2023
CVPR 2023, highlight paper – Rodin: A Generative Model for Sculpting 3D Digital Avatars Using Diffusion

In this episode we discuss Rodin: A Generative Model for Sculpting 3D Digital Avatars Using Diffusion by Tengfei Wang, Bo Zhang, Ting Zhang, Shuyang Gu, Jianmin Bao, Tadas Baltrusaitis, Jingjing Shen, Dong Chen, Fang Wen, Qifeng Chen, Baining Guo. The paper proposes a 3D generative model called “Rodin” that uses diffusion models to create 3D…

May 12, 2023
CVPR 2023, highlight paper – HairStep: Transfer Synthetic to Real Using Strand and Depth Maps for Single-View 3D Hair Modeling

In this episode we discuss HairStep: Transfer Synthetic to Real Using Strand and Depth Maps for Single-View 3D Hair Modeling by Yujian Zheng, Zirong Jin, Moran Li, Haibin Huang, Chongyang Ma, Shuguang Cui, Xiaoguang Han. The paper discusses the problem of learning-based single-view 3D hair modelling, which has difficulties in collecting real image and 3D…

May 12, 2023
CVPR 2023, highlight paper – LaserMix for Semi-Supervised LiDAR Semantic Segmentation

In this episode we discuss LaserMix for Semi-Supervised LiDAR Semantic Segmentation by Lingdong Kong, Jiawei Ren, Liang Pan, Ziwei Liu. The paper proposes a semi-supervised learning framework, called LaserMix, for LiDAR semantic segmentation, leveraging the strong spatial cues of LiDAR point clouds to better exploit unlabeled data. The framework mixes laser beams from different LiDAR…

May 12, 2023
CVPR 2023, highlight paper – Neural Volumetric Memory for Visual Locomotion Control

In this episode we discuss Neural Volumetric Memory for Visual Locomotion Control by Ruihan Yang, Ge Yang, Xiaolong Wang. The paper discusses the use of legged robots for autonomous locomotion on challenging terrains using a forward-facing depth camera. Due to the partial observability of the terrain, the robot has to rely on past observations to…

May 12, 2023
CVPR 2023, highlight paper – Normal-guided Garment UV Prediction for Human Re-texturing

In this episode we discuss Normal-guided Garment UV Prediction for Human Re-texturing by Yasamin Jafarian, Tuanfeng Y. Wang, Duygu Ceylan, Jimei Yang, Nathan Carr, Yi Zhou, Hyun Soo Park. The paper presents a method to edit dressed human images and videos without the need for 3D reconstruction of dynamic clothing. The approach estimates a geometry…

May 12, 2023
CVPR 2023, highlight paper – HyperReel: High-Fidelity 6-DoF Video with Ray-Conditioned Sampling

In this episode we discuss HyperReel: High-Fidelity 6-DoF Video with Ray-Conditioned Sampling by Benjamin Attal, Jia-Bin Huang, Christian Richardt, Michael Zollhoefer, Johannes Kopf, Matthew O’Toole, Changil Kim. The paper discusses the challenges of creating a memory-efficient and high-quality 6-DoF (Six-Degrees-of-Freedom) video representation for dynamic scenes. Existing methods fail to achieve real-time rendering, high-quality rendering, and…

May 12, 2023
CVPR 2023, highlight paper – Learning Customized Visual Models with Retrieval-Augmented Knowledge

In this episode we discuss Learning Customized Visual Models with Retrieval-Augmented Knowledge by Haotian Liu, Kilho Son, Jianwei Yang, Ce Liu, Jianfeng Gao, Yong Jae Lee, Chunyuan Li. The paper proposes a framework called REACT (REtrieval-Augmented CusTomization) to build customized visual models for specific domains. Instead of using expensive pre-training, REACT retrieves relevant image-text pairs…

May 12, 2023
CVPR 2023, highlight paper – SplineCam: Exact Visualization and Characterization of Deep Network Geometry and Decision Boundaries

In this episode we discuss SplineCam: Exact Visualization and Characterization of Deep Network Geometry and Decision Boundaries by Ahmed Imtiaz Humayun, Randall Balestriero, Guha Balakrishnan, Richard Baraniuk. This paper presents a new method called SplineCam that enables exact computation of the geometry of a deep network’s (DN) mapping, including its decision boundary, without resorting to…

May 12, 2023
CVPR 2023, highlight paper – Nighttime Smartphone Reflective Flare Removal Using Optical Center Symmetry Prior

In this episode we discuss Nighttime Smartphone Reflective Flare Removal Using Optical Center Symmetry Prior by Yuekun Dai, Yihang Luo, Shangchen Zhou, Chongyi Li, Chen Change Loy. The paper proposes a method to address the problem of reflective flare in photos caused by light reflecting inside lenses and creating bright spots or a “ghosting effect”.…

May 12, 2023
CVPR 2023, highlight paper – Beyond mAP: Towards better evaluation of instance segmentation

In this episode we discuss Beyond mAP: Towards better evaluation of instance segmentation by Rohit Jena, Lukas Zhornyak, Nehal Doiphode, Pratik Chaudhari, Vivek Buch, James Gee, Jianbo Shi. The paper proposes new measures to account for duplicate predictions in instance segmentation, which the commonly used Average Precision metric does not penalize. The authors suggest a…

May 12, 2023
CVPR 2023, highlight paper – Real-Time Evaluation in Online Continual Learning: A New Hope

In this episode we discuss Real-Time Evaluation in Online Continual Learning: A New Hope by Yasir Ghunaim, Adel Bibi, Kumail Alhamoud, Motasem Alfarra, Hasan Abed Al Kader Hammoud, Ameya Prabhu, Philip H. S. Torr, Bernard Ghanem. The paper proposes a practical real-time evaluation of Continual Learning (CL) methods, which takes into account computational costs and…

May 12, 2023
CVPR 2023, highlight paper – Object Pose Estimation with Statistical Guarantees: Conformal Keypoint Detection and Geometric Uncertainty Propagation

In this episode we discuss Object Pose Estimation with Statistical Guarantees: Conformal Keypoint Detection and Geometric Uncertainty Propagation by Heng Yang, Marco Pavone. The paper proposes a two-stage object pose estimation method that uses conformal keypoint detection and geometric uncertainty propagation to endow an estimation with provable and computable worst-case error bounds. Conformal keypoint detection…

May 12, 2023
CVPR 2023 – Normalizing Flow based Feature Synthesis for Outlier-Aware Object Detection

In this episode we discuss Normalizing Flow based Feature Synthesis for Outlier-Aware Object Detection by Nishant Kumar, Siniša Šegvić, Abouzar Eslami, Stefan Gumhold. The paper proposes a novel outlier-aware object detection framework that improves on existing approaches by learning the joint data distribution of all inlier classes with an invertible normalizing flow. This ensures that…

May 12, 2023
CVPR 2023 – Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models

In this episode we discuss Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models by Jiarui Xu, Sifei Liu, Arash Vahdat, Wonmin Byeon, Xiaolong Wang, Shalini De Mello. The paper presents ODISE, a model that unifies pre-trained text-image diffusion and discriminative models to perform open-vocabulary panoptic segmentation. The approach leverages the frozen internal representations of both models…

May 12, 2023
CVPR 2023 – Egocentric Video Task Translation

In this episode we discuss Egocentric Video Task Translation by Zihui Xue, Yale Song, Kristen Grauman, Lorenzo Torresani. The paper proposes a more unified approach to video understanding tasks, specifically in the context of wearable cameras. The authors argue that the egocentric perspective of a person presents an interconnected web of tasks, such as object…

May 12, 2023
CVPR 2023 – Learning to Name Classes for Vision and Language Models

In this episode we discuss Learning to Name Classes for Vision and Language Models by Sarah Parisot, Yongxin Yang, Steven McDonagh. The paper proposes a solution to two challenges that large-scale vision and language models still face despite their impressive zero-shot recognition performance. These challenges include sensitivity to the handcrafted class names that define queries and difficulty in…

May 12, 2023
CVPR 2023 – Model-Agnostic Gender Debiased Image Captioning

In this episode we discuss Model-Agnostic Gender Debiased Image Captioning by Yusuke Hirota, Yuta Nakashima, Noa Garcia. The paper discusses the issue of gender bias in image captioning models and proposes a framework named LIBRA to mitigate such bias. Prior attempts to address this problem by focusing on people led models to generate gender-stereotypical words, and it affected…

May 12, 2023
CVPR 2023 – Magic3D: High-Resolution Text-to-3D Content Creation

In this episode we discuss Magic3D: High-Resolution Text-to-3D Content Creation by Chen-Hsuan Lin, Jun Gao, Luming Tang, Towaki Takikawa, Xiaohui Zeng, Xun Huang, Karsten Kreis, Sanja Fidler, Ming-Yu Liu, Tsung-Yi Lin. The paper introduces a two-stage optimization framework called Magic3D to address the slow optimization and low-resolution image space supervision limitations of the pre-trained text-to-image…

May 12, 2023
CVPR 2023 – Self-Correctable and Adaptable Inference for Generalizable Human Pose Estimation

In this episode we discuss Self-Correctable and Adaptable Inference for Generalizable Human Pose Estimation by Zhehan Kan, Shuoshuo Chen, Ce Zhang, Yushun Tang, Zhihai He. The paper introduces a self-correctable and adaptable inference (SCAI) method to address the generalization challenge of network prediction. Using human pose estimation as an example, the authors learn a correction network…

May 12, 2023
CVPR 2023 – Adaptive Spot-Guided Transformer for Consistent Local Feature Matching

In this episode we discuss Adaptive Spot-Guided Transformer for Consistent Local Feature Matching by Jiahuan Yu, Jiahao Chang, Jianfeng He, Tianzhu Zhang, Feng Wu. The paper proposes Adaptive Spot-Guided Transformer (ASTR), a new approach for local feature matching that models both local consistency and scale variations in a coarse-to-fine architecture. ASTR uses a spot-guided aggregation…

May 11, 2023
CVPR 2023 – Hierarchical Dense Correlation Distillation for Few-Shot Segmentation

In this episode we discuss Hierarchical Dense Correlation Distillation for Few-Shot Segmentation by Bohao Peng, Zhuotao Tian, Xiaoyang Wu, Chenyao Wang, Shu Liu, Jingyong Su, Jiaya Jia. The paper proposes a Hierarchically Decoupled Matching Network (HDMNet) for few-shot semantic segmentation (FSS), where a class-agnostic model segments unseen classes with only a few annotations. The method…

May 11, 2023
CVPR 2023 – Generating Features with Increased Crop-related Diversity for Few-Shot Object Detection

In this episode we discuss Generating Features with Increased Crop-related Diversity for Few-Shot Object Detection by Jingyi Xu, Hieu Le, Dimitris Samaras. The paper proposes a novel data generation model, based on a variational autoencoder (VAE), for training robust object detectors in few-shot settings. The model is designed to generate crops with increased crop-related diversity…

May 11, 2023
CVPR 2023 – JAWS: Just A Wild Shot for Cinematic Transfer in Neural Radiance Fields

In this episode we discuss JAWS: Just A Wild Shot for Cinematic Transfer in Neural Radiance Fields by Xi Wang, Robin Courant, Jinglei Shi, Eric Marchand, Marc Christie. The paper introduces JAWS, an optimization-driven approach to transfer visual cinematic features from a reference video clip to a newly generated clip, using implicit neural representations. The…

May 11, 2023
