• Ep. 245 - Part 2 - June 11, 2024

  • 2024/06/13
  • Duration: 37 min
  • Podcast

  • Summary

  • ArXiv Computer Vision research for Tuesday, June 11, 2024.


    00:21: NeRSP: Neural 3D Reconstruction for Reflective Objects with Sparse Polarized Images

    01:27: Beyond Bare Queries: Open-Vocabulary Object Retrieval with 3D Scene Graph

    03:14: T2S-GPT: Dynamic Vector Quantization for Autoregressive Sign Language Production from Text

    04:45: Benchmarking and Boosting Radiology Report Generation for 3D High-Resolution Medical Images

    06:23: FaceGPT: Self-supervised Learning to Chat about 3D Human Faces

    07:52: RecMoDiffuse: Recurrent Flow Diffusion for Human Motion Generation

    09:15: VoxNeuS: Enhancing Voxel-Based Neural Surface Reconstruction via Gradient Interpolation

    10:51: RAD: A Comprehensive Dataset for Benchmarking the Robustness of Image Anomaly Detection

    12:05: RGB-Sonar Tracking Benchmark and Spatial Cross-Attention Transformer Tracker

    13:52: MeMSVD: Long-Range Temporal Structure Capturing Using Incremental SVD

    15:15: Can Foundation Models Reliably Identify Spatial Hazards? A Case Study on Curb Segmentation

    16:56: MS-Diffusion: Multi-subject Zero-shot Image Personalization with Layout Guidance

    18:20: Open-World Human-Object Interaction Detection via Multi-modal Prompts

    20:03: Which Country Is This? Automatic Country Ranking of Street View Photos

    20:44: Needle In A Multimodal Haystack

    22:10: Is One GPU Enough? Pushing Image Generation at Higher-Resolutions with Foundation Models

    23:24: Towards Realistic Data Generation for Real-World Super-Resolution

    24:37: Unsupervised Object Detection with Theoretical Guarantees

    25:43: Embedded Graph Convolutional Networks for Real-Time Event Data Processing on SoC FPGAs

    27:45: A Framework for Efficient Model Evaluation through Stratification, Sampling, and Estimation

    29:01: Cinematic Gaussians: Real-Time HDR Radiance Fields with Depth of Field

    30:24: Minimizing Energy Costs in Deep Learning Model Training: The Gaussian Sampling Approach

    32:09: Global-Regularized Neighborhood Regression for Efficient Zero-Shot Texture Anomaly Detection

    33:52: Deep Implicit Optimization for Robust and Flexible Image Registration

    35:28: Visual Representation Learning with Stochastic Frame Prediction

