Computer Vision and Pattern Recognition

Reinforcing Spatial Reasoning in Vision-Language Models with Interwoven Thinking and Visual Drawing

Computer Vision and Pattern Recognition
librarian
49 views
Outside Knowledge Conversational Video (OKCV) Dataset -- Dialoguing over Videos

Computer Vision and Pattern Recognition
librarian
53 views
Decoupling the Image Perception and Multimodal Reasoning for Reasoning Segmentation with Digital Twin Representations

Computer Vision and Pattern Recognition
librarian
76 views
Direct Numerical Layout Generation for 3D Indoor Scene Synthesis via Spatial Reasoning

Computer Vision and Pattern Recognition
librarian
99 views
Refer to Anything with Vision-Language Prompts

Computer Vision and Pattern Recognition
Shengcao Cao
101 views
Thinking with Generated Images

Computer Vision and Pattern Recognition
librarian
126 views
Let Androids Dream of Electric Sheep: A Human-like Image Implication Understanding and Reasoning Framework

Computer Vision and Pattern Recognition
Anastasia Kokkanen
133 views
Delving into RL for Image Generation with CoT: A Study on DPO vs. GRPO

Computer Vision and Pattern Recognition
librarian
129 views
Let Androids Dream of Electric Sheep: A Human-like Image Implication Understanding and Reasoning Framework

Computer Vision and Pattern Recognition
librarian
125 views
SpatialScore: Towards Unified Evaluation for Multimodal Spatial Understanding

Computer Vision and Pattern Recognition
Haoning Wu
128 views
VTBench: Evaluating Visual Tokenizers for Autoregressive Image Generation

Computer Vision and Pattern Recognition
librarian
130 views
Does Feasibility Matter? Understanding the Impact of Feasibility on Synthetic Training Data

Computer Vision and Pattern Recognition
librarian
132 views
MathCoder-VL: Bridging Vision and Code for Enhanced Multimodal Mathematical Reasoning

Computer Vision and Pattern Recognition
librarian
128 views
StreamBridge: Turning Your Offline Video Large Language Model into a Proactive Streaming Assistant

Computer Vision and Pattern Recognition
librarian
145 views
Flow-GRPO: Training Flow Matching Models via Online RL

Computer Vision and Pattern Recognition
Jie Liu
151 views
DEIM: DETR with Improved Matching for Fast Convergence

Computer Vision and Pattern Recognition
huang shihua
196 views
DEIM: DETR with Improved Matching for Fast Convergence

Computer Vision and Pattern Recognition
huang shihua
192 views
HelloMeme: Integrating Spatial Knitting Attentions to Embed High-Level and Fidelity-Rich Conditions in Diffusion Models

Computer Vision and Pattern Recognition
Songkey Z
234 views
Chat-Edit-3D: Interactive 3D Scene Editing via Text Prompts

Computer Vision and Pattern Recognition
shuangkang fang
260 views
Kvasir-VQA: A Text-Image Pair GI Tract Dataset

Computer Vision and Pattern Recognition
Sushant Gautam
284 views
3D modelling of survey scene from images enhanced with a multi-exposure fusion

Computer Vision and Pattern Recognition
DIEGO FRANCISCO GARCIA MOLINA
346 views
High-level camera-LiDAR fusion for 3D object detection with machine learning

Computer Vision and Pattern Recognition
DIEGO FRANCISCO GARCIA MOLINA
321 views
Complete End-To-End Low Cost Solution To a 3D Scanning System with Integrated Turntable

Computer Vision and Pattern Recognition
DIEGO FRANCISCO GARCIA MOLINA
322 views
3D Reconstruction Using a Linear Laser Scanner and a Camera

Computer Vision and Pattern Recognition
DIEGO FRANCISCO GARCIA MOLINA
320 views
3D Scanning: A Comprehensive Survey

Computer Vision and Pattern Recognition
DIEGO FRANCISCO GARCIA MOLINA
317 views
Survey on 3D face reconstruction from uncalibrated images

Computer Vision and Pattern Recognition
DIEGO FRANCISCO GARCIA MOLINA
313 views
Towards high-throughput 3D insect capture for species discovery and diagnostics

Computer Vision and Pattern Recognition
DIEGO FRANCISCO GARCIA MOLINA
301 views
Dual-Hybrid Attention Network for Specular Highlight Removal

Computer Vision and Pattern Recognition
绪行 陈
333 views
Pilgrims Face Recognition Dataset -- HUFRD

Computer Vision and Pattern Recognition
muhammedheebboo
349 views
ARCH2S: Dataset, Benchmark and Challenges for Learning Exterior Architectural Structures from Point Clouds

Computer Vision and Pattern Recognition
Daniel Cheung
335 views
ARCH2S: Dataset, Benchmark and Challenges for Learning Exterior Architectural Structures from Point Clouds

Computer Vision and Pattern Recognition
Daniel Cheung
266 views
Benchmarking Detection Transfer Learning with Vision Transformers

Computer Vision and Pattern Recognition
wa su
345 views