Jin Wang
Multimodal Understanding
Fast-dDrive: Efficient Block-Diffusion VLM for Autonomous Driving
Kewei Zhang, Jin Wang, Sensen Gao, Chengyue Wu, Yulong Cao, Songyang Han, Boris Ivanovic, Langechuan Liu, Marco Pavone, Song Han, Daquan Zhou, Enze Xie
Fast-dVLM: Efficient Block-Diffusion VLM via Direct Conversion from Autoregressive VLM
Chengyue Wu, Shiyi Lan, Yonggan Fu, Sensen Gao, Jin Wang, Jincheng Yu, Jose M. Alvarez, Pavlo Molchanov, Ping Luo, Song Han, Ligeng Zhu, Enze Xie
Cross-modal Identity Mapping: Minimizing Information Loss in Modality Conversion via Reinforcement Learning
Haonan Jia, Shichao Dong, Xin Dong, Zenghui Sun, Jin Wang, Jinsong Lan, Xiaoyong Zhu, Bo Zheng, Kaifu Zhang
INTER: Mitigating Hallucination in Large Vision-Language Models by Interaction Guidance Sampling
Hallucinations in large vision-language models (LVLMs) pose significant challenges for real-world applications, as LVLMs may generate …
Xin Dong, Shichao Dong, Jin Wang, Jing Huang, Li Zhou, Zenghui Sun, Lihua Jing, Jingsong Lan, Xiaoyong Zhu, Bo Zheng
Diagnosing the Compositional Knowledge of Vision Language Models from a Game-Theoretic View
Compositional reasoning capabilities are usually considered fundamental skills that characterize human perception. Recent studies show …
Jin Wang, Shichao Dong, Yapeng Zhu, Kelu Yao, Weidong Zhao, Chao Li, Ping Luo