VCode: a Multimodal Coding Benchmark with SVG as Symbolic Visual Representation
Kevin QH. Lin†, Yuhao Zheng†, Hangyu Ran†, Dantong Zhu, Dongxing Mao, Linjie Li, Philip Torr, Alex JP. Wang.
Preprint, 2025
[project] [paper] [code] [demo]
Paper2Video: Automatic Video Generation from Scientific Papers
Zeyu Zhu†, Kevin QH. Lin†, Mike Z. Shou.
Preprint, 2025
[project] [paper] [code] [dataset]
1.4K GitHub stars.
Code2Video: A Code-centric Paradigm for Educational Video Generation
Yanzhe Chen†, Kevin QH. Lin†, Mike Z. Shou.
Preprint, 2025
[project] [paper] [code] [dataset] [twitter]
800+ GitHub stars.
Paper2Poster: Towards Multimodal Poster Automation from Scientific Papers
Wei Pang†, Kevin QH. Lin†, Xiangru Jian†, Xi He, Philip Torr.
NeurIPS D&B, 2025
ICML MAS workshop, 2025. Oral
[paper] [code] [project] [datasets] [twitter] [demo]
2.8K GitHub stars.
Think or Not? Selective Reasoning via Reinforcement Learning for Vision-Language Models
Jiaqi Wang†, Kevin QH. Lin†, James Cheng, Mike Z. Shou.
NeurIPS, 2025
[paper] [code] [huggingface]
VideoMind: A Chain-of-LoRA Agent for Long Video Reasoning
Ye Liu†, Kevin QH. Lin†, Chang Wen Chen, Mike Z. Shou.
Preprint, 2025
[paper] [code] [dataset] [project] [demo]
ShowUI: One Vision-Language-Action Model for GUI Visual Agent
Kevin QH. Lin, Linjie Li, Difei Gao, Zhengyuan Yang, Shiwei Wu, Zechen Bai, Stan WX. Lei, Lijuan Wang, Mike Z. Shou.
CVPR, 2025
NeurIPS OWA workshop, 2024. Oral
[paper] [code] [huggingface] [dataset] [demo]
Outstanding Paper Award, NeurIPS Open-World Agents Workshop 2024.
The model has been downloaded more than 240,000 times. 1.5K GitHub stars.
VLog: Video-Language Models by Generative Retrieval of Narration Vocabulary
Kevin QH. Lin, Mike Z. Shou.
CVPR, 2025
[paper] [code]
VideoGUI: A Benchmark for GUI Automation from Instructional Videos
Kevin QH. Lin, Linjie Li, Difei Gao, Qinchen Wu, Mingyi Yan, Zhengyuan Yang, Lijuan Wang, Mike Z. Shou.
NeurIPS D&B, 2024. Spotlight
[paper] [code] [project]
Learning Video Context as Interleaved Multimodal Sequences
Kevin QH. Lin, Pengchuan Zhang, Difei Gao, Xide Xia, Joya Chen, Ziteng Gao, Jinheng Xie, Xuhong Xiao, Mike Z. Shou.
ECCV, 2024
[paper] [code]
UniVTG: Towards Unified Video-Language Temporal Grounding
Kevin QH. Lin, Pengchuan Zhang, Joya Chen, Shraman Pramanick, Difei Gao, Alex JP. Wang, Rui Yan, Mike Z. Shou.
ICCV, 2023
[paper] [code] [demo]
Egocentric Video-Language Pretraining
Kevin QH. Lin, Alex JP. Wang, M. Soldan, M. Wray, R. Yan, Eric ZC. Xu, D. Gao, R. Tu, W. Zhao, W. Kong, C. Cai, H. Wang, D. Damen, B. Ghanem, W. Liu, Mike Z. Shou.
NeurIPS, 2022. Spotlight (1.7%)
[paper] [code] [project] [poster] [media]
EgoVis Distinguished Paper Award & PREMIA Best Student Paper Award, 2023.
Double champion of the Ego4D & Epic-Kitchens challenges at CVPR 2022.