Shaofeng Yin

Undergraduate Student
Tsinghua University
ysf22 (at) mails.tsinghua.edu.cn


About Me

I am currently a senior undergraduate at Tsinghua University, pursuing a degree in Information and Computing Science.

In the summer of 2023, I joined THUML, working with Prof. Mingsheng Long. In the summer of 2024, I began a research internship at the LeCAR Lab under the guidance of Prof. Guanya Shi. I am now visiting the Stanford Vision and Learning Lab, advised by Prof. Jiajun Wu and Prof. C. Karen Liu.

My current research interests lie in deep learning for generalizable world modeling, humanoid control, and robotic manipulation.

I am applying for PhD positions for Fall 2026. Please drop me an email if you are interested in my research or would just like to chat!

News

Education

Experience

  1. Student Intern
Feb. 2025 - Present

  2. Student Intern
    July - Aug. 2024

  3. Research Assistant
Sept. 2023 - Present

Research

  1. arXiv
    Jialong Wu, Shaofeng Yin, Ningya Feng, Mingsheng Long
    arXiv preprint, 2025.
    Research question: How can we train world models to better serve downstream tasks, beyond mere transition modeling?
    Key features: RLVR training; strong improvements on language and video world models across text games, web navigation, and robot manipulation.

  2. ICML
    Shaofeng Yin*, Jialong Wu*, Siqiao Huang, Xingjian Su, Xu He, Jianye Hao, Mingsheng Long
    International Conference on Machine Learning (ICML), 2025.
    Research question: How can we build generalizable sensor-based world models across diverse environments?
    Key features: pretrained proprioceptive world model; single model for all robots; strong transferability; significant improvements in MPC and OPE on locomotion tasks.

  3. NeurIPS
    Jialong Wu*, Shaofeng Yin*, Ningya Feng, Xu He, Dong Li, Jianye Hao, Mingsheng Long
    Conference on Neural Information Processing Systems (NeurIPS), 2024.
    Research question: How can we leverage the advancements in scalable video generative models for developing interactive visual world models?
    Key features: pretrained visual world model; unified model for diverse robot arms; strong transferability; high efficiency; significant improvements across MPC and MBRL on manipulation tasks.

Projects

  1. A five-stage pipelined RISC-V 32-bit processor featuring interrupt and exception handling, user-mode virtual address translation via page tables, and performance enhancements through an I-Cache, a write-back D-Cache, and a TLB, with additional support for peripherals such as VGA and Flash.

  2. A compact renderer featuring Next Event Estimation and supporting glossy materials (Disney Principled BRDF), texture mapping, normal mapping, motion blur, normal interpolation, depth of field, and mesh rendering (accelerated with a BVH).

Honors & Awards