Hi! I am Yilun Liu, a Ph.D. student in Computer Science at LMU Munich, supervised by Prof. Dr. Volker Tresp. I study language models, with a focus on their mechanistic interpretability, architectural design, and post-training applications.
My research interests lie in understanding the capabilities and limitations of language models, and in using that understanding to develop new methods and solutions.
Prior to this, I obtained my M.Sc. in Data Engineering and Analytics at the Technical University of Munich, and my B.E. in Computer Science and Technology at Xi'an Jiaotong University.
We are hiring!
We are looking for talented students to join our ambitious projects on recursive self-improvement of LLM agents and on the mechanistic interpretability of such systems. We also regularly offer thesis and guided research opportunities at TRESP Lab. Please feel free to contact me if you are interested.
Selected Publications
arXiv preprint arXiv:2606.21645
- Evaluation protocol of binomial ordering preference for LLMs, with a dataset of 600 binomial pairs across 8 languages
- LLMs behaviorally align with the empirically preferred direction more reliably than with the preference strength; that strength can be located and manipulated in their representations
arXiv preprint arXiv:2604.18519, Accepted at ACL 2026
- SIREN, a plug-and-play component that harnesses LLM internal representations for harmfulness detection
- Outperforming dedicated safety guardrails in performance, generalization, and efficiency
arXiv preprint arXiv:2604.00801
- A Routing-Free MoE architecture, eliminating routers, Softmax, TopK, and hard-coded load balancing
- A unified, adaptive load-balancing framework that jointly optimizes token- and expert-balancing
- Experiments and analyses demonstrating the improvements of Routing-Free MoE over baselines
Findings of the Association for Computational Linguistics: EACL 2026, 4439–4457
- Dynamics between memory vectors in experts and expert vectors in routers when applying PEFT to MoE LLMs
- A unified framework for integrating PEFT with MoE LLMs
- PERFT as a family of adaptation strategies, with extensive experiments validating effectiveness and scalability
Findings of the Association for Computational Linguistics: ACL 2024, 4666–4682
- A lightweight plug-and-play text classification framework using LLMs' internal representations
- Consistently outperforming conventional methods in performance, efficiency, and interpretability
Educational Background
10.2025 – present
Ph.D. Student (in progress)
10.2021 –
M.Sc. Data Engineering and Analytics
Department of Informatics, Technical University of Munich
Munich, Bavaria, Germany
09.2017 – 07.2021
B.E. Computer Science and Technology
Faculty of Electronic and Information Engineering, Xi'an Jiaotong University
Xi'an, Shaanxi, China
09.2016 – 07.2017
Pre-university Education, Honors Youth Program
Qian Xuesen Honors College, XJTU & Tianjin Nankai High School
Xi'an, Shaanxi; Tianjin, China
Academic Service
Reviewer
ICLR (2025, 2026) · COLM (2025, 2026) · ACL Rolling Review (multiple cycles) · IC2S2 (2025)