Zizhao Wang

I am a PhD student in ECE at the University of Texas at Austin, advised by Prof. Peter Stone. My research focuses on reinforcement learning, LLM (agents), world models, and causal reasoning.

Previously, I completed my M.S. in CS at Columbia University, advised by Prof. Peter Allen and Prof. Itsik Pe’er, and my undergraduate studies at the University of Michigan - Ann Arbor.

I also spent time as research intern at Google, Microsoft, and Honda Research Institute.

Email / Google Scholar / LinkedIn / Twitter / CV

Selected Works

	Adversarial Reinforcement Learning for Large Language Model Agent Safety Zizhao Wang, Dingcheng Li, Vaishakh Keshava, Phillip Wallis, Ananth Balashankar, Peter Stone, Lukas Rutishauser, preprint*, 2025 paper / project page Improve LLM agent safety by co-training an attacker that creates malicious prompt injections and an agent that learns to defend against them.
	Dyn-O: Building Structured World Models with Object-Centric Representations Zizhao Wang, Kaixin Wang, Li Zhao, Peter Stone, Jiang Bian, Neurips*, 2025 paper Build structured world models with object-centric representations.
	SkiLD: Unsupervised Skill Discovery Guided by Factor Interactions Zizhao Wang, Jiaheng Hu, Caleb Chuck, Stephen Chen, Roberto Martín-Martín, Amy Zhang, Scott Niekum, Peter Stone Neurips, 2024 paper / code / project page A unsupervised skill discovery method that induces diverse interactions* between state factors, which are often more valuable for solving downstream tasks.
	CaMP: Causal Motion Predictor for Robust Trajectory Forecasting Zizhao Wang, Chen Tang, Aolin Xu, Enna Sachdeva, Peter Stone, Teruhisa Misu preprint*, 2024 paper (to be released) Improve the robustness of motion prediction models by inferring causal relationships between agents.
	Disentangled Unsupervised Skill Discovery for Efficient Hierarchical Reinforcement Learning Jiaheng Hu, Zizhao Wang, Peter Stone, Roberto Martín-Martín Neurips, 2024 paper / code / project page A method for learning disentangled skills where each skill component only affects one factor of the state space, so skills can be efficiently reused to solve downstream tasks.
	Building Minimal and Reusable Causal State Abstractions for Reinforcement Learning Zizhao Wang, Caroline Wang, Xuesu Xiao, Yuke Zhu, Peter Stone, AAAI, 2024 (Oral) paper / slides Improve exploration by visiting states where the agent is uncertain whether (as opposed to how) entities such as the agent or objects have some influence on each other.
	ELDEN: Exploration via Local Dependencies Jiaheng Hu, Zizhao Wang, Peter Stone, Roberto Martín-Martín Neurips, 2023 paper / poster Learn the causal relationships in the dynamics and reward functions for each task to derive a minimal, task-specific abstraction.
	Learning to Correct Mistakes: Backjumping in Long-Horizon Task and Motion Planning Yoonchang Sung, Zizhao Wang, Peter Stone CoRL, 2022 paper When task and motion planning search reaches dead-ends due to an early action, we identify and backjump to the culprit action to resample it, improving the search efficiency compared to backtracking.
	Causal Dynamics Learning for Task-Independent State Abstraction Zizhao Wang, Xuesu Xiao, Zifan Xu, Yuke Zhu, Peter Stone, ICML, 2022 (Oral) paper / poster / slides / code Learn a theoretically proved causal dynamics model that removes unnecessary dependencies between state variables and the action, so it generalizes well to unseen states.
	From agile ground to aerial navigation: Learning from learned hallucination Zizhao Wang, Xuesu Xiao, Alexander J Nettekoven, Kadhiravan Umasankar, Anika Singh, Sriram Bommakanti, Ufuk Topcu, Peter Stone, IROS, 2021 paper / poster / slides Generate cheap training data for navigation by hallucinating obstacles.
	APPLE: Adaptive Planner Parameter Learning from Evaluative Feedback Zizhao Wang, Xuesu Xiao, Garrett Warnell, Peter Stone, IROS, 2021 paper / slides Learn how to dynamically adjust planner parameters using evaluative feedback from non-expert users.
	APPLI: Adaptive Planner Parameter Learning from Interventions Zizhao Wang, Xuesu Xiao, Bo Liu, Garrett Warnell, Peter Stone, ICRA, 2021 paper / slides Learn how to dynamically adjust planner parameters using interventions from non-expert users.
	Variational Objectives for Markovian Dynamics with Backward Simulation Antonio Khalil Moretti, Zizhao Wang, Luhuan Wu∗, Iddo Drori, Itsik Pe’er, ECAI, 2020 paper A novel variational inference framework for nonlinear hidden Markov models.

Teaching and Services

Organizer of Causality for Robotics Workshop at IROS 2023
Reviewer for NeurIPS, ICML, IROS, ICRA, and RA-L
Teaching assistant for Reinforcement Learning (CS 394R), Causality and Reinforcement Learning (ECE 381V), and Artificial Intelligence (CS 343).