2024.09.15 Our framework OmniSafe has been accepted by JMLR 2024 (the most popular open-source Safe RL framework).
2023.10.30 Big News! We released AI Alignment: A Comprehensive Survey.
2023.09.27 Our benchmark Safety-Gymnasium has been accepted by NeurIPS 2023 (Datasets and Benchmarks Track) (the most popular open-source Safe RL benchmark).
As a newly enrolled Ph.D. student, I am striving to earn such awards.
Sequence to Sequence Reward Modeling: Improving RLHF by Language Feedback [GitHub]
Jiayi Zhou*, Jiaming Ji*, Borong Zhang*, Juntao Dai, and Yaodong Yang.
arXiv preprint, 2024.
OmniSafe: An Infrastructure for Accelerating Safe Reinforcement Learning Research [GitHub]
Jiaming Ji*, Jiayi Zhou*, Borong Zhang*, Juntao Dai, Xuehai Pan, Ruiyang Sun, Weidong Huang, Yiran Geng, Mickel Liu, and Yaodong Yang.
Journal of Machine Learning Research (JMLR), 2024.
Safety-Gymnasium: A Unified Safe Reinforcement Learning Benchmark [Website] [GitHub]
Jiaming Ji*, Borong Zhang*, Jiayi Zhou*, Xuehai Pan, Weidong Huang, Ruiyang Sun, Yiran Geng, Yifan Zhong, Juntao Dai, and Yaodong Yang.
Advances in Neural Information Processing Systems (NeurIPS), 2023.
AI Alignment: A Comprehensive Survey [Website]
Jiaming Ji*, Tianyi Qiu, Boyuan Chen, Borong Zhang, Hantao Lou, Kaile Wang, Yawen Duan, Zhonghao He, Jiayi Zhou,
Zhaowei Zhang, Fanzhi Zeng, Kwan Yee Ng, Juntao Dai, Xuehai Pan, Aidan O'Gara, Yingshan Lei, Hua Xu, Brian Tse, Jie Fu, Stephen McAleer,
Yaodong Yang, Yizhou Wang, Song-Chun Zhu, Yike Guo, and Wen Gao.
arXiv preprint, 2024.
Language Models Resist Alignment [GitHub]
Jiaming Ji*, Kaile Wang*, Tianyi Qiu*, Boyuan Chen*, Jiayi Zhou, Changye Li, Hantao Lou, and Yaodong Yang.
arXiv preprint, 2024.
Reward Generalization in RLHF: A Topological Perspective
Tianyi Qiu*, Fanzhi Zeng*, Jiaming Ji*, Dong Yan*, Kaile Wang, Jiayi Zhou, Yang Han, Josef Dai, Xuehai Pan, and Yaodong Yang.
arXiv preprint, 2024.
Reviewer for NeurIPS and ICLR.