Jiayi Zhou (周嘉懿)

Email: gaiejj@outlook.com

 

Hello! I’m a first-year PhD student at the Institute of Artificial Intelligence, Peking University, advised by Prof. Yaodong Yang. My past research has focused on safe reinforcement learning, with an emphasis on algorithm libraries and environment construction. I am now broadening my focus to AI Safety and Alignment. My recent research interests include:

  • AI Alignment: The recent significant progress of AI systems, exemplified by Large Language Models (LLMs), stems not from a deeper understanding of algorithms or models but merely from scaling up. This may cause AI systems to deviate from human intentions and values, posing considerable safety risks. I am actively exploring how to improve the trustworthiness, transparency, and safety of AI systems from three aspects: architecture, algorithms, and evaluation.
  • Efficient Alignment with Rich Feedback: I am interested in designing more informative reward functions to improve alignment efficiency. In my previous research on Safe RL, I focused on designing optimization methods that allow agents to balance multi-dimensional reward functions, such as those for utility and safety (see the sketch after this list). With the current popularity of RLHF, I am now paying attention to how reward models can provide richer language feedback beyond scalar scores. I plan to expand my research into broader areas, such as large multimodal models.
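
As a minimal sketch of what “balancing utility and safety” means here, consider the standard constrained MDP objective commonly used in Safe RL; the symbols $r$, $c$, and $d$ below are the usual reward, cost, and safety budget, and this is an illustration rather than any specific algorithm of mine:

$$\max_{\pi} \; \mathbb{E}_{\tau \sim \pi}\!\left[\sum_{t} \gamma^{t} r(s_t, a_t)\right] \quad \text{s.t.} \quad \mathbb{E}_{\tau \sim \pi}\!\left[\sum_{t} \gamma^{t} c(s_t, a_t)\right] \le d$$

Lagrangian or trust-region methods then trade off the two expectations, which is exactly the multi-dimensional balancing problem described above.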


Google Scholar | GitHub

News

Awards

Publications

Services