- Zhijing Jin (she/her) is an Assistant Professor at the University of Toronto and Research Scientist at the Max Planck Institute. She serves as a CIFAR AI Chair, an ELLIS advisor, and a faculty member at the Vector Institute and the Schwartz Reisman Institute. She co-chairs the ACL Ethics Committee and the ACL Year-Round Mentorship. Her research focuses on Causal Reasoning with LLMs and AI Safety in Multi-Agent LLMs. She has published over 80 papers and has received the ELLIS PhD Award, three Rising Star awards, and two Best Paper awards at NeurIPS 2024 Workshops.
+ Ryan is a computer scientist and machine learning researcher with a background in reinforcement learning and foundation models. He has worked as a Research Engineer at Google DeepMind for the past decade and is also a PhD student at the University of Toronto, advised by Zhijing Jin. At Google DeepMind, he works in the Concordia group led by Joel Leibo. At a high level, his current research focuses on multi-agent systems, LLMs, and social learning. Within this context, he is interested in memory mechanisms, agent theory of mind, collective decision making, and simulating political systems.
+
- While progress has been made in evaluating single-agent LLMs for persona modeling, the behavior of these models within multi-agent groups remains underexplored. This presentation outlines a research series dedicated to closing this gap by testing LLM cooperation through autonomous social simulations. Specifically, we ask: what happens when personas are tasked to interact and cooperate?
-
- To answer this, we introduce a suite of simulation environments (GovSim, MoralSim, and SanctSim) designed to stress-test persona interaction. These environments simulate high-stakes scenarios, such as the tragedy of the commons and ethical trade-offs, allowing us to investigate whether simulated societies can autonomously negotiate social order and how personas with differing ethical constraints navigate social dilemmas.
-
- Our findings highlight implications for persona modeling. We show that agents exhibit a functional "theory of mind," capable of inferring the identities of their interlocutors and strategically adapting their behavior, sometimes exploiting specific model vulnerabilities. Furthermore, we discuss a counterintuitive phenomenon where advanced reasoning capabilities lead to exploitative behaviors that humans typically avoid, highlighting a significant misalignment between agent optimization and human social norms.
+ Governing common-pool resources requires agents to develop enduring strategies of cooperation and self-governance to avoid collective failure. While foundation models have shown potential for cooperation in these settings, existing multi-agent research offers little insight into whether structured leadership and election mechanisms can improve collective decision making. The absence of this organizational feature, ubiquitous in human societies, is a significant shortcoming of current methods. In this work, we directly address whether leadership and elections can improve social welfare and cooperation through multi-agent simulation with LLMs. We present a new framework that simulates leadership through elected personas and candidate-driven agendas, and we carry out an empirical study of LLMs under controlled governance conditions. Our experiments demonstrate that structured leadership can improve social welfare scores by 55.4% and survival time by 128.6% across a range of high-performing LLMs. By constructing an agent social graph, we compute centrality metrics to assess the social influence of leader personas, and we analyze rhetorical and cooperative tendencies revealed by sentiment analysis of leader utterances. This work lays the foundation for developing prosocial, self-governing multi-agent systems capable of navigating complex resource dilemmas.