Submission deadline: March 1st, 2026
This will be the second installment of a special issue of the Royal Society's Philosophical Transactions A on how multiple forms of “world models” contribute to different forms of intelligence in both natural and artificial systems. This collection will be expanded to include a focus on how world-modeling systems can enable different forms of “agency,” (causal) reasoning, and planning.
We have been witnessing unprecedented advances in machine learning (ML), whose progress on the enduring problems of artificial intelligence (AI) is such that some have suggested we are about to undergo a “4th industrial revolution,” and perhaps even a period of change so rapid as to constitute a “technological singularity” [1], [2]. While some see clear reasons for optimism with respect to the potential of advancing AI/ML, others emphasize the dangers associated with such systems, including the possibility that AIs may escape our control, and could even constitute a source of “existential threat” if they compete with us as rival agents [3], [4], [5]. Multiple factors contribute to these different stances on the promise and peril of AI, ranging from beliefs about the fundamental nature of intelligence [6], [7], [8], to the potentially unique cognitive capacities of humans [9], [10], and even political stances [11].
This collection is dedicated to exploring different views on the nature of intelligence, with an emphasis on the nature(s) of agency, selfhood, causal inference, (active) (meta-)learning, planning, and the kinds of sophisticated reasoning abilities that the late, great Daniel Kahneman described as “System 2 cognition” [12], [13]. That is, the power of human (and non-human) minds appears to depend on multiple forms of intelligence. “System 1” was described as a source of fast and efficient (and largely unconscious) processing and implicit world modeling (cf. amortized inference). “System 2” was described as a slower and more deliberate/effortful mode of cognizing involving explicit (and potentially causal) reasoning. Much has been written about how large language models (LLMs) do, or do not, mimic the System 1 behaviors of human cognition, but much less is understood about how these machines conduct System 2 thinking, and how this compares to the human case. As many have suggested, reverse-engineering the ways in which evolution (and development) endows us with various System 2 capacities may be useful (and perhaps necessary) for achieving the great goal of artificial general intelligence: creating systems that can perform the full range (and more) of the functions of which human minds are capable. If we could develop such systems, the implications for science and society (and civilization) may be difficult to overstate.
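To make this contrast concrete in computational terms, consider the following purely illustrative toy sketch (the chain environment, lookup-table policy, and search depth are invented for the example and are not drawn from any system cited here): a “System 1”-style amortized policy answers in a single cheap evaluation, while a “System 2”-style procedure deliberates by explicitly rolling out a world model before committing to an action.

```python
# Toy illustration (not from any cited system): "System 1" as a fast amortized
# policy vs. "System 2" as slower, explicit model-based search.

N_STATES, GOAL = 10, 9  # a tiny chain world; reward only at the goal state

def world_model(state, action):
    """Deterministic transition model: action 0 = step left, 1 = step right."""
    next_state = min(max(state + (1 if action == 1 else -1), 0), N_STATES - 1)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward

# "System 1": an amortized policy (a lookup table standing in for a trained
# network) that maps each state directly to an action in one cheap evaluation.
amortized_policy = {s: 1 for s in range(N_STATES)}

def system1_act(state):
    return amortized_policy[state]

# "System 2": deliberate choice by exhaustively rolling out the world model to
# a fixed depth and picking the action with the best achievable total reward.
def system2_act(state, depth=6):
    def best_return(s, d):
        if d == 0:
            return 0.0
        return max(r + best_return(s2, d - 1)
                   for a in (0, 1)
                   for s2, r in [world_model(s, a)])
    values = {}
    for a in (0, 1):
        s2, r = world_model(state, a)
        values[a] = r + best_return(s2, depth - 1)
    return max(values, key=values.get)

print(system1_act(3), system2_act(3))  # both favor moving right toward the goal
```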
A deeper understanding of agency is clearly timely, as many are considering this to be the “year[/decade] of the agent,” not just with respect to the capacities of AI agents to provide value in multiple domains (ranging from personal assistants to artificial scientists), but also in terms of the functionalities enabled by “agentic workflows.” Agency might be even more important when we consider the revolution that appears to be taking place in the field of robotics, where autonomous functioning is essential for the real-world deployment of robots, both in terms of their own capacities and their ability to navigate a world populated by other agents (e.g. human beings). Further, agency is at the core of many of our most carefully considered proposals for understanding the fundamental nature of intelligence, ranging from characterizing systems based on their ability to achieve goals across a broad range of environments [6], to more recent proposals centered on the ability of systems to learn above and beyond their “core knowledge” (for which agentic understanding has been suggested to be an essential prior for bootstrapping intelligent minds) [7].
However, it remains unclear precisely what people mean when they use the words “agent” and “agency.” For example, any reinforcement learning system could be understood as an agent in the sense of being governed by a reward function that adjusts its behavior. This would be consistent with the increasing centrality of the concept of agency in biology [14], [15], which considers systems in terms of their goal-oriented behavior, grounded in the implicit ‘values’ associated with evolutionary fitness: survival and reproduction. Some have suggested that creating systems capable of the kinds of agency found in biology is the path forward for realizing the kinds of intelligence needed for robust real-world deployment [16]. Others have suggested that creating such biologically-inspired systems may be the greatest possible threat to humanity, and that we ought instead to engineer non-agentic “AI scientists” [17]. Along these lines, this special issue will explore the potential promise and perils of multiple kinds of biological agency, ranging from the emergent objectives/behavior of simple systems, to agents capable of planning with depth into possible futures [18]. With respect to promise and peril, while agents with more sophisticated planning abilities could be more reliable in their alignment properties, they may also have the potential for greater degrees of divergence from human values.
Richer notions of agency can be found in psychology and philosophy, which to varying degrees emphasize concepts of “intentionality” (e.g. as a conjunction of beliefs and desires), or the importance of explicitly representing goals (e.g. as counterfactual world configurations) [19], [20], [21]. Further complexities (and functionalities) arise when agency is considered in terms of more sophisticated capabilities such as meta-cognition and self-modeling/awareness. In AI/ML, a language model trained with reinforcement learning could be understood as a kind of agent, but the potential capabilities (and risks) associated with these systems would vary greatly depending on the kinds of agency we are considering. This semantic ambiguity is of more than just academic concern, as the particular varieties of agents we are considering may determine the extent to which systems have properties that could change economies (e.g. helping with the eldercare crisis facing multiple nations), societies (e.g. either threatening or strengthening political orders), and, according to some perspectives, the fate of humanity (e.g. competitive exclusion among species attempting to occupy the same niche; rival dynamics from emergent instrumental drives; etc.).
Given these diverging perspectives, we currently face a situation in which (potentially escalating) conflict may arise between opposing positions: that we should either advance these systems as rapidly as possible for their potential to address challenges and promote abundance [11], or “pause” further development and potentially even outlaw the creation of agentic AI [22]. Thus, it is imperative that we develop a more precise understanding of agents and agency, not just as they might be conceptualized in AI/ML, but also as these concepts might be informed by the disciplines of psychology, biology, and philosophy. Indeed, we should not necessarily expect a single definition of agency to be adequate for informing engineering and policy decisions, and may instead need to develop a more fine-grained lexicon in which we distinguish among multiple subtypes of agentic phenomena, with multiple implications.
Towards this end, we believe it is also essential to develop more precise and richer conceptualizations not just of agency, but of “selves” and “selfhood” as understood by multiple disciplines [15], [23]. To what extent do present and near-future AI/ML systems have which kinds of self-(world-)models, and what is afforded by these different varieties of self-understanding? For example, DeepMind’s MuZero achieved superhuman performance via self-play [24], and “dreaming” architectures can plan through imagined rollouts of a learned world model rather than through additional real-world interaction [25]. To what extent could richer self-representations allow for even greater performance from self-interacting systems?
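As a rough illustration of the latter idea, the sketch below uses random-shooting planning over a hand-written latent model, a much simpler relative of the learned world models and imagination-based training used in [25]; all functions and constants here are invented for the example. Planning “in imagination” then amounts to scoring candidate action sequences entirely inside the model, with no additional environment interaction.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hand-written stand-ins for a *learned* latent world model; in "dreaming"
# systems these would be neural networks trained from replayed experience.
def latent_dynamics(z, a):
    return 0.9 * z + 0.1 * a      # predicted next latent state

def predicted_reward(z):
    return -abs(z - 1.0)          # highest when the latent nears a target value

def plan_in_imagination(z0, horizon=10, n_candidates=256):
    """Choose a first action by scoring random action sequences rolled out
    entirely inside the model ("in imagination"), with no environment steps."""
    best_return, best_first_action = -np.inf, 0.0
    for _ in range(n_candidates):
        actions = rng.uniform(-1.0, 1.0, size=horizon)
        z, total = z0, 0.0
        for a in actions:
            z = latent_dynamics(z, a)
            total += predicted_reward(z)
        if total > best_return:
            best_return, best_first_action = total, float(actions[0])
    return best_first_action

print("first action chosen from imagined rollouts:", plan_in_imagination(z0=0.0))
```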
Further, just as agents/agency and selves/selfhood are deeply mutually informing, we also need to understand how these concepts relate to various notions of causation [26], [27], [28]. For example, if agency is understood in terms of intentionality, then it may also be understood in terms of self-causation with respect to actions, with different properties made possible to the degree that different forms of ‘conscious’ processing are involved. Some have even suggested that “free will,” understood as self-governance with self-causation, may be required for achieving artificial general intelligence [15], to the extent we prove capable of creating such minds in the years and decades to come. Indeed, it has even been suggested that “System 2” cognition may require ‘conscious’ processing [13]. Such proposals would also be consistent with the functional properties that appear to be afforded by the informational bottlenecks of sparse, hierarchical workspace architectures [29], [30], [31]. Such architectures may afford not only more efficient inference and learning with greater generalization potential, but may also enhance the intelligence and reliability of these systems by affording greater degrees of adaptive attention, (self-)interpretability, meta-cognition, and coherence (and hopefully wisdom) with respect to self-reflection, planning, and acting in the world.
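To give a flavor of what such a bottleneck can look like computationally, the minimal sketch below is loosely inspired by shared-workspace proposals such as [29] (the module count, slot count, dimensionality, and top-k threshold are arbitrary choices for illustration, not parameters from that work): specialist modules compete via sparse attention to write into a small number of workspace slots, and the workspace contents are then broadcast back to every module.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

N_MODULES, D, N_SLOTS, TOP_K = 8, 16, 2, 3   # many specialists, few shared slots

module_states = rng.normal(size=(N_MODULES, D))  # outputs of specialist modules
slot_queries = rng.normal(size=(N_SLOTS, D))     # each workspace slot issues a query

# Step 1: competition for the bottleneck. Each slot attends over the modules,
# but only the top-k most relevant modules per slot are allowed to write.
scores = slot_queries @ module_states.T / np.sqrt(D)       # (N_SLOTS, N_MODULES)
kth_best = np.sort(scores, axis=1)[:, -TOP_K][:, None]
sparse_scores = np.where(scores >= kth_best, scores, -np.inf)
write_weights = softmax(sparse_scores, axis=1)
workspace = write_weights @ module_states                   # (N_SLOTS, D)

# Step 2: broadcast. Every module reads the shared workspace back via attention,
# so a small amount of globally shared information conditions all specialists.
read_scores = module_states @ workspace.T / np.sqrt(D)      # (N_MODULES, N_SLOTS)
broadcast = softmax(read_scores, axis=1) @ workspace
updated_modules = module_states + broadcast                  # residual update

print("workspace:", workspace.shape, "updated modules:", updated_modules.shape)
```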
This special issue invites contributions focused on the following questions (additional relevant topics are also welcome):
1. What is the range of agentic/selfhood/reasoning phenomena that we might associate with which kinds of systems, and what functional properties do they provide with respect to inference and learning?
2. Which kinds of agency/selfhood/reasoning might be associated with which kinds of world-modeling (and potentially ‘conscious’?) phenomena?
3. Which kinds of agency/selfhood/reasoning might be associated with LLMs, and to what extent might this change with further-proposed technological developments (e.g. integration with robotics)?
4. Which kinds of agency/selfhood/intelligence are characteristic of all life, ranging from animals to plants and fungi, and even individual cells, as well as of novel forms of biotic and collective intelligence such as biohybrid robots, organoids, xenobots, companies/corporations, economic systems, and even entire societies?
5. What kinds of risks are associated with different kinds of agentic and (causal) reasoning systems? To the extent that we can decipher how agentic/selfhood/reasoning capacities are implemented within different kinds of systems, what degrees of prediction and control over increasingly advanced AIs might this afford?
To support further exploration of these topics, a podcast series, a discussion forum, and a (hybrid in-person and virtual) workshop are planned for 2026-2027; at the workshop, we will invite authors and other domain experts to participate in structured discussions over three days. This workshop will also involve contributors from the related collection on the topic of “world models” as they may be understood in the context of both naturally and artificially intelligent systems.
Editors
We are deeply grateful to Professor Kahneman for helping to inspire this special issue, and for providing his warm support during the planning process.
We hope to honor his memory with the work to come.
References
[1] M. Suleyman, The Coming Wave: Technology, Power, and the Twenty-first Century’s Greatest Dilemma. Crown, 2023.
[2] Y. N. Harari, Nexus: A Brief History of Information Networks from the Stone Age to AI. Random House Publishing Group, 2024.
[3] N. Bostrom, Superintelligence: Paths, Dangers, Strategies. Oxford University Press, 2014.
[4] B. Christian, The Alignment Problem: Machine Learning and Human Values, 1st edition. New York, NY: W. W. Norton & Company, 2020.
[5] D. “davidad” Dalrymple et al., “Towards Guaranteed Safe AI: A Framework for Ensuring Robust and Reliable AI Systems,” Jul. 08, 2024, arXiv: arXiv:2405.06624. doi: 10.48550/arXiv.2405.06624.
[6] S. Legg and M. Hutter, “A collection of definitions of intelligence,” Frontiers in Artificial Intelligence and Applications, vol. 157, p. 17, 2007.
[7] F. Chollet, “On the Measure of Intelligence,” arXiv, arXiv:1911.01547, Nov. 2019. doi: 10.48550/arXiv.1911.01547.
[8] S. Russell, Human Compatible: Artificial Intelligence and the Problem of Control. Penguin, 2019.
[9] M. Mitchell, Artificial Intelligence: A Guide for Thinking Humans. Farrar, Straus and Giroux, 2019.
[10] G. Marcus and E. Davis, Rebooting AI: Building Artificial Intelligence We Can Trust. Knopf Doubleday Publishing Group, 2019.
[11] M. Andreessen, “The Techno-Optimist Manifesto,” Andreessen Horowitz. Accessed: Jun. 21, 2024. [Online]. Available: https://a16z.com/the-techno-optimist-manifesto/
[12] D. Kahneman, Thinking, Fast and Slow, 1st ed. Farrar, Straus and Giroux, 2011.
[13] Y. Bengio, “The Consciousness Prior,” arXiv:1709.08568 [cs, stat], Sep. 2017, Accessed: Jun. 11, 2019. [Online]. Available: http://arxiv.org/abs/1709.08568
[14] P. Ball, How Life Works: A User’s Guide to the New Biology. University of Chicago Press, 2023.
[15] K. J. Mitchell, Free Agents: How Evolution Gave Us Free Will. Princeton University Press, 2023.
[16] M. Assran et al., “V-JEPA 2: Self-Supervised Video Models Enable Understanding, Prediction and Planning,” Jun. 11, 2025, arXiv: arXiv:2506.09985. doi: 10.48550/arXiv.2506.09985.
[17] Y. Bengio et al., “Superintelligent Agents Pose Catastrophic Risks: Can Scientist AI Offer a Safer Path?,” Feb. 24, 2025, arXiv: arXiv:2502.15657. doi: 10.48550/arXiv.2502.15657.
[18] K. Friston, L. Da Costa, D. Hafner, C. Hesp, and T. Parr, “Sophisticated Inference,” Neural Computation, vol. 33, no. 3, pp. 713–763, Mar. 2021, doi: 10.1162/neco_a_01351.
[19] A. Safron, “The Radically Embodied Conscious Cybernetic Bayesian Brain: From Free Energy to Free Will and Back Again,” Entropy, vol. 23, no. 6, Art. no. 6, Jun. 2021, doi: 10.3390/e23060783.
[20] O. Çatal, T. Verbelen, T. Van de Maele, B. Dhoedt, and A. Safron, “Robot navigation as hierarchical active inference,” Neural Networks, vol. 142, pp. 192–204, Oct. 2021, doi: 10.1016/j.neunet.2021.05.010.
[21] A. Safron, O. Çatal, and T. Verbelen, “Generalized Simultaneous Localization and Mapping (G-SLAM) as unification framework for natural and artificial intelligences: towards reverse engineering the hippocampal/entorhinal system and principles of high-level cognition,” Oct. 01, 2021, PsyArXiv. doi: 10.31234/osf.io/tdw82.
[22] M. Tegmark and S. Omohundro, “Provably safe systems: the only path to controllable AGI,” Sep. 05, 2023, arXiv: arXiv:2309.01933. doi: 10.48550/arXiv.2309.01933.
[23] D. Hofstadter, “Is there an ‘I’ in AI?,” 2023. [Online]. Available: https://berryvilleiml.com/wp-content/uploads/Is-there-an-%E2%80%9CI%E2%80%9D-in-AI-.pdf
[24] J. Schrittwieser et al., “Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model,” Nature, vol. 588, no. 7839, pp. 604–609, Dec. 2020, doi: 10.1038/s41586-020-03051-4.
[25] D. Hafner, J. Pasukonis, J. Ba, and T. Lillicrap, “Mastering Diverse Domains through World Models,” Apr. 17, 2024, arXiv: arXiv:2301.04104. doi: 10.48550/arXiv.2301.04104.
[26] A. Dahmani, A. Lidayan, and A. Gopnik, “Empowerment and Causal Learning,” presented at the Intrinsically-Motivated and Open-Ended Learning Workshop @NeurIPS2024, Dec. 2024. Accessed: Jan. 23, 2025. [Online]. Available: https://openreview.net/forum?id=clpsK2u5UN
[27] V. Thomas et al., “Independently controllable factors,” arXiv preprint arXiv:1708.01289, 2017.
[28] J. Pearl and D. Mackenzie, The Book of Why: The New Science of Cause and Effect. Basic Books, 2018.
[29] A. Goyal et al., “Coordination Among Neural Modules Through a Shared Global Workspace,” Mar. 22, 2022, arXiv: arXiv:2103.01197. doi: 10.48550/arXiv.2103.01197.
[30] A. Safron, “Integrated World Modeling Theory (IWMT) Expanded: Implications for Theories of Consciousness and Artificial Intelligence,” Jun. 21, 2021, PsyArXiv. doi: 10.31234/osf.io/rm5b2.
[31] L. Gao et al., “Scaling and evaluating sparse autoencoders,” Jun. 06, 2024, arXiv: arXiv:2406.04093. doi: 10.48550/arXiv.2406.04093.