Cognitive Modeling · Large Language Models · Human-centered AI
About
Computational approaches to understanding human cognition and improving AI systems.
I am a Ph.D. student in the Psychology program at the Georgia Institute of Technology, specializing in computational cognitive science. Before joining Georgia Tech, I earned an M.A. from the University of Arizona; throughout my master's and Ph.D. training, I have been supervised by Dr. Robert Wilson. Prior to that, I spent three years as a full-time research assistant at CBCS, Peking University. I received my undergraduate degree in Human Resource Management from Southwestern University of Finance and Economics in China. I am currently visiting Tom Griffiths' CoCoSci Lab at Princeton University.
My research interests lie broadly at the intersection of cognitive science and artificial intelligence. My thesis uses Large Language Models (LLMs), together with the think-aloud protocol, to understand how humans think during decision-making and learning, drawing on a variety of computational tools and tasks. I am also curious about how AI models (e.g., LLMs and agents) behave and reason, and how human experience and data can be leveraged to improve AI models and research. This includes human-AI interaction, AI interpretability, and even artificial general intelligence (AGI).
Beyond my own research, I contribute to the broader scientific community. I co-founded MindRL-Hub, a community that facilitates research on, and applications of, reinforcement learning in psychology and neuroscience. I am dedicated to promoting interdisciplinary collaboration and fostering connections among early-career researchers.
Understanding Human Decision-Making and Learning from Think-Aloud Data with Large Language Models
The think-aloud protocol asks participants to verbalize their thoughts while they perform psychological tasks. Traditional work has mostly relied on behavioral outputs (often button presses) to infer latent cognitive processes. In many cases, candidate cognitive models are proposed and tested by researchers, which can limit the hypothesis space and introduce bias. By directly analyzing participants' verbal reports, we gain a richer and more direct view of cognition during task performance. However, most prior think-aloud research depends on manual coding by experts, which is labor-intensive, subjective, and difficult to scale.
Recent advances in LLMs make it possible to revisit this classic protocol with stronger computational tools. LLMs can help quantify, interpret, and even predict subsequent behavior from think-aloud language. Our work evaluates when and how these models can be used reliably, with the goal of building a more systematic and scalable framework for studying human thought processes.
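As a purely illustrative sketch (not our actual pipeline), the basic idea of mapping think-aloud language onto candidate strategies can be shown by comparing an utterance's bag-of-words vector against labeled prototype statements; the prototype texts, labels, and example utterance below are all hypothetical.

```python
from collections import Counter
import math

def bow(text):
    """Lowercased bag-of-words vector as a Counter."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two Counter vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical labeled prototypes for two decision styles.
PROTOTYPES = {
    "risk-averse": bow("I will take the safe sure option and avoid the gamble"),
    "risk-seeking": bow("I will gamble for the big win even if I might lose"),
}

def classify(utterance):
    """Assign the strategy label whose prototype is most similar."""
    vec = bow(utterance)
    return max(PROTOTYPES, key=lambda label: cosine(vec, PROTOTYPES[label]))

print(classify("the sure thing seems safe so I avoid the gamble"))  # risk-averse
```

In practice, LLM embeddings or direct LLM annotation replace these hand-built prototypes, but the inferential step is the same: project free-form verbal reports into a space where they can be linked to latent cognitive variables and subsequent behavior.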
Representative publications:
Xie, H., Xiong, H., & Wilson, R. C. (2023). Text2Decision: Decoding Latent Variables in Risky Decision Making from Think Aloud Text. NeurIPS 2023 AI for Science Workshop.
Xie, H., Xiong, H., & Wilson, R. C. (2024). From Strategic Narratives to Code-Like Cognitive Models: An LLM-Based Approach in A Sorting Task. First Conference on Language Modeling (COLM).
Zhang, Z.*, Xie, H.*, Baker, T., Peters, M., & Wilson, R. C. (2025). Linking strategies to think aloud in a stochastic learning task. In Proceedings of the Annual Meeting of the Cognitive Science Society.
Reverse Engineering Human Thoughts
Human thought is central to intelligence, yet it is difficult to define, measure, and model. A core challenge in both cognitive science and AI is to characterize thought processes across tasks, identify shared principles, and generalize those principles to make useful predictions. The difficulty is that thoughts are often implicit, while language is diverse and context-dependent. As a result, verbal reports are informative but still incomplete reflections of internal cognition.
Instead of focusing only on the forward direction (how thoughts generate behavior), this project emphasizes inverse inference: given observed behavior and related measurements, can we reconstruct plausible underlying thoughts? The broader goal is to build a stronger, more general bridge between behavior and cognition. This direction also supports a human-centered understanding of machine reasoning. If the computations of complex systems (e.g., AlphaGo-like models) can be approximated by human-trained explanatory models, we may be able to describe model reasoning in natural language that is useful for teaching, interpretation, and collaboration.
This project is currently in development and will be my primary research focus at Princeton University.
Representative publications:
Zhu, J.-Q.*, Xie, H.*, Arumugam, D., Wilson, R. C., & Griffiths, T. L. (2025). Using reinforcement learning to train large language models to explain human decisions. arXiv preprint arXiv:2505.11614.
Understanding and Improving Artificial Intelligence Through Human Insights
This project examines AI through concepts from psychology and neuroscience. By comparing strengths and weaknesses of AI and human intelligence, we can design models that are both more capable and more interpretable. Beyond technical performance, I am interested in societal value: systems that support human decision-making, education, and collaboration. I also explore how people can learn from advanced AI models when we build the right frameworks to analyze and communicate their internal computations.
Representative publications:
Pan, L.*, Xie, H.*†, & Wilson, R. C. (2025). Large Language Models Think Too Fast To Explore Effectively. arXiv preprint arXiv:2501.18009. NeurIPS 2025 Poster.
Accelerating Scientific Discoveries in Cognitive Science
The human mind is deeply complex. Although thoughts, emotions, and actions are part of everyday experience, formally describing and predicting cognition remains a major scientific challenge. Many cognitive theories are grounded in human intuition and then tested through experiments and computational models. These approaches are powerful, but they can remain constrained by the original hypothesis space.
In the AI era, there is an opportunity to rethink discovery pipelines in cognitive science and psychology. LLMs bring broad knowledge and strong inductive biases, and modern reasoning models can perform at levels that sometimes rival expert intuition. A central question for my work is whether we can build AI-assisted workflows that help discover new behavioral phenomena, generate computational models, and propose testable theories while reducing avoidable human bias. I view this as complementary to, not a replacement for, careful empirical research.
Representative publications:
Xie, H., Xiong, H., & Wilson, R. C. (2023). Text2Decision: Decoding Latent Variables in Risky Decision Making from Think Aloud Text. NeurIPS 2023 AI for Science Workshop.
Xie, H., Xiong, H., & Wilson, R. C. (2024). From Strategic Narratives to Code-Like Cognitive Models: An LLM-Based Approach in A Sorting Task. First Conference on Language Modeling (COLM).
Zhu, J.-Q.*, Xie, H.*, Arumugam, D., Wilson, R. C., & Griffiths, T. L. (2025). Using reinforcement learning to train large language models to explain human decisions. arXiv preprint arXiv:2505.11614. ICLR 2026.
Xie, H.*, & Zhu, J.* (2025, July 12). Centaur May Have Learned a Shortcut that Explains Away Psychological Tasks. https://doi.org/10.31234/osf.io/u7z4t_v1 (submitted).
Publications
* denotes equal contribution; † denotes corresponding author; underline denotes mentee. Use the topic filters to navigate.
2025
Journal · Decision · Social
Qiu, S., Tang, Y., Yu, H., Xie, H., Dreher, J. C., Hu, Y., & Zhou, X. (2025). Toward a computational understanding of bribe-taking behavior. Annals of the New York Academy of Sciences.
Conference · LLM · Decision
Zhu, J.-Q.*, Xie, H.*, Arumugam, D., Wilson, R. C., & Griffiths, T. L. (2025). Using reinforcement learning to train large language models to explain human decisions. arXiv preprint arXiv:2505.11614. ICLR 2026.
Conference · LLM · Decision
Pan, L.*, Xie, H.*†, & Wilson, R. C. (2025). Large Language Models Think Too Fast To Explore Effectively. arXiv preprint arXiv:2501.18009. NeurIPS 2025 Poster.
Conference · LLM · AI
Xie, H.†, Zhu, J.-Q., Xiong, H., Wilson, R. C., & Griffiths, T. L. (2025). Reasoning Across Minds and Machines. In Proceedings of the Annual Meeting of the Cognitive Science Society (Vol. 47).
Conference · Think-Aloud · Learning
Zhang, Z.*, Xie, H.*, Baker, T., Peters, M., & Wilson, R. C. (2025). Linking strategies to think aloud in a stochastic learning task. In Proceedings of the Annual Meeting of the Cognitive Science Society.
Preprint · Think-Aloud · LLM
Xie, H.*, & Zhu, J.* (2025, July 12). Centaur May Have Learned a Shortcut that Explains Away Psychological Tasks. https://doi.org/10.31234/osf.io/u7z4t_v1 (submitted).
2024
Journal · Decision
Fang, Z., Zhao, M., Xu, T., Li, Y., Xie, H., Quan, P., ... & Zhang, R. Y. (2024). Individuals with anxiety and depression use atypical decision strategies in an uncertain world. eLife, 13.
Conference · Think-Aloud · LLM
Xie, H., Xiong, H., & Wilson, R. C. (2024). From Strategic Narratives to Code-Like Cognitive Models: An LLM-Based Approach in A Sorting Task. First Conference on Language Modeling (COLM).
Conference · Think-Aloud · Decision
Xie, H., Xiong, H., & Wilson, R. C. (2024). Evaluating Predictive Performance and Learning Efficiency of Large Language Models with Think Aloud in Risky Decision Making. Computational Cognitive Neuroscience (CCN), MIT.
2023
Journal · AI
Xie, H. (2023). The promising future of cognitive science and artificial intelligence. Nature Reviews Psychology.
Conference · Think-Aloud · Decision
Xie, H., Xiong, H., & Wilson, R. C. (2023). Text2Decision: Decoding Latent Variables in Risky Decision Making from Think Aloud Text. NeurIPS 2023 AI for Science Workshop.
Conference · Think-Aloud · LLM
Xie, H., Xiong, H., & Wilson, R. C. (2023). Computational introspection: Can large language models reveal cognitive algorithms from human language? Poster session presented at the 5th Chinese Computational and Cognitive Neuroscience Conference, Beijing, China.
2022
Conference · Decision · Learning
Guo, Y., Song, S., Xie, H., Gao, X., & Zhang, J. (2022, February). ARIMA and RNN for Selection Sequences Prediction in Iowa Gambling Task. In 2022 2nd International Conference on Artificial Intelligence and Signal Processing (AISP) (pp. 1-6). IEEE.
2020
Conference · Social · Learning
Song, S.*, Xie, H.*, Speekenbrink, M., Zhang, J., Gao, X., & Zhou, X. (2020, October). The computational basis of individuals' learning under uncertainty in groups with collective goals. Oral presentation at the Society for Neuroeconomics, Vancouver, Canada.
Blog
Essays and notes at the intersection of cognitive science and AI.
Performance Scales More Easily Than Insight
Scaling and its limits in computational cognitive science
How large behavioral datasets and powerful AI models can rapidly improve predictive performance while scientific understanding lags behind, and why data and knowledge bottlenecks matter for the future of cognitive science.