1. Stevens, S. S. (1935). The operational definition of psychological concepts. Psychological Review, 42(6), 517–527. https://doi.org/10.1037/h0056973
2. Moore, J. (1996). On the relation between behaviorism and cognitive psychology. Journal of Mind and Behavior, 17, 345–368.
3. Zou, A., Phan, L., Chen, S., Campbell, J., Guo, P., Ren, R., Pan, A., Yin, X., Mazeika, M., Dombrowski, A.-K., Goel, S., Li, N., Byun, M. J., Wang, Z., Mallen, A., Basart, S., Koyejo, S., Song, D., Fredrikson, M., … Hendrycks, D. (2023). Representation Engineering: A Top-Down Approach to AI Transparency (arXiv:2310.01405). arXiv. https://doi.org/10.48550/arXiv.2310.01405
4. Bricken, T., Templeton, A., Batson, J., Chen, B., Jermyn, A., Conerly, T., Turner, N., Anil, C., Denison, C., & Askell, A. (2023). Towards monosemanticity: Decomposing language models with dictionary learning. Transformer Circuits Thread, 2.
5. Chen, Y., Liu, T. X., Shan, Y., & Zhong, S. (2023). The emergence of economic rationality of GPT. Proceedings of the National Academy of Sciences, 120(51), e2316205120. https://doi.org/10.1073/pnas.2316205120
6. Horton, J. J. (2023). Large Language Models as Simulated Economic Agents: What Can We Learn from Homo Silicus? (arXiv:2301.07543). arXiv. https://doi.org/10.48550/arXiv.2301.07543
7. Aher, G., Arriaga, R. I., & Kalai, A. T. (2023). Using Large Language Models to Simulate Multiple Humans and Replicate Human Subject Studies (arXiv:2208.10264). arXiv. https://doi.org/10.48550/arXiv.2208.10264
8. Bubeck, S., Chandrasekaran, V., Eldan, R., Gehrke, J., Horvitz, E., Kamar, E., Lee, P., Lee, Y. T., Li, Y., Lundberg, S., Nori, H., Palangi, H., Ribeiro, M. T., & Zhang, Y. (2023). Sparks of Artificial General Intelligence: Early experiments with GPT-4 (arXiv:2303.12712). arXiv. https://doi.org/10.48550/arXiv.2303.12712
9. Hagendorff, T. (2023). Machine Psychology: Investigating Emergent Capabilities and Behavior in Large Language Models Using Psychological Methods (arXiv:2303.13988). arXiv. https://doi.org/10.48550/arXiv.2303.13988
10. Santurkar, S., Durmus, E., Ladhak, F., Lee, C., Liang, P., & Hashimoto, T. (2023). Whose Opinions Do Language Models Reflect? (arXiv:2303.17548). arXiv. https://doi.org/10.48550/arXiv.2303.17548
11. Dennett, D. C. (2006). Sweet Dreams: Philosophical Obstacles to a Science of Consciousness. Bradford Books.
12. McClelland, J. L., Hill, F., Rudolph, M., Baldridge, J., & Schütze, H. (2020). Placing language in an integrated understanding system: Next steps toward human-level performance in neural language models. Proceedings of the National Academy of Sciences, 117(42), 25966–25974. https://doi.org/10.1073/pnas.1910416117
13. Chaos. (2023). Zhihu. Retrieved from https://www.zhihu.com/question/593496742/answer/2966587547
14. Binz, M., & Schulz, E. (2023). Using cognitive psychology to understand GPT-3. Proceedings of the National Academy of Sciences, 120(6), e2218523120. https://doi.org/10.1073/pnas.2218523120
15. Argyle, L. P., Busby, E. C., Fulda, N., Gubler, J., Rytting, C., & Wingate, D. (2022). Out of One, Many: Using Language Models to Simulate Human Samples. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 819–862. https://doi.org/10.18653/v1/2022.acl-long.60
16. Edmiston, P., & Lupyan, G. (2015). What makes words special? Words as unmotivated cues. Cognition, 143, 93–100. https://doi.org/10.1016/j.cognition.2015.06.008
17. Yildirim, I., & Paul, L. A. (2023). From task structures to world models: What do LLMs know? (arXiv:2310.04276). arXiv. https://doi.org/10.48550/arXiv.2310.04276
18. Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., & Sutskever, I. (2019). Language Models are Unsupervised Multitask Learners. https://www.semanticscholar.org/paper/Language-Models-are-Unsupervised-Multitask-Learners-Radford-Wu/9405cc0d6169988371b2755e573cc28650d14dfe
19. Miotto, M., Rossberg, N., & Kleinberg, B. (2022). Who is GPT-3? An Exploration of Personality, Values and Demographics (arXiv:2209.14338). arXiv. https://doi.org/10.48550/arXiv.2209.14338
20. Shihadeh, J., Ackerman, M., Troske, A., Lawson, N., & Gonzalez, E. (2022). Brilliance Bias in GPT-3. 2022 IEEE Global Humanitarian Technology Conference (GHTC), 62–69. https://doi.org/10.1109/GHTC55712.2022.9910995
21. Park, P. S., Schoenegger, P., & Zhu, C. (2023). Diminished Diversity-of-Thought in a Standard Large Language Model (arXiv:2302.07267). arXiv. https://doi.org/10.48550/arXiv.2302.07267
22. Argyle, L. P., Busby, E. C., Fulda, N., Gubler, J., Rytting, C., & Wingate, D. (2022). Out of One, Many: Using Language Models to Simulate Human Samples. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 819–862. https://doi.org/10.18653/v1/2022.acl-long.60
23. Xu, C., Chern, S., Chern, E., Zhang, G., Wang, Z., Liu, R., Li, J., Fu, J., & Liu, P. (2023). Align on the Fly: Adapting Chatbot Behavior to Established Norms (arXiv:2312.15907). arXiv. https://doi.org/10.48550/arXiv.2312.15907