Demystifying AI for Behavior Analysts: Navigating Ethical Adoption and Algorithmic Bias (Citations)
- Anthropic. (2026, April 7). Claude Mythos Preview system card. https://anthropic.com
- Bai, Y., Jones, A., Ndousse, K., Askell, A., Chen, A., DasSarma, N., Drain, D., Fort, S., Ganguli, D., Henighan, T., Joseph, N., Kadavath, S., Kernion, J., Conerly, T., El-Showk, S., Elhage, N., Hatfield-Dodds, Z., Hernandez, D., Hume, T., . . . Kaplan, J. (2022). Training a helpful and harmless assistant with reinforcement learning from human feedback. arXiv. https://doi.org/10.48550/arXiv.2204.05862
- Batista, R. M., & Griffiths, T. L. (2026). A rational analysis of the effects of sycophantic AI. arXiv. https://doi.org/10.48550/arXiv.2602.14270
- Brown, T. B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., Agarwal, S., Herbert-Voss, A., Krueger, G., Henighan, T., Child, R., Ramesh, A., Ziegler, D. M., Wu, J., Winter, C., . . . Amodei, D. (2020). Language models are few-shot learners. arXiv. https://doi.org/10.48550/arXiv.2005.14165
- Chandra, K., Kleiman-Weiner, M., Ragan-Kelley, J., & Tenenbaum, J. B. (2026). Sycophantic chatbots cause delusional spiraling, even in ideal Bayesians. arXiv. https://doi.org/10.48550/arXiv.2602.19141
- Christiano, P., Leike, J., Brown, T. B., Martic, M., Legg, S., & Amodei, D. (2023). Deep reinforcement learning from human preferences. arXiv. https://doi.org/10.48550/arXiv.1706.03741
- Cox, D. J. (2025). Ethical behavior analysis in the age of artificial intelligence (AI): The importance of understanding model building while formal AI literacy curricula are developed. Perspectives on Behavior Science. Advance online publication. https://doi.org/10.1007/s40614-025-00459-z
- Cox, D. J., & Jennings, A. M. (2024). The promises and possibilities of artificial intelligence in the delivery of behavior analytic services. Behavior Analysis in Practice, 17, 123–136. https://doi.org/10.1007/s40617-023-00864-3
- Cox, D. J., & Sosine, J. (2025). A data-driven, algorithmic approach to recommending hours of ABA for individuals with ASD. Behavioral Interventions, 40, e70014. https://doi.org/10.1002/bin.70014
- Cox, D. J., Weil, L., Sosine, J., Jennings, A. M., & Santos, C. (2025). Getting more from your IOA data: Alternative measures to total, occurrence, and non-occurrence agreement. Behavioral Interventions, 40, e70031. https://doi.org/10.1002/bin.70031
- Crossman, E. K. (1985). The kiss and the promise: A review of Hubert L. Dreyfus' What computers can't do: The limits of artificial intelligence. Journal of the Experimental Analysis of Behavior, 44(2), 271–277. https://doi.org/10.1901/jeab.1985.44-271
- Dufour, M.-M., Lanovaz, M. J., & Cardinal, P. (2020). Artificial intelligence for the measurement of vocal stereotypy. Journal of the Experimental Analysis of Behavior, 114(3), 368–380. https://doi.org/10.1002/jeab.636
- Guo, Y., Guo, M., Su, J., Yang, Z., Zhu, M., Li, H., Qiu, M., & Liu, S. S. (2024). Bias in large language models: Origin, evaluation, and mitigation. arXiv. https://doi.org/10.48550/arXiv.2411.10915
- Jennings, A. M., & Cox, D. J. (2023). Starting the conversation around the ethical use of artificial intelligence in applied behavior analysis. Behavior Analysis in Practice, 17(1), 107–122. https://doi.org/10.1007/s40617-023-00868-z
- Jošt, G., Taneski, V., & Karakatič, S. (2024). The impact of large language models on programming education and student learning outcomes. Applied Sciences, 14(10), 4115. https://doi.org/10.3390/app14104115
- Kosmyna, N., Hauptmann, E., Yuan, Y. T., Situ, J., Liao, X.-H., Beresnitzky, A. V., Braunstein, I., & Maes, P. (2025). Your brain on ChatGPT: Accumulation of cognitive debt when using an AI assistant for essay writing task. arXiv. https://doi.org/10.48550/arXiv.2506.08872
- Lanovaz, M. J. (2022). Some characteristics and arguments in favor of a science of machine behavior analysis. Perspectives on Behavior Science, 45(2), 399–419. https://doi.org/10.1007/s40614-022-00332-3
- Liu, S., Wright, A. P., Patterson, B. L., Wanderer, J. P., Turer, R. W., Nelson, S. D., McCoy, A. B., Sittig, D. F., & Wright, A. (2023). Using AI-generated suggestions from ChatGPT to optimize clinical decision support. Journal of the American Medical Informatics Association, 30(7), 1237–1245. https://doi.org/10.1093/jamia/ocad072
- Mahajan, A., Obermeyer, Z., Daneshjou, R., Lester, J., & Powell, D. (2025). Cognitive bias in clinical large language models. npj Digital Medicine, 8, 428. https://doi.org/10.1038/s41746-025-01790-0
- Mohamed, A., Assi, M., & Guizani, M. (2026). The impact of LLM-assistants on software developer productivity: A systematic review and mapping study. arXiv. https://doi.org/10.48550/arXiv.2507.03156
- Morris, C., Jones, S. H., & Oliveira, J. P. (2024). A practitioner's guide to measuring procedural fidelity. Behavior Analysis in Practice, 17(2), 643–655. https://doi.org/10.1007/s40617-024-00910-8
- Mutanga, M. B., Msane, J., Mndaweni, T. N., Hlongwane, B. B., & Ngcobo, N. Z. (2025). Exploring the impact of LLM prompting on students' learning. Trends in Higher Education, 4(3), 31. https://doi.org/10.3390/higheredu4030031
- O'Neil, C. (2016). Weapons of math destruction: How big data increases inequality and threatens democracy. Crown.
- Perrigo, B. (2023, January 18). Exclusive: OpenAI used Kenyan workers on less than $2 per hour to make ChatGPT less toxic. TIME. https://time.com/6247678/openai-chatgpt-kenya-workers/
- Poole-Dayan, E., Roy, D., & Kabbara, J. (2025). LLM targeted underperformance disproportionately impacts vulnerable users. arXiv. https://doi.org/10.48550/arXiv.2406.17737
- Raj, M., Berg, J. M., & Seamans, R. (2026). The artificial intelligence disclosure penalty: Humans persistently devalue AI-generated creative writing. Journal of Experimental Psychology: General, 155(4), 896–915. https://doi.org/10.1037/xge0001889
- Sinayev, A., & Courtney, C. (2025). Effectiveness of LLM-based AI assistance for small business productivity. Research Square. https://doi.org/10.21203/rs.3.rs-6481789/v1
- Singhal, K., Azizi, S., Tu, T., Mahdavi, S. S., Wei, J., Chung, H. W., Scales, N., Tanwani, A., Cole-Lewis, H., Pfohl, S., Payne, P., Seneviratne, M., Gamble, P., Kelly, C., Babiker, A., Schärli, N., Chowdhery, A., Mansfield, P., Demner-Fushman, D., . . . Natarajan, V. (2023). Large language models encode clinical knowledge. Nature, 620(7972), 172–180. https://doi.org/10.1038/s41586-023-06291-2
- Stephens, K. R., & Hutchison, W. R. (1992). Behavioral personal digital assistants: The seventh generation of computing. The Analysis of Verbal Behavior, 10, 149–156. https://doi.org/10.1007/BF03392881
- Sun, C., McEwan, A., Boulton, K. A., Demetriou, E. A., Sadozai, A. K., Lampit, A., & Guastella, A. J. (2025). Artificial intelligence for tracking social behaviours and supporting an autism spectrum disorder diagnosis: Systematic review and meta-analysis. EBioMedicine, 120, 105931. https://doi.org/10.1016/j.ebiom.2025.105931
- Tam, T. Y. C., Sivarajkumar, S., Kapoor, S., Stolyar, A. V., Polanska, K., McCarthy, K. R., Osterhoudt, H., Wu, X., Visweswaran, S., Fu, S., Mathur, P., Cacciamani, G. E., Sun, C., Peng, Y., & Wang, Y. (2024). A framework for human evaluation of large language models in healthcare derived from literature review. npj Digital Medicine, 7, 258. https://doi.org/10.1038/s41746-024-01258-7
- Templin, T., Fort, S., Padmanabham, P., Seshadri, P., Rimal, R., Oliva, J., Hassmiller Lich, K., Sylvia, S., & Sinnott-Armstrong, N. (2025). Framework for bias evaluation in large language models in healthcare settings. npj Digital Medicine, 8(1), 414. https://doi.org/10.1038/s41746-025-01786-w
- Turgeon, S., & Lanovaz, M. J. (2020). Tutorial: Applying machine learning in behavioral research. Perspectives on Behavior Science, 43(4), 697–723. https://doi.org/10.1007/s40614-020-00270-y
- Wulf, J., & Meierhofer, J. (2025). The impact of large language models on task automation in manufacturing services. Procedia CIRP, 134, 1089–1094. https://doi.org/10.1016/j.procir.2025.03.071
- Xu, Z., Jain, S., & Kankanhalli, M. (2025). Hallucination is inevitable: An innate limitation of large language models. arXiv. https://doi.org/10.48550/arXiv.2401.11817
- Zack, T., Lehman, E., Suzgun, M., Rodriguez, J. A., Celi, L. A., Gichoya, J., Jurafsky, D., Szolovits, P., Bates, D. W., Abdulnour, R.-E. E., Butte, A. J., & Alsentzer, E. (2024). Assessing the potential of GPT-4 to perpetuate racial and gender biases in health care: A model evaluation study. The Lancet Digital Health, 6, e12–e22. https://doi.org/10.1016/S2589-7500(23)00225-X