Abstract
The recent excitement around generative models has sparked a wave of proposals suggesting the replacement of human participation and labor in research and development (e.g., through surveys, experiments, and interviews) with synthetic research data generated by large language models (LLMs). We conducted interviews with 19 qualitative researchers to understand their perspectives on this paradigm shift. Initially skeptical, researchers were surprised to see similar narratives emerge in the LLM-generated data when using the interview probe. However, over several conversational turns, they went on to identify fundamental limitations, such as how LLMs foreclose participants' consent and agency, produce responses lacking in palpability and contextual depth, and risk delegitimizing qualitative research methods. We argue that the use of LLMs as proxies for participants enacts the surrogate effect, raising ethical and epistemological concerns that extend beyond the technical limitations of current models to the core of whether LLMs fit within qualitative ways of knowing.
URL
https://arxiv.org/abs/2409.19430