Abstract
Query expansion with large language models is promising but often relies on hand-crafted prompts, manually chosen exemplars, or a single LLM, making these approaches hard to scale and sensitive to domain shift. We present an automated, domain-adaptive QE framework that builds in-domain exemplar pools by harvesting pseudo-relevant passages with a BM25-MonoT5 pipeline. A training-free cluster-based strategy selects diverse demonstrations, yielding strong and stable in-context QE without supervision. To further exploit model complementarity, we introduce a two-LLM ensemble in which two heterogeneous LLMs independently generate expansions and a refinement LLM consolidates them into one coherent expansion. Across TREC DL20, DBPedia, and SciFact, the refined ensemble delivers consistent and statistically significant gains over BM25, Rocchio, zero-shot, and fixed few-shot baselines. The framework offers a reproducible testbed for exemplar selection and multi-LLM generation, and a practical, label-free solution for real-world QE.
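The overall flow described above (exemplar selection, independent generation by two LLMs, and consolidation by a refiner) can be sketched in a few lines of orchestration code. This is a minimal illustration, not the paper's implementation: the function names, prompts, and the term-overlap exemplar selector are placeholders standing in for the cluster-based selection and the actual LLM calls.

```python
# Sketch of the two-LLM ensemble-with-refinement QE flow. All names and
# prompt formats are hypothetical; the LLMs are passed in as callables.

def select_exemplars(query, pool, k=3):
    # Training-free stand-in for cluster-based exemplar selection:
    # pick the k pool passages sharing the most terms with the query.
    # (In the paper, the pool holds pseudo-relevant passages harvested
    # by a BM25-MonoT5 pipeline and diversity comes from clustering.)
    q_terms = set(query.lower().split())
    ranked = sorted(pool, key=lambda p: -len(q_terms & set(p.lower().split())))
    return ranked[:k]

def generate_expansion(llm, query, exemplars):
    # Placeholder in-context call: exemplars serve as demonstrations.
    prompt = "\n".join(exemplars) + f"\nExpand: {query}"
    return llm(prompt)

def refine(refiner_llm, query, expansions):
    # The refinement LLM consolidates the candidate expansions
    # into a single coherent expansion.
    prompt = f"Query: {query}\nCandidates: " + " | ".join(expansions)
    return refiner_llm(prompt)

def expand_query(query, pool, llm_a, llm_b, refiner_llm):
    exemplars = select_exemplars(query, pool)
    candidates = [generate_expansion(m, query, exemplars)
                  for m in (llm_a, llm_b)]
    # The final query concatenates the original query with the
    # refined expansion, as is common in LLM-based QE.
    return query + " " + refine(refiner_llm, query, candidates)
```

With stub callables in place of real LLMs, `expand_query("solar energy", pool, llm_a, llm_b, refiner)` returns the original query followed by the refiner's consolidated expansion, ready to feed back into a BM25 retriever.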
URL
https://arxiv.org/abs/2602.08917