Abstract
Current text generation models are trained on real data, which may contain sensitive information such as confidential patient records. Under certain conditions, a model can be triggered to output training data it has memorised, exposing that sensitive information. To mitigate this risk, we propose a safer alternative: instead of full texts, we share fragmented data in the form of domain-specific short phrases grouped together at random. Text fragments that could re-identify an individual thus cannot be reproduced by the model within a single sequence, providing significant protection against linkage attacks. We fine-tune several state-of-the-art LLMs on meaningful syntactic chunks to explore their utility; in particular, we fine-tune BERT-based models to predict two cardiovascular diagnoses. Our results demonstrate that LLMs can draw on their pre-trained knowledge to deliver classification results when fine-tuned on fragmented data that are comparable to fine-tuning on the full training data.
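The fragmentation idea can be illustrated with a toy sketch. The paper describes sharing meaningful syntactic chunks rather than full texts; the function below is a simplified stand-in that splits texts into fixed-size word windows (not true syntactic chunking) and shuffles the chunks across all documents, so no output sequence reproduces a full record. The function name and parameters are illustrative, not from the paper.

```python
import random

def fragment_corpus(texts, chunk_size=3, seed=0):
    """Toy fragmentation: split each text into short phrase chunks,
    then shuffle chunks across the whole corpus so that no single
    sequence reproduces one original document.
    (The paper uses syntactic chunks; fixed windows are a stand-in.)"""
    rng = random.Random(seed)
    chunks = []
    for text in texts:
        words = text.split()
        chunks.extend(
            " ".join(words[i:i + chunk_size])
            for i in range(0, len(words), chunk_size)
        )
    rng.shuffle(chunks)
    return chunks
```

The shuffled chunk list preserves the corpus vocabulary and phrase-level signal useful for classification, while breaking the document-level order that a linkage attack would need.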
URL
https://arxiv.org/abs/2404.19486