Multistage BiCross Encoder: Team GATE Entry for MLIA Multilingual Semantic Search Task 2

2021-01-08 13:59:26

Iknoor Singh, Carolina Scarton, Kalina Bontcheva

arXiv_AI Transformer

Abstract

The Coronavirus (COVID-19) pandemic has led to a rapidly growing 'infodemic' online. As a result, accurate retrieval of reliable, relevant information from millions of documents about COVID-19 is urgently needed by the general public and by other stakeholders. The COVID-19 Multilingual Information Access (MLIA) initiative is a joint effort to improve the exchange of COVID-19-related information by developing applications and services through research and community participation. In this work, we present a search system called Multistage BiCross Encoder, developed by team GATE for MLIA Task 2, Multilingual Semantic Search. Multistage BiCross Encoder is a sequential three-stage pipeline that uses the Okapi BM25 algorithm, a transformer-based bi-encoder, and a transformer-based cross-encoder to rank documents with respect to the query. The results of round 1 show that our models achieve state-of-the-art performance on all ranking metrics for both monolingual and bilingual runs.
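
The three-stage idea described in the abstract (lexical retrieval with BM25, then bi-encoder scoring, then cross-encoder re-ranking) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify models, cut-offs, or parameters, so everything below is an assumption made for illustration. In particular, `encode` and `cross_score` are hypothetical toy stand-ins for transformer models (a bag-of-words vector and a token-overlap score), the function names and the cut-offs `k_bm25` and `k_bi` are invented, and the BM25 variant uses the common non-negative IDF form with assumed defaults `k1=1.5` and `b=0.75`.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; real systems do more)."""
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Okapi BM25 score of every document for the query (non-negative IDF variant)."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    avgdl = sum(len(t) for t in tokenized) / n
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def top_k(indices, scores, k):
    """Keep the k candidate indices with the highest scores, best first."""
    ranked = sorted(zip(indices, scores), key=lambda pair: pair[1], reverse=True)
    return [i for i, _ in ranked[:k]]


def encode(text):
    """Hypothetical stand-in for a bi-encoder: query and document are
    embedded independently (here as a bag-of-words count vector)."""
    return Counter(tokenize(text))


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def cross_score(query, doc):
    """Hypothetical stand-in for a cross-encoder: the (query, document) pair
    is scored jointly (here as the fraction of query tokens found in the doc)."""
    q_tokens = tokenize(query)
    d_tokens = set(tokenize(doc))
    return sum(t in d_tokens for t in q_tokens) / len(q_tokens) if q_tokens else 0.0


def multistage_rank(query, docs, k_bm25=3, k_bi=2):
    """Sequential pipeline: BM25 -> bi-encoder -> cross-encoder."""
    all_ids = list(range(len(docs)))
    # Stage 1: cheap lexical retrieval over the whole collection.
    stage1 = top_k(all_ids, bm25_scores(query, docs), k_bm25)
    # Stage 2: embedding similarity over the BM25 candidates only.
    query_vec = encode(query)
    stage2 = top_k(stage1, [cosine(query_vec, encode(docs[i])) for i in stage1], k_bi)
    # Stage 3: pairwise re-ranking of the few remaining candidates.
    return top_k(stage2, [cross_score(query, docs[i]) for i in stage2], len(stage2))
```

Calling `multistage_rank("covid vaccine", docs)` on a small list of strings returns document indices, best first. The point of the staged design is cost: each stage is more expensive per document than the one before it, so it is applied to a progressively smaller candidate set.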

URL

https://arxiv.org/abs/2101.03013

PDF

https://arxiv.org/pdf/2101.03013.pdf