DIPS at CheckThat! 2021: Verified Claim Retrieval


This paper outlines the approach of team DIPS to CheckThat! 2021 Lab Task 2, a semantic textual similarity problem for retrieving previously fact-checked claims. The task is divided into two subtasks, in each of which the goal is to rank a set of already fact-checked claims by their relevance to an input claim. The main difference between the two is the data source: Task 2A’s input claims are tweets, while Task 2B’s come from debates and speeches. To solve the task, we combine a variety of algorithms (BM25, S-BERT, a custom classifier, and RankSVM) into a claim retrieval system. Moreover, we show that data preprocessing is critical for such tasks and can lead to significant improvements in MRR and MAP. We participated in the English edition of both subtasks, and our system was ranked third in Task 2A and first in Task 2B.
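
For readers unfamiliar with the two ranking metrics named above, the following is a minimal sketch of how MRR (mean reciprocal rank) and MAP (mean average precision) are commonly computed for a retrieval task of this kind. It is not the official CheckThat! scorer or the team's evaluation code; the function names, claim identifiers, and toy data are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: standard definitions of MRR and MAP.
# All names and the toy data below are assumptions, not taken from the paper.

def reciprocal_rank(ranked_ids, relevant_ids):
    """Return 1/rank of the first relevant item, or 0.0 if none is retrieved."""
    for rank, claim_id in enumerate(ranked_ids, start=1):
        if claim_id in relevant_ids:
            return 1.0 / rank
    return 0.0


def average_precision(ranked_ids, relevant_ids):
    """Average the precision values at each rank where a relevant item appears."""
    if not relevant_ids:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, claim_id in enumerate(ranked_ids, start=1):
        if claim_id in relevant_ids:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant_ids)


def mean_metric(metric, rankings, gold):
    """Average a per-query metric over all queries that have gold labels."""
    scores = [metric(rankings.get(query, []), gold[query]) for query in gold]
    return sum(scores) / len(scores) if scores else 0.0


# Toy example: two input claims, each with one matching verified claim.
rankings = {
    "tweet-1": ["c3", "c1", "c2"],  # relevant claim retrieved at rank 2
    "tweet-2": ["c5", "c4", "c6"],  # relevant claim retrieved at rank 1
}
gold = {"tweet-1": {"c1"}, "tweet-2": {"c5"}}

mrr = mean_metric(reciprocal_rank, rankings, gold)
map_score = mean_metric(average_precision, rankings, gold)
print(f"MRR={mrr:.2f} MAP={map_score:.2f}")
```

When every input claim has exactly one relevant verified claim, as in this toy data, the two metrics coincide; they diverge once an input claim has several relevant matches.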

In Proceedings of the CLEF 2021 - Conference and Labs of the Evaluation Forum
Momchil Hardalov
Applied Scientist

My research interests include natural language processing, few-shot, semi-supervised, and multilingual learning. I have a strong software engineering background as a Software and Machine Learning Engineer.