Cross-Lingual Fact Verification: Analyzing LLM Performance Patterns Across Languages

Abstract

Fact verification has emerged as a critical task in combating misinformation, yet most research remains focused on English-language applications. This paper presents a comprehensive analysis of multilingual fact verification capabilities across three state-of-the-art large language models: Llama 3.1, Qwen 2.5, and Mistral Nemo. We evaluate these models on the X-Fact dataset, which covers 25 typologically diverse languages, examining both seen and unseen languages through test and zero-shot evaluation scenarios. Our analysis reveals significant performance disparities based on script system, with Latin-script languages consistently outperforming others. We identify systematic cross-lingual instruction-following failures, which particularly affect languages written in non-Latin scripts. Surprisingly, some officially supported languages, such as Indonesian and Polish, achieve better performance than traditionally high-resource languages like German and Spanish, challenging conventional assumptions about the relationship between resource availability and model performance. These results highlight critical limitations of current multilingual LLMs for fact verification and provide insights for developing more inclusive multilingual systems.
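
For illustration, below is a minimal sketch of the kind of zero-shot prompted fact-verification evaluation described in the abstract, assuming a Hugging Face chat model. The model name, prompt wording, and label set are illustrative assumptions, not the paper's exact setup (X-Fact's actual label scheme is more fine-grained).

from transformers import pipeline

# Simplified, assumed label set for illustration only.
LABELS = ["true", "false", "partly true", "unverifiable"]

generator = pipeline(
    "text-generation",
    model="meta-llama/Llama-3.1-8B-Instruct",  # one of the evaluated model families
)

def verify(claim: str, evidence: str) -> str:
    # Build a zero-shot instruction prompt for one claim/evidence pair.
    prompt = (
        "Classify the claim against the evidence.\n"
        f"Evidence: {evidence}\n"
        f"Claim: {claim}\n"
        f"Answer with exactly one of: {', '.join(LABELS)}.\n"
        "Answer:"
    )
    out = generator(prompt, max_new_tokens=5, do_sample=False)
    # The pipeline echoes the prompt, so strip it off before parsing.
    answer = out[0]["generated_text"][len(prompt):].strip().lower()
    # If the model ignores the instruction (an instruction-following
    # failure mode noted in the abstract), fall back to 'unverifiable'.
    return next((l for l in LABELS if answer.startswith(l)), "unverifiable")

Such a loop, run per language over the X-Fact test splits, would surface the script-based disparities the paper reports.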

Publication
Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing
Tatiana Anikina
PhD Student
Josef van Genabith
Professor at German Research Center for Artificial Intelligence (DFKI)