In Search of Credible News

Momchil Hardalov, Ivan Koychev, Preslav Nakov

August 2016

Abstract

We study the problem of finding fake online news. This is an important problem, as news of questionable credibility has recently been proliferating in social media at an alarming scale. As this is an understudied problem, especially for languages other than English, we first collect and release to the research community three new balanced credible vs. fake news datasets derived from four online sources. We then propose a language-independent approach for automatically distinguishing credible from fake news, based on a rich feature set. In particular, we use linguistic (n-gram), credibility-related (capitalization, punctuation, pronoun use, sentiment polarity), and semantic (embeddings and DBpedia data) features. Our experiments on three different test sets show that our model can distinguish credible from fake news with very high accuracy.
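As a rough illustration of what the credibility-related features named in the abstract might look like in practice, the sketch below computes a few surface statistics (capitalization, punctuation, pronoun use) for a piece of text. It is not the authors' implementation: the function name `credibility_features`, the feature names, and the short English pronoun lists are all assumptions made for this example, and a real system would need pronoun lists for the target language.

```python
import re
import string

# Hypothetical, English-only pronoun lists used purely for illustration.
FIRST_PERSON = {"i", "me", "my", "mine", "we", "us", "our", "ours"}
SECOND_PERSON = {"you", "your", "yours"}


def credibility_features(text: str) -> dict:
    """Compute a few surface features of the kind the abstract lists."""
    # Alphabetic tokens only; the pattern is Unicode-aware in Python 3.
    tokens = re.findall(r"[^\W\d_]+", text)
    n_tokens = max(len(tokens), 1)  # avoid division by zero on empty input
    n_chars = max(len(text), 1)
    lowered = [t.lower() for t in tokens]

    # Words written entirely in capitals (single letters are ignored).
    all_caps = sum(1 for t in tokens if len(t) > 1 and t.isupper())

    return {
        "all_caps_ratio": all_caps / n_tokens,
        "exclamation_count": text.count("!"),
        "question_count": text.count("?"),
        # Only ASCII punctuation is counted here (an assumption).
        "punct_ratio": sum(ch in string.punctuation for ch in text) / n_chars,
        "first_person_ratio": sum(t in FIRST_PERSON for t in lowered) / n_tokens,
        "second_person_ratio": sum(t in SECOND_PERSON for t in lowered) / n_tokens,
    }


if __name__ == "__main__":
    print(credibility_features("SHOCKING!!! You won't believe what we found?"))
```

Features like these would typically be concatenated with the n-gram and semantic features before being passed to a classifier; sentiment polarity is omitted here because it requires an external lexicon or model.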

Type

Conference paper

Publication

In Proceedings of the 17th International Conference on Artificial Intelligence: Methodology, Systems, and Applications

Momchil Hardalov

Applied Scientist

My research interests include natural language processing and few-shot, semi-supervised, and multilingual learning. I have a strong software engineering background as a Software and Machine Learning Engineer.