Veracity of Data: From Truth Discovery Computation Algorithms to Models of Misinformation Dynamics
Morgan & Claypool Publishers, Dec 1, 2015 - Computers - 144 pages
On the Web, a massive amount of user-generated content is available through various channels (e.g., texts, tweets, Web tables, databases, and multimedia-sharing platforms). Conflicting information, rumors, and erroneous or fake content can easily spread across multiple sources, making it hard to distinguish what is true from what is not. This monograph gives an overview of fundamental issues and recent contributions for ascertaining the veracity of data in the era of Big Data. The text is organized into six chapters, focusing on structured data extracted from texts. Chapter One introduces the problem of ascertaining the veracity of data in a multi-source and evolving context. Issues related to information extraction are presented in Chapter Two. Chapter Three follows with practical techniques for evaluating data source reputation and authoritativeness, including a review of the main models and Bayesian approaches to trust management. Current truth discovery computation algorithms are presented in detail in Chapter Four. The theoretical foundations of, and various approaches for, modeling the diffusion of misinformation in networked systems are studied in Chapter Five. Finally, truth discovery computation from extracted data in a dynamic context of misinformation propagation raises interesting challenges that are explored in Chapter Six. Supplementary material, including source code, datasets, and slides, is offered online. This text is intended for a seminar course at the graduate level. It is also intended to serve as a useful resource for researchers and practitioners who are interested in the study of fact-checking, truth discovery, or rumor spreading.