Applying Benford's Law for Assessing the Validity of Social Science Data
Benford's Law is a probability distribution for the likelihood of the leading digit in a set of numbers. This book seeks to improve and systematize the use of Benford's Law in the social sciences to assess the validity of self-reported data. The authors first introduce a new measure of conformity to the Benford distribution that is created using permutation statistical methods and employs the concept of statistical agreement. In a switch from a typical Benford application, this book moves away from using Benford's Law to test whether the data conform to the Benford distribution, to using it to draw conclusions about the validity of the data. The concept of 'Benford validity' is developed, which indicates whether a dataset is valid based on comparisons with the Benford distribution. In relation to this, the authors devise a diagnostic procedure that assesses the impact on data analysis of a dataset lacking Benford validity.
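For readers unfamiliar with the distribution the description refers to, the following is a minimal sketch of the standard first-digit form of Benford's Law, P(d) = log10(1 + 1/d) for d = 1, ..., 9, together with the observed leading-digit proportions of a dataset. It is not the book's agreement measure or its permutation procedure; the function names (`benford_probabilities`, `leading_digit`, `observed_proportions`) are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter
from decimal import Decimal


def benford_probabilities():
    """Expected probability of each leading digit 1-9 under Benford's Law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def leading_digit(x):
    """First significant digit of a nonzero number (sign is ignored)."""
    # Decimal(str(x)) keeps the digits as written, so 0.00123 -> (1, 2, 3).
    return Decimal(str(abs(x))).as_tuple().digits[0]


def observed_proportions(values):
    """Share of values whose leading digit is 1, 2, ..., 9 (zeros are skipped)."""
    digits = [leading_digit(v) for v in values if v != 0]
    if not digits:
        raise ValueError("need at least one nonzero value")
    counts = Counter(digits)
    return {d: counts.get(d, 0) / len(digits) for d in range(1, 10)}
```

Comparing the output of `observed_proportions` with `benford_probabilities` digit by digit gives the raw material that any conformity measure, including those discussed in the book, would start from. Under Benford's Law a leading 1 is expected about 30.1% of the time and a leading 9 about 4.6% of the time.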
Contents
I | 1 |
Validity and Self-Reported Data | 11 |
Benford's Law and Assessing Conformity | 24 |
Measures of Probability | 36 |
Measures of Effect Size | 46 |
A Chance-Corrected Measure of Effect Size | 55 |
Probability Values | 63 |
Conclusion | 72 |
The Workflow of Benford Agreement Analysis | 74 |
Benford Agreement Analysis of the Sea Around | 94 |
94 | 127 |
Assessing the Impacts of Problematic Benford Validity | 169 |
Conclusion | 188 |
References | 196 |
Index | 203 |


