## A Solution to the Ecological Inference Problem: Reconstructing Individual Behavior from Aggregate Data

This book provides a solution to the ecological inference problem, which has plagued users of statistical methods for over seventy-five years: How can researchers reliably infer individual-level behavior from aggregate (ecological) data? In political science, this question arises when individual-level surveys are unavailable (for instance, local or comparative electoral politics), unreliable (racial politics), insufficient (political geography), or infeasible (political history). This ecological inference problem also confronts researchers in numerous areas of major significance in public policy, and other academic disciplines, ranging from epidemiology and marketing to sociology and quantitative history. Although many have attempted to make such cross-level inferences, scholars agree that all existing methods yield very inaccurate conclusions about the world.

In this volume, Gary King lays out a unique--and reliable--solution to this venerable problem. King begins with a qualitative overview, readable even by those without a statistical background. He then unifies the apparently diverse findings in the methodological literature, so that only one aggregation problem remains to be solved. He then presents his solution, as well as empirical evaluations of the solution that include over 16,000 comparisons of his estimates from real aggregate data to the known individual-level answer. The method works in practice. King's solution to the ecological inference problem will enable empirical researchers to investigate substantive questions that have heretofore proved unanswerable, and move forward fields of inquiry in which progress has been stifled by this problem.

### Contents

Qualitative Overview

1.1 The Necessity of Ecological Inferences

1.2 The Problem

1.3 The Solution

1.4 The Evidence

1.5 The Method

Formal Statement of the Problem

Catalog of Problems to Fix

Aggregation Problems

3.2 The Indeterminacy Problem

3.3 The Grouping Problem

3.4 Equivalence of the Grouping and Indeterminacy Problems

3.5 A Concluding Definition

Non-Aggregation Problems

4.2 Applying Goodman's Regression in 2 × 3 Tables

4.3 Double Regression Problems

4.4 Concluding Remarks

The Proposed Solution

The Data: Generalizing the Method of Bounds

No Uncertainty

Upper and Lower Bounds

5.2.2 District-Level Quantities of Interest

5.3 An Easy Visual Method for Computing Bounds

The Model

6.1 The Basic Model

6.2 Model Interpretation

6.2.1 Observable Implications of Model Parameters

6.2.2 Parameterizing the Truncated Bivariate Normal

6.2.3 Computing 2p Parameters from Only p Observations

6.2.4 Connections to the Statistics of Medical and Seismic Imaging

6.2.5 Would a Model of Individual-Level Choices Help?

Preliminary Estimation

7.1 A Visual Introduction

7.2 The Likelihood Function

7.3 Parameterizations

7.4 Optional Priors

7.5 Summarizing Information about Estimated Parameters

Calculating Quantities of Interest

8.1.1 Definitions and Examples

8.1.2 Simulation for Ecological Inference

8.2 Precinct-Level Quantities

8.3 District-Level Quantities

8.4 Quantities of Interest from Larger Tables

8.4.2 An Approach Related to Double Regression

8.5 Other Quantities of Interest

Model Extensions

9.1.1 Aggregation Bias

9.1.2 Incorrect Distributional Assumptions

9.1.3 Spatial Dependence

9.2 Avoiding Aggregation Bias

9.2.1 Using External Information

X as a Covariate

9.2.3 Tradeoffs and Priors for the Extended Model

9.2.4 Ex Post Diagnostics

9.3 Avoiding Distributional Problems

9.3.1 Parametric Approaches

9.3.2 A Nonparametric Approach

Verification

A Typical Application Described in Detail: Voter Registration by Race

10.2 Likelihood Estimation

10.3 Computing Quantities of Interest

10.3.1 Aggregate

10.3.2 County Level

10.3.3 Other Quantities of Interest

Robustness to Aggregation Bias: Poverty Status by Sex

11.2 Verifying the Existence of Aggregation Bias

11.3 Fitting the Data

11.4 Empirical Results

Estimation without Information: Black Registration in Kentucky

12.2 Data Problems

12.3 Fitting the Data

12.4 Empirical Results

Classic Ecological Inferences

13.1.2 Estimates

13.2 Black Literacy in 1910

Generalizations and Concluding Suggestions

Non-Ecological Aggregation Problems

14.1.1 The Problem with the Problem

14.1.2 Ecological Inference as a Solution to the Modifiable Areal Unit Problem

14.2 The Statistical Problem of Combining Survey and Aggregate Data

14.3 The Econometric Problem of Aggregation Continuous Variables

14.4 Concluding Remarks on Related Aggregation Research

Ecological Inference in Larger Tables

15.1 An Intuitive Approach

15.2 Notation for a General Approach

15.3 Generalized Bounds

15.4 The Statistical Model

15.5 Distributional Implication

15.6 Calculating the Quantities of Interest

A Concluding Checklist

Appendices

Proof That All Discrepancies Are Equivalent

Parameter Bounds

𝜷s and 𝜽s

𝜆s

Conditional Posterior Distribution

C.1 Using Bayes Theorem

C.2 Using Properties of Normal Distributions

The Likelihood Function

The Details of Nonparametric Estimation

Computational Issues

Glossary of Symbols
