## Information Geometry and Its Applications

This is the first comprehensive book on information geometry, written by the founder of the field. It begins with an elementary introduction to dualistic geometry and proceeds to a wide range of applications, covering information science, engineering, and neuroscience. It consists of four parts, which on the whole can be read independently. A manifold with a divergence function is first introduced, leading directly to dualistic structure, the heart of information geometry. This part (Part I) can be apprehended without any knowledge of differential geometry. An intuitive explanation of modern differential geometry then follows in Part II, although the book is for the most part understandable without modern differential geometry. Information geometry of statistical inference, including time series analysis and semiparametric estimation (the Neyman–Scott problem), is demonstrated concisely in Part III. Applications addressed in Part IV include hot current topics in machine learning, signal processing, optimization, and neural networks. The book is interdisciplinary, connecting mathematics, information sciences, physics, and neurosciences, inviting readers to a new world of information and geometry. This book is highly recommended to graduate students and researchers who seek new mathematical methods and tools useful in their own fields.


### Contents

2 Exponential Families and Mixture Families of Probability Distributions

3 Invariant Geometry of Manifold of Probability Distributions

4 α-Geometry, Tsallis q-Entropy and Positive-Definite Matrices

Part II Introduction to Dual Differential Geometry

5 Elements of Differential Geometry

6 Dual Affine Connections and Dually Flat Manifold

Part III Information Geometry of Statistical Inference

7 Asymptotic Theory of Statistical Inference

8 Estimation in the Presence of Hidden Variables

Estimating Function and Semiparametric Statistical Model

10 Linear Systems and Time Series

Part IV Applications of Information Geometry

11 Machine Learning

12 Natural Gradient Learning and Its Dynamics in Singular Regions

13 Signal Processing and Optimization

References


### Common terms and phrases

affine connections, affine coordinate, affine coordinate system, algorithm, Amari, asymptotically, basis vectors, Bregman divergence, calculate, called, canonical divergence, cluster, components, converges, convex function, covariance, critical region, curvature, curve, decomposable, defined, denote, derived, differential geometry, dual affine coordinate, dual geodesic, dually flat manifold, dually flat structure, e-flat, efficient, entropy, estimating function, Euclidean space, example, exponential family, f-divergence, Fisher information matrix, given, Hence, higher-order, Hyvärinen, independent, information geometry, invariant, inverse, kernel, KL-divergence, Let us consider, linear, log p(x, loss function, m-projection, machine, Mathematical, minimizer, multilayer perceptron, Neural, neurons, obtain, optimal, orthogonal, parallel transport, probability density, probability distributions, problem, projection, Proof, Pythagorean theorem, random variable, respect, Riemannian manifold, Riemannian metric, satisfies, singular, solution, statistical model, stochastic, submanifold, subspace, tangent space, tangent vector, tensor, theory, Tijk, transformation, written