Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython
Get complete instructions for manipulating, processing, cleaning, and crunching datasets in Python. Updated for Python 3.6, the second edition of this hands-on guide is packed with practical case studies that show you how to solve a broad set of data analysis problems effectively. You’ll learn the latest versions of pandas, NumPy, IPython, and Jupyter in the process. Written by Wes McKinney, the creator of the Python pandas project, this book is a practical, modern introduction to data science tools in Python. It’s ideal for analysts new to Python and for Python programmers new to data science and scientific computing. Data files and related material are available on GitHub.

Contents
Chapter 1 Preliminaries
Chapter 2 Python Language Basics, IPython, and Jupyter Notebooks
Chapter 3 Built-in Data Structures, Functions, and Files
Chapter 4 NumPy Basics: Arrays and Vectorized Computation
Chapter 5 Getting Started with pandas
Chapter 6 Data Loading, Storage, and File Formats
Chapter 7 Data Cleaning and Preparation
Chapter 8 Data Wrangling: Join, Combine, and Reshape
Chapter 9 Plotting and Visualization
Chapter 10 Data Aggregation and Group Operations
Chapter 11 Time Series
Chapter 12 Advanced pandas
Chapter 13 Introduction to Modeling Libraries in Python
Chapter 14 Data Analysis Examples
Appendix A Advanced NumPy
Appendix B More on the IPython System
Index
About the Author
Common terms and phrases
aggregate apply arguments axis boolean array built-in bytes chapter Colorado column names command compute concatenation containing convert create data analysis DataFrame DataFrame objects DataFrame’s dataset datetime debugger default dict docstring dtype elements example execute False Figure float64 format Freq frequency function groupby HDF5 hierarchical index import indicating input integer interactive iterator JSON Jupyter notebook key1 key2 keyword labels lambda language list comprehension loop machine learning matplotlib method Metro-North Railroad missing data mmap module multiple NaN NaN NaN ndarray Nevada non-null object np.nan NumPy arrays obj3 Ohio open source operations options output pandas pass pivot table plot programming Python language Python objects regular expression resample result rows scalar scikit-learn sequence Series slice smoker sort statistics statsmodels string subplots Table timestamp tip_pct True tuple ufuncs Unicode users values variable Windows