Data Science from Scratch: First Principles with Python

"O'Reilly Media, Inc.", Apr 14, 2015 - Computers - 330 pages

Data science libraries, frameworks, modules, and toolkits are great for doing data science, but they’re also a good way to dive into the discipline without actually understanding data science. In this book, you’ll learn how many of the most fundamental data science tools and algorithms work by implementing them from scratch.

If you have an aptitude for mathematics and some programming skills, author Joel Grus will help you get comfortable with the math and statistics at the core of data science, and with the hacking skills you need to get started as a data scientist. Today’s messy glut of data holds answers to questions no one’s even thought to ask. This book provides you with the know-how to dig those answers out.

  • Get a crash course in Python
  • Learn the basics of linear algebra, statistics, and probability—and understand how and when they're used in data science
  • Collect, explore, clean, munge, and manipulate data
  • Dive into the fundamentals of machine learning
  • Implement models such as k-nearest neighbors, Naive Bayes, linear and logistic regression, decision trees, neural networks, and clustering
  • Explore recommender systems, natural language processing, network analysis, MapReduce, and databases
 

What people are saying

User Review - OstkUser894211 - Overstock.com

Well written, easy to follow. Book arrived very slightly damaged through rough handling during shipping.

Contents

Chapter 1 Introduction
Chapter 2 A Crash Course in Python
Chapter 3 Visualizing Data
Chapter 4 Linear Algebra
Chapter 5 Statistics
Chapter 6 Probability
Chapter 7 Hypothesis and Inference
Chapter 8 Gradient Descent
Chapter 9 Getting Data
Chapter 10 Working with Data
Chapter 11 Machine Learning
Chapter 12 k-Nearest Neighbors
Chapter 13 Naive Bayes
Chapter 14 Simple Linear Regression
Chapter 15 Multiple Regression
Chapter 16 Logistic Regression
Chapter 17 Decision Trees
Chapter 18 Neural Networks
Chapter 19 Clustering
Chapter 20 Natural Language Processing
Chapter 21 Network Analysis
Chapter 22 Recommender Systems
Chapter 23 Databases and SQL
Chapter 24 MapReduce
Chapter 25 Go Forth and Do Data Science
Index
About the Author
Copyright

About the author (2015)

Joel Grus is a software engineer at Google. Before that he worked as a data scientist at multiple startups. He lives in Seattle, where he regularly attends data science happy hours. He blogs infrequently at joelgrus.com.
