Data Science on the Google Cloud Platform: Implementing End-to-End Real-Time Data Pipelines: From Ingest to Machine Learning

Front Cover
"O'Reilly Media, Inc.", Dec 12, 2017 - Computers - 404 pages

Learn how easy it is to apply sophisticated statistical and machine learning methods to real-world problems when you build on top of the Google Cloud Platform (GCP). This hands-on guide shows developers entering the data science field how to implement an end-to-end data pipeline, using statistical and machine learning methods and tools on GCP. Through the course of the book, you’ll work through a sample business decision by employing a variety of data science approaches.

Follow along by implementing these statistical and machine learning solutions in your own project on GCP, and discover how this platform provides a transformative and more collaborative way of doing data science.

You’ll learn how to:

  • Automate and schedule data ingest, using an App Engine application
  • Create and populate a dashboard in Google Data Studio
  • Build a real-time analysis pipeline to carry out streaming analytics
  • Conduct interactive data exploration with Google BigQuery
  • Create a Bayesian model on a Cloud Dataproc cluster
  • Build a logistic regression machine-learning model with Spark
  • Compute time-aggregate features with a Cloud Dataflow pipeline
  • Create a high-performing prediction model with TensorFlow
  • Use your deployed model as a microservice you can access from both batch and real-time pipelines
 

Selected pages

Contents

Section 1
Section 2
Section 3
Section 4
Section 5
Section 6
Section 7
Section 8
Section 17
Section 18
Section 19
Section 20
Section 21
Section 22
Section 23
Section 24

Section 9
Section 10
Section 11
Section 12
Section 13
Section 14
Section 15
Section 16
Section 25
Section 26
Section 27
Section 28
Section 29
Section 30
Section 31

Other editions - View all

Common terms and phrases

About the author (2017)

Valliappa (Lak) Lakshmanan is currently a Tech Lead for Data and Machine Learning Professional Services for Google Cloud. His mission is to democratize machine learning so that it can be done by anyone anywhere using Google's amazing infrastructure, without deep knowledge of statistics or programming or ownership of a lot of hardware. Before Google, he led a team of data scientists at the Climate Corporation and was a Research Scientist at NOAA National Severe Storms Laboratory, working on machine learning applications for severe weather diagnosis and prediction.

Bibliographic information