Mastering HPCC Systems: Fundamentals of ETL Processing

Richard Taylor, Jan 16, 2024 - Computers - 164 pages

HPCC Systems is an Open Source Big Data supercomputing platform that is an alternative to the Hadoop and Spark worlds. The Mastering HPCC Systems series introduces the HPCC Systems platform to anyone interested in evaluating it for use on their own big data projects. It also expands the ECL programming knowledge of anyone already working with the platform.

This Fundamentals of ETL Processing volume provides a hands-on introduction to the ECL language by working through the standard data ingest process common to all Big Data projects. It starts with acquiring data and importing it into the HPCC Systems platform. It then takes you through data exploration, cleaning, and standardization. It ends by using the transformed data to create a data product ready for delivery to end users.

About the author (2024)

Richard Taylor has spent his corporate years writing, editing, and publishing books on computer programming languages, covering the range from PC application development to big data, massively parallel processing supercomputers.
