Course Overview
This course introduces Apache Spark, a unified analytics engine for large-scale data processing. It begins with the basics of Hadoop and its limitations, then shows how Spark improves data processing performance. The course also examines Spark's components and uses, and the value that Intel adds to Spark. By the end, students will be able to distinguish Hadoop from Spark, differentiate Spark from other processing engines, and describe how Intel adds value and improves efficiency in Spark. Topics include data integration, stream processing, machine learning, and interactive analytics, along with an introduction to Intel Distributed Deep Learning (Intel BigDL) and Spark tuning.