Introduction to Apache Hadoop

Big Data, characterized by its volume, velocity and veracity, requires specialized tools and techniques for analysis. This course shows you how to extract insight from Big Data using Hadoop, one of the most widely used frameworks in the industry. It introduces distributed file systems and shows how to apply techniques such as MapReduce to solve business problems on large datasets.

Curriculum

  1. Introduction to Big Data
  2. Introduction to the Hadoop ecosystem
  3. HDFS and MapReduce
  4. MapReduce design patterns
  5. Python/R and machine learning overview
  6. Distributed machine learning 1
  7. Distributed machine learning 2
  8. Hadoop alternatives

Prerequisites

  • Introductory knowledge of statistics
  • Experience with one or more regression or classification models
  • Intermediate knowledge of Python/R

Related Courses