Big Data with Hadoop and Spark

PROGRAM OVERVIEW

Big data is one of the key pillars on which your data science knowledge rests. Working with it requires robust delivery platforms such as Hadoop. This module will strengthen your foundation in Hadoop and Spark, giving you a holistic overview of HDFS-based systems.

 


CURRICULUM

Big Data with Hadoop and Spark

Hadoop 

HDFS Architecture (HDFS, YARN, MapReduce, NameNode, DataNode) / Cloudera
HDFS Commands
Hive Architecture
Hive Query Language (DDL, DML, joins, map, dictionary)
HBase (HBase architecture)
Pig
Assessments: Assessment of Hadoop
Assessment of Hive

Spark   

Flume / Sqoop / Oozie
Spark Architecture
Spark Streaming
Spark SQL / PySpark / SparkR
Assessments: Assessment of Spark
Assessment of Flume & Sqoop

Assignment   

04 Assignments: one on each topic

Test Series   

01 Full Test

PROJECT & TRAINING

Our live-project offering prepares you for a range of analytics offerings in the data science domain. In this course, we will work on the following projects:

  1. Building a Data Pipeline (Hadoop | Hive | Spark): building a data pipeline from an RDBMS source, creating an aggregation engine in Hive and Spark SQL, and visualizing the aggregated BFSI data in Tableau.

 

USP OF PROGRAM

Curriculum created by industry experts in collaboration with NASSCOM, keeping industry needs in mind
State-of-the-art infrastructure and fully equipped labs
NASSCOM SSC official study material
Training delivered by certified, experienced trainers and industry experts
Placement assistance
Interview preparation
Certificate exam conducted by Emerging India

FAQ

What is this program about?

Big data is a trending domain that needs a specialized approach to handle vast data lakes. Through Hadoop and Spark, you will explore how to resolve these business challenges.

What is the course duration?

The course duration is approximately 100 hours.

Which topics and tools will be covered in this program?

Hadoop, MapReduce, cluster creation, and other key tools for handling big data.

What other topics will this course cover?

Other topics include Hive, Pig, Spark, MapReduce, and Hadoop distributions such as MapR, Cloudera, and Hortonworks.